GoogleCloudPlatform / ai-on-gke

AI on GKE is a collection of examples, best-practices, and prebuilt solutions to help build, deploy, and scale AI Platforms on Google Kubernetes Engine
Apache License 2.0

Guide for Maxtext Prometheus Metrics #668

Closed Bslabe123 closed 1 month ago

Bslabe123 commented 2 months ago

Note: this guide requires new releases of the MaxText and JetStream repos to work. Until those releases exist, modify the Dockerfiles for maxtext-server and jetstream-http-server so that they build from the latest master branches of the JetStream and MaxText repos, since the demoed changes haven't been picked up in a release of either repo. Also note that MaxText depends on JetStream, so a change will need to be made there too.

liurupeng commented 2 months ago

It would be good to add a YAML manifest for an HPA that uses these metrics for autoscaling.
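
For illustration, a minimal sketch of what such an HPA manifest might look like. This is not taken from the guide: the Deployment name `maxtext-server`, the metric name `example_jetstream_metric`, the replica bounds, and the target value are all placeholders, and the sketch assumes a custom metrics adapter is already installed in the cluster and exposes the Prometheus metric through the Kubernetes custom metrics API (the exact metric name format depends on the adapter).

```yaml
# Hypothetical HPA sketch; every name and number below is a placeholder.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: maxtext-server-hpa          # placeholder name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: maxtext-server            # assumed name of the serving Deployment
  minReplicas: 1
  maxReplicas: 4                    # placeholder upper bound
  metrics:
    - type: Pods
      pods:
        metric:
          name: example_jetstream_metric   # placeholder; use the name your metrics adapter exposes
        target:
          type: AverageValue
          averageValue: "10"        # placeholder target per pod
```

Applying it with `kubectl apply -f` and then checking `kubectl describe hpa maxtext-server-hpa` should show whether the HPA can read the metric; if the adapter does not expose it, the HPA reports the metric as unavailable.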