GoogleCloudPlatform / ai-on-gke

AI on GKE is a collection of examples, best-practices, and prebuilt solutions to help build, deploy, and scale AI Platforms on Google Kubernetes Engine
Apache License 2.0
213 stars 158 forks source link

Bump rag frontend resource requests #605

Closed artemvmin closed 4 months ago

artemvmin commented 4 months ago

Frontend sometimes gets OOM-killed. Set guaranteed QOS.

artemvmin commented 4 months ago

/gcbrun

blackzlq commented 4 months ago

The dangerous part is https://github.com/GoogleCloudPlatform/ai-on-gke/blob/main/applications/rag/frontend/container/cloud_sql/cloud_sql.py#L58 we select all. LOL, I think Julie's change will reduce this issue.