GoogleCloudPlatform / ai-on-gke

AI on GKE is a collection of examples, best-practices, and prebuilt solutions to help build, deploy, and scale AI Platforms on Google Kubernetes Engine
Apache License 2.0

RAG application leaks DB connection objects #725

Open bhlim opened 2 months ago

bhlim commented 2 months ago

With IAP enabled for the RAG application, the rag-frontend workload runs out of memory after a couple of days and stops serving. I traced this to a memory leak in the HTTP request path. With IAP, the K8s ingress pings the HTTP server continuously, so the leak shows up much sooner.

The culprit is at this line in frontend/container/cloud_sql.py.

This can be fixed by moving connector = Connector() into the if statement:

    global db
    if db is None:
        connector = Connector()
        db = init_connection_pool(connector)
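To illustrate why the placement of `Connector()` matters, here is a minimal sketch that counts how many connector objects get created per request. It assumes the current code constructs the connector before the `if` check on every call, which is my reading of the line in question. `FakeConnector`, `get_db_leaky`, `get_db_fixed`, and `count_connectors` are hypothetical stand-ins, not names from the repo, and `init_connection_pool` here is a dummy rather than the real function in cloud_sql.py.

```python
class FakeConnector:
    """Hypothetical stand-in for Connector; only counts instantiations."""
    instances = 0

    def __init__(self):
        FakeConnector.instances += 1


def init_connection_pool(connector):
    # Dummy replacement: the real function would build a connection pool.
    return {"connector": connector}


db = None


def get_db_leaky():
    """Assumed current behaviour: a connector is built on every call."""
    global db
    connector = FakeConnector()  # created even when db already exists
    if db is None:
        db = init_connection_pool(connector)
    return db


def get_db_fixed():
    """Proposed fix: build the connector only when the pool is missing."""
    global db
    if db is None:
        connector = FakeConnector()
        db = init_connection_pool(connector)
    return db


def count_connectors(getter, calls):
    """Reset state, simulate `calls` requests, return connectors created."""
    global db
    db = None
    FakeConnector.instances = 0
    for _ in range(calls):
        getter()
    return FakeConnector.instances


leaky_count = count_connectors(get_db_leaky, 100)
fixed_count = count_connectors(get_db_fixed, 100)
print(leaky_count, fixed_count)  # 100 1
```

In the sketch, the leaky variant creates one connector per simulated request while the fixed variant creates exactly one. If each real `Connector` holds resources that are never released, that would match the steady memory growth under continuous ingress pings.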