bgregoinductiva opened 6 months ago
Yes, based on what I have learned so far, your instance was terminated because it accessed more memory than its defined limit.
Even though the documentation says Cloud Run will return a 500, in my testing I was able to show that it actually returns a 503. Their documentation leaves a lot to be desired.
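If you want to confirm an out-of-memory termination yourself, one option is to log the process's peak resident set size and compare it against the instance's memory limit. A minimal sketch (assumes Linux, where ru_maxrss is reported in KiB; the helper name is mine):

```python
import resource


def peak_rss_mib() -> float:
    """Peak resident set size of this process, in MiB.

    On Linux, ru_maxrss is reported in KiB, so divide by 1024.
    """
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss / 1024


# Log this periodically (or on shutdown) and compare it to the
# Cloud Run memory limit you configured for the service.
print(f"peak RSS: {peak_rss_mib():.1f} MiB")
```

If the logged peak creeps up toward the configured limit before the 503s start, that points at an OOM kill rather than a request-handling bug.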
Hope this helps.
We have the same issue, only at 40% memory usage at 99 percentile.
Update: we isolated the issue to only HTTP/2. HTTP/1 seems to be fine.
Is the HTTP/1 traffic encrypted? There seems to be an asyncio memory leak with SSL.
Cloud Run terminates TLS. https://cloud.google.com/run/docs/container-contract#tls
Also, I hate to admit this in public, but I wasn't closing SQL connections in the health check endpoint, so it was leaking file descriptors. This caused our Cloud Run containers to crash without any log events, with the Cloud Run load balancer returning a 503.
So another thing to check would be your file descriptor count.
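On Linux (which is what Cloud Run containers run), a cheap way to watch this is to count the entries under /proc/self/fd; logging that number from the health check would have surfaced our leak quickly. A minimal sketch (the helper name is mine):

```python
import os


def open_fd_count() -> int:
    """Count open file descriptors for the current process (Linux only)."""
    return len(os.listdir("/proc/self/fd"))


# Demonstrate the leak pattern: an opened-but-never-closed handle
# shows up as an extra descriptor that never goes away.
before = open_fd_count()
leaked = open("/dev/null")
print(open_fd_count() - before)
leaked.close()
```

If that count grows steadily with each health-check hit, you have a leak; once it reaches the ulimit, new connections fail and the container dies with nothing useful in the logs.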
I have a FastAPI project deployed in Cloud Run using the hypercorn server. I'm using Uvloop as the event loop and leaving the other configurations with default values:
hypercorn app.main:app --bind 0.0.0.0:80 --worker-class uvloop
Here are the Cloud Run configurations:
When I get a peak of concurrent requests during integration testing, around 30, I usually get a 503, and then a new instance is started.
Has anyone faced a similar problem before?
Thanks in advance.