cloudspannerecosystem / autoscaler

Automatically scale the capacity of your Spanner instances based on their utilization.
Apache License 2.0

Bug: Serverless Poller Memory Leak #376

Closed: alexlo03 closed this issue 2 months ago

alexlo03 commented 2 months ago

Version we are on: v2.1.0

Error Message

Function execution took 2021 ms, finished with status: 'crash'

Memory limit of 512 MiB exceeded with 515 MiB used. Consider increasing the memory limit, see https://cloud.google.com/functions/docs/configuring/memory

We are getting OOM crashes from the Serverless Poller. I bumped memory from 256MB to 512MB, which reduced the incidence, but it still looks like a memory leak. Incidence of OOM after changing memory size:

[Image: Screenshot 2024-08-06 at 14 54 41]

We are running this for hundreds of instances. We never get OOM from the scaler component.

I took a look through the code but didn't see anything obvious.
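
For anyone trying to confirm a leak like this, here is a minimal sketch (not taken from the autoscaler codebase) that logs memory after each invocation using Node's built-in `process.memoryUsage()`. The `withMemoryLogging` wrapper and the log field names are hypothetical. The idea is that module-level state persists across warm invocations of the same function instance, so steady growth in the logged heap figure would support the leak hypothesis.

```javascript
// Hypothetical helper for illustration; not part of the autoscaler codebase.
const MIB = 1024 * 1024;

// Module scope persists across warm invocations of one function instance,
// so these values track growth over the lifetime of that instance.
let invocationCount = 0;
let baselineHeapBytes = null;

/**
 * Wraps an async handler and logs memory usage after every invocation,
 * whether the handler resolves or throws.
 * @param {Function} handler async function to wrap
 * @param {Function} log sink for one JSON string per invocation
 * @return {Function} wrapped handler with the same arguments and result
 */
function withMemoryLogging(handler, log = console.log) {
  return async (...args) => {
    try {
      return await handler(...args);
    } finally {
      invocationCount += 1;
      const {rss, heapUsed} = process.memoryUsage();
      // The first invocation on this instance becomes the baseline.
      if (baselineHeapBytes === null) baselineHeapBytes = heapUsed;
      log(JSON.stringify({
        message: 'poller memory usage',
        invocation: invocationCount,
        rssMiB: Math.round(rss / MIB),
        heapUsedMiB: Math.round(heapUsed / MIB),
        heapGrowthMiB: Math.round((heapUsed - baselineHeapBytes) / MIB),
      }));
    }
  };
}
```

It could be applied by exporting `withMemoryLogging(myPollerHandler)` in place of the handler itself, where `myPollerHandler` stands in for whatever function the deployment exports. If `heapGrowthMiB` climbs steadily with `invocation` on the same instance, retained objects are the likely cause. If it stays flat and crashes still occur, a large per-invocation peak may be a better explanation than a leak.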

henrybell commented 2 months ago

Hi, thanks for raising this -- we've not heard any similar reports from other users, though with the number of instances being autoscaled in this case, the behaviour may well differ in ways we've not seen before. Would it be possible for you to share your autoscaler config via email, a shared doc, or another method of your choosing, so that we can take a look? This could be with any sensitive information redacted of course. Thanks.

alexlo03 commented 2 months ago

@henrybell yep, can share. Can you email redacted?

henrybell commented 2 months ago

Thanks @alexlo03 -- will do.