openclimatefix / metoffice_ec2

Subset Met Office MOGREPS-UK and UKV on AWS EC2
MIT License

Script uses increasingly large amounts of memory on EC2 #22

Closed. JackKelly closed this issue 4 years ago

flowirtz commented 4 years ago

This seems to have gotten a bit better, but the underlying problem still seems to be there.

[Image: container memory usage over the last three days]

This is the memory usage of the container for the last three days. Memory usage now grows much more slowly, but it still keeps growing until the container eventually crashes. Note that this is on v1.2.0 though, not the latest version.

I'll try to push the new version today and see whether anything changes.

flowirtz commented 4 years ago

So it seems like we still see that issue on the latest version:

e089abeca5cc4c8684d29a0d8919a691: main: OutOfMemoryError: Container killed due to memory usage
f75d195e627843b58031f42f74a28f7a: main: OutOfMemoryError: Container killed due to memory usage

JackKelly commented 4 years ago

Hmm... are we certain Python is using all the memory here? It's not, say, disk caching or something weird like that?

The only objects which persist in the ec2.py script are _LOG and a few constants.
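One way to check whether Python itself holds the memory is to log, after each processed message, both the process's peak resident set size and the bytes that Python's own allocator has handed out. This is only a rough sketch using the standard library: `memory_report` is a hypothetical helper, not something that exists in ec2.py, and `resource` is Unix-only. If the container's memory keeps climbing while the traced figure stays flat, the growth is probably outside Python objects (for example in a C library, or page cache counted against the container).

```python
import resource
import tracemalloc


def memory_report(top_n=3):
    """Return (peak_rss, traced_bytes, top_lines) for the current process.

    Hypothetical helper for illustration; requires tracemalloc.start()
    to have been called earlier in the process.
    """
    # Peak resident set size: reported in kilobytes on Linux, bytes on macOS.
    peak_rss = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # Bytes currently allocated via Python's allocators since tracing began.
    traced_bytes, _traced_peak = tracemalloc.get_traced_memory()
    # Source lines responsible for the most currently-live allocations.
    snapshot = tracemalloc.take_snapshot()
    top_lines = [str(stat) for stat in snapshot.statistics("lineno")[:top_n]]
    return peak_rss, traced_bytes, top_lines


if __name__ == "__main__":
    tracemalloc.start()
    held = []
    for i in range(3):
        held.append(bytearray(1_000_000))  # stand-in for one processed message
        rss, traced, top = memory_report()
        print(f"iteration {i}: peak RSS={rss}, traced={traced} bytes")
        for line in top:
            print("   ", line)
```

Calling something like this once per loop iteration in the real script would show whether the traced total rises with each message, and which lines own the allocations.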

If it is Python, then a simple hack might be to change the ec2.py script to terminate (and hence release all of its memory) when the SQS queue is empty, and then run ec2.py from a bash script in an endless while loop?!?
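A minimal sketch of that hack, written in Python rather than bash so both halves live in one place. Everything here is an assumption for illustration: `drain_queue`, `supervise`, `receive_messages` and `handle_message` are hypothetical names, and the real ec2.py may be structured differently. The worker returns as soon as the queue reports no messages, and the supervisor starts a fresh process each time, so any memory accumulated during a run is returned to the OS when that process exits.

```python
import subprocess
import sys
import time


def drain_queue(receive_messages, handle_message):
    """Process messages until the queue reports empty; return the count.

    `receive_messages` is a hypothetical callable returning a (possibly
    empty) list of messages; `handle_message` processes a single message.
    """
    processed = 0
    while True:
        messages = receive_messages()
        if not messages:
            # Queue is empty: return so the process can exit and free memory.
            return processed
        for message in messages:
            handle_message(message)
            processed += 1


def supervise(command, pause_seconds=60, max_runs=None):
    """Run `command` in a fresh process repeatedly, pausing between runs.

    With max_runs=None this loops forever, like `while true; do ...; done`.
    """
    runs = 0
    while max_runs is None or runs < max_runs:
        # check=False: a crashed run should not stop the supervisor.
        subprocess.run(command, check=False)
        runs += 1
        time.sleep(pause_seconds)
    return runs
```

A wrapper script could then call `supervise([sys.executable, "ec2.py"])`. The pause between runs avoids a tight restart loop when the queue stays empty.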

JackKelly commented 4 years ago

Just had a quick chat with Flo:

flowirtz commented 4 years ago

don't see this anymore, closing.