ratt-ru / CubiCal

A fast radio interferometric calibration suite.
GNU General Public License v2.0

Memory growth #359

Closed JSKenyon closed 4 years ago

JSKenyon commented 4 years ago

While attempting to generate some sort of empirical wisdom relating to memory footprint, my experiments showed the following:

[image: mprof]

This is the output of a memory_profiler run with --dist-ncpu 3 on my laptop. Ignore child 3 and 4. The black curve is the overall memory usage. Child 0 is the I/O process, and it seems well behaved. However, I cannot fathom the ramping in child 1 and 2. In my mind, they definitely shouldn't grow with time. This suggests that something is stored between tiles. I have tried to find it, but to no avail.

@o-smirnov have you noticed this behaviour? Or do you perhaps have some intuition regarding its origin?

Note that this growth doesn't occur in the serial case. Additionally, it has something to do with the number of chunks processed by each worker. If we look at the same experiment as above but with --dist-ncpu 5, we see that the memory footprint of each worker is slightly less at the end (though the growth is still apparent).

[image: mprof2]

My feeling is that the memory usage of the workers should have a heart-beat pattern, increasing when they allocate their temporary arrays and decreasing as they finish with a chunk.
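For illustration, the expected heart-beat can be reproduced in isolation with the standard-library `tracemalloc` module. This is only a sketch under stated assumptions: `process_chunk` is a hypothetical stand-in for a worker handling one chunk, not CubiCal code, and a `bytearray` stands in for the temporary arrays.

```python
import tracemalloc


def process_chunk(nbytes):
    """Hypothetical worker step: allocate a temporary, use it, release it."""
    temp = bytearray(nbytes)  # stand-in for the solver's temporary arrays
    during, _ = tracemalloc.get_traced_memory()  # traced bytes while temp is alive
    del temp  # the chunk is finished, so the temporary is dropped
    after, _ = tracemalloc.get_traced_memory()  # traced bytes after release
    return during, after


tracemalloc.start()
# Each tuple is (memory while working on the chunk, memory after finishing it).
samples = [process_chunk(2_000_000) for _ in range(4)]
tracemalloc.stop()

for i, (during, after) in enumerate(samples):
    print(f"chunk {i}: during={during} bytes, after={after} bytes")
```

If nothing is retained between chunks, the "after" values stay roughly flat while the "during" values spike by about the size of the temporary, which is the heart-beat shape described above.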

o-smirnov commented 4 years ago

> My feeling is that the memory usage of the workers should have a heart-beat pattern,

I recall a certain PhD thesis showing just that. ;)

So it must've been broken since. Maybe I need to force a garbage collection in the solver worker?

JSKenyon commented 4 years ago

About 30 seconds after posting, I had a stroke of inspiration and forced a garbage collector run at the end of each solver call. This is what it looks like with --dist-ncpu 12 without manual GC:

[image: withoutgc]

And with manual GC:

[image: withgc]
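A minimal sketch of the idea, using only the standard library. The thread does not say what was actually holding the memory, so the reference cycle here is an assumption chosen because it is one common reason a forced collection helps. `Scratch`, `solve_chunk` and `solve_chunk_with_gc` are hypothetical names, not CubiCal functions.

```python
import gc
import weakref


class Scratch:
    """Hypothetical stand-in for a solver's temporary state."""

    def __init__(self, nbytes):
        self.buffer = bytearray(nbytes)
        # Assumed reference cycle: reference counting alone cannot free this
        # object, so it lingers until the cyclic garbage collector runs.
        self.me = self


def solve_chunk(nbytes=1_000_000):
    """Hypothetical solver call; returns a weak reference to its scratch data."""
    scratch = Scratch(nbytes)
    return weakref.ref(scratch)


def solve_chunk_with_gc(nbytes=1_000_000):
    """Same call, but forces a full collection before returning."""
    ref = solve_chunk(nbytes)
    gc.collect()  # manual GC at the end of the solver call
    return ref


gc.disable()  # only to make the demonstration deterministic
try:
    leaked = solve_chunk()
    still_alive = leaked() is not None  # scratch data survives the call

    freed = solve_chunk_with_gc()
    freed_is_gone = freed() is None  # collected before the call returned
    leaked_is_gone = leaked() is None  # the earlier garbage went with it
finally:
    gc.enable()

print(still_alive, freed_is_gone, leaked_is_gone)
```

In a long-running worker the automatic collector would eventually run on its own, but with large buffers attached to few objects it may run rarely enough that the footprint ramps up between collections. Forcing a collection at a natural boundary, such as the end of each solver call, trades a small amount of time for a bounded footprint.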

JSKenyon commented 4 years ago

This frankly silly factor of 2 improvement will likely help people struggling with memory problems. Will get it into master ASAP.

JSKenyon commented 4 years ago

Closed via #360.