timspainNERSC opened 1 year ago
A 128x128 double array has a size of 128 KiB, or 8 arrays per MiB. Given the observed growth of roughly 2 MB per day (noted below), this implies roughly 16 arrays per day of run time, or one array every 90 minutes, or one every 9 timesteps. This suggests that it is not a whole array being created and leaked every timestep, but either a smaller amount of data leaked every timestep or a whole array leaked at a lower frequency.
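For reference, the arithmetic behind those figures, assuming the ~2 MB/day growth reported for the TOPAZ-ERA5 run below and a 10-minute model timestep:

$$
128 \times 128 \times 8\ \mathrm{B} = 128\ \mathrm{KiB}, \qquad
\frac{2\ \mathrm{MiB/day}}{128\ \mathrm{KiB}} = 16\ \mathrm{arrays/day}
$$

$$
\frac{24\ \mathrm{h}}{16} = 90\ \mathrm{min} = 9 \times 10\ \mathrm{min\ timesteps}
$$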
Valgrind is a tool that can be used for this (see the `--leak-check` option in the Valgrind Quick Start Guide).
And Valgrind doesn't work on modern macOS :(
Oh noes :( I will have another think...
I've been using std::cerr and Mach task_info, based on this Stack Exchange post.
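A minimal sketch of that kind of instrumentation, assuming a plain call to task_info with the MACH_TASK_BASIC_INFO flavour and printing the resident size to std::cerr (the exact call site and formatting used in the model are not shown in this thread):

```cpp
// Sketch only: query the Mach kernel for the current resident set size and
// print it to std::cerr. macOS-specific.
#include <cstddef>
#include <iostream>
#include <mach/mach.h>

// Returns the resident memory of the current process in bytes, or 0 on failure.
static std::size_t residentBytes()
{
    mach_task_basic_info info;
    mach_msg_type_number_t count = MACH_TASK_BASIC_INFO_COUNT;
    kern_return_t ret = task_info(mach_task_self(), MACH_TASK_BASIC_INFO,
                                  reinterpret_cast<task_info_t>(&info), &count);
    return (ret == KERN_SUCCESS) ? static_cast<std::size_t>(info.resident_size) : 0;
}

int main()
{
    std::cerr << "RSS = " << residentBytes() / 1024 << " KiB" << std::endl;
    return 0;
}
```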
Most of the memory is leaked in library code.
Half of the apparent leaking occurred when doing whole-array mathematics on ModelArrays. The internet suggests that Eigen is a bit lax about cleaning up some temporary arrays, which the ModelArray maths certainly used. Changing the SlabOcean update to a per-element calculation removed the leak there, at the cost of some performance.
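To illustrate the kind of change this refers to (the field names, the arithmetic, and the container are placeholders, not the actual SlabOcean equations or the ModelArray API):

```cpp
// Sketch only: std::vector stands in for the Eigen-backed ModelArray storage.
// The whole-array operators on ModelArray return intermediate arrays, so an
// expression like
//
//     sst = sst + dt * coeff * flux;   // whole-array form
//
// can allocate temporaries on every call. The per-element version below does
// the same arithmetic without creating any intermediate arrays, at the cost
// of some performance.
#include <cstddef>
#include <vector>

void updatePerElement(std::vector<double>& sst, const std::vector<double>& flux,
                      double dt, double coeff)
{
    for (std::size_t i = 0; i < sst.size(); ++i) {
        sst[i] += dt * coeff * flux[i];
    }
}
```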
The majority of the remaining leak occurs when reading the netCDF forcing files. Again, the leak occurs during library calls, so my code is not directly responsible. Refactoring the library calls away seems impossible at this point, but I can at least reduce the number of calls by only reading the forcing files when I know the values in ERA or TOPAZ change (once per hour and once per day respectively).
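A hedged sketch of that gating, with hypothetical class and member names: keep the previously read fields and only call the netCDF reader when the model time crosses into a new forcing interval (hourly for ERA5, daily for TOPAZ).

```cpp
// Sketch only: cache the last forcing read and skip the netCDF call until the
// model time enters a new forcing interval.
#include <cmath>
#include <vector>

class CachedForcingReader {
public:
    explicit CachedForcingReader(double periodSeconds)
        : m_period(periodSeconds) // 3600 s for hourly ERA5, 86400 s for daily TOPAZ
        , m_lastInterval(-1)
    {
    }

    // Returns the forcing fields for the given model time, re-reading from the
    // netCDF file only when a new forcing interval has started.
    const std::vector<double>& fields(double modelTimeSeconds)
    {
        long interval = static_cast<long>(std::floor(modelTimeSeconds / m_period));
        if (interval != m_lastInterval) {
            m_cache = readFromNetcdf(modelTimeSeconds); // the call that appears to leak
            m_lastInterval = interval;
        }
        return m_cache;
    }

private:
    // Placeholder for the actual netCDF read of the forcing fields.
    std::vector<double> readFromNetcdf(double /*time*/) { return {}; }

    double m_period;
    long m_lastInterval;
    std::vector<double> m_cache;
};
```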
It has also been suggested by @draenog and @a-smith-github that the memory is not truly leaking, but is memory that has already been freed by the program and simply not yet returned to the OS by the allocator.
The TOPAZ-ERA5 year-long thermodynamics-only run shows memory usage growth of roughly 2 MB per day (from 18.5 MB to 70-some MB over the course of a 31-day run). This implies something is leaking memory.