Closed · esheldon closed this issue 2 years ago
The test I first quoted above was running on current master, but the jobs that got killed were using the version in the DM stack, so this seems to have been around for a while.
It is also worth noting that when running other codes on the same images, I see flat memory usage.
Code to reproduce
https://gist.github.com/esheldon/2f5e32155e37ddb201aa623b57e49328
Interesting, I wonder if this is related to autograd. We know that it has huge memory issues and I wouldn't be surprised if there is a memory leak in there somewhere.
The other obvious culprits are the cached variables that occur during blending, especially the monotonicity operator. We should probably add a `clear` option to `scarlet.cache.Cache` to allow the user to clear the cache after each blend. @esheldon, would you mind adding the line `scarlet.cache.Cache._cache = {}` to your code after you have finished with each blend? If the primary problem is a caching issue, then this should fix it. Otherwise it's likely autograd, in which case I hope to have a new module in the next few weeks that doesn't use autograd and gives sizeable savings in performance and memory usage.
It might even be advisable to add a `clear_cache` parameter, defaulting to `True`, in `blend.fit` to automatically clear the cache once the fit has finished, saving the user the hassle. Making it an argument allows advanced users to keep the cache if they are doing multiple iterations on the same blend, but I imagine that most users won't want the cached variables hanging around after they've finished optimizing.
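As a sketch of what that `clear_cache` parameter might look like (the `Blend` and `Cache` classes below are minimal stand-ins so the example is self-contained, not scarlet's actual implementation; scarlet's real cache does live in the class-level dict `scarlet.cache.Cache._cache`):

```python
# Sketch of the proposed clear_cache parameter. Blend and Cache here are
# minimal stand-ins; scarlet's real classes differ.

class Cache:
    _cache = {}  # mirrors scarlet.cache.Cache._cache

class Blend:
    def fit(self, max_iter=200, clear_cache=True):
        # The real fit would run the optimizer, which populates the cache
        # with operators (e.g. the monotonicity operator) as a side effect.
        Cache._cache[("monotonicity", (25, 25))] = object()  # placeholder entry
        if clear_cache:
            # Drop cached variables once the fit is done, so memory does
            # not accumulate when many blends run in a single process.
            Cache._cache = {}

Blend().fit()                   # default: cache is cleared afterwards
assert Cache._cache == {}
Blend().fit(clear_cache=False)  # advanced use: keep the cache around
assert len(Cache._cache) == 1
```

Defaulting to `True` matches the expectation that most users are done with the cached variables once optimization finishes.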
Clearing that cache results in much slower memory growth. I'm now seeing a small growth rate of ~10 MB per 5,000 trials.
Ok, thanks for the update. I'll try to push an update tomorrow to make this the default behavior as stated in my comment above.
Maybe a context manager:

```python
with scarlet.cache():
    # run deblender in a way that benefits from a cache
```
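One way such a context manager could look (a sketch only; `scarlet_cache` and the stand-in `Cache` class are hypothetical, not scarlet's current API):

```python
from contextlib import contextmanager

# Stand-in for scarlet.cache.Cache so the sketch is runnable here; scarlet's
# real cache is the class-level dict scarlet.cache.Cache._cache.
class Cache:
    _cache = {}

@contextmanager
def scarlet_cache():
    """Expose the cache for the duration of the block, then clear it."""
    try:
        yield Cache._cache
    finally:
        # Clearing on exit bounds memory growth across many blends,
        # even if the deblender raises inside the block.
        Cache._cache = {}

with scarlet_cache() as cache:
    # run the deblender in a way that benefits from a cache
    cache[("monotonicity", (25, 25))] = object()

assert Cache._cache == {}  # cache was dropped on exit
```

The `try`/`finally` ensures the cache is cleared even when the block exits via an exception.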
Hmm. There aren't too many things we put in the cache. From Erin's script, I'm guessing that it's the monotonicity operator at the original source x/y position. We use that for initialization, and I thought I had removed it from the cache. If not, it'll accumulate in the cache with every new source until the process terminates.
@esheldon: can you check the keys of the `Cache` (it's a simple dict) in the case where you don't clear it, to confirm, please?
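For a quick check, something like the following would list what has accumulated (the stand-in `Cache` class and its key names here are purely illustrative; the real keys depend on what scarlet has cached):

```python
# Illustrative stand-in for scarlet.cache.Cache so the inspection pattern is
# runnable here; in scarlet the real object is scarlet.cache.Cache._cache.
class Cache:
    _cache = {
        "build_initialization_image": object(),  # hypothetical key names
        ("monotonicity", (25, 25)): object(),
    }

# After running a few blends without clearing, see what accumulated:
for key, value in Cache._cache.items():
    print(key, type(value).__name__)
```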
I think the more likely culprit is the `build_initialization_image` that is stored in the cache. This could (and probably should) be removed after the init, since it's the detection image that is the size of the entire blend.
Yeah, I just found the same. This thing should go out of scope when the current `observation` is dropped, but it doesn't: https://github.com/pmelchior/scarlet/blob/3e83d8e294813de35cb9cb57d33a35a3429b936a/scarlet/initialization.py#L248-L280
We used to have this data structure as a member of the observation class, in which case this problem wouldn't occur. But we changed it for multi-observation runs because this is the init image for all of the overlapping observations. In practice, this ability isn't even used, because we don't have a multi-instrument init image.
I wonder if we shouldn't clean this up by creating something like a `MultiObservation` class to replace the flat list of `Observation`s we carry around (and test for) in lots of places. Then we can attach extra functionality to this class instead of the global cache.
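A rough sketch of what such a container might look like (the `MultiObservation` name and all of its methods are hypothetical, not existing scarlet API):

```python
# Hypothetical MultiObservation container: it owns the list of observations
# and any shared derived data, so per-run products die with the instance
# instead of lingering in a global cache.

class MultiObservation:
    def __init__(self, observations):
        self.observations = list(observations)
        self._init_image = None  # built lazily, scoped to this instance

    def __iter__(self):
        return iter(self.observations)

    def __len__(self):
        return len(self.observations)

    def initialization_image(self, build):
        # build: a callable producing the detection image from all
        # overlapping observations; memoized on the instance, not globally.
        if self._init_image is None:
            self._init_image = build(self.observations)
        return self._init_image

# Toy usage with strings standing in for Observation instances:
multi = MultiObservation(["obs_r", "obs_i"])
image = multi.initialization_image(lambda obs: f"init from {len(obs)} obs")
assert image == "init from 2 obs" and len(multi) == 2
```

Once the `MultiObservation` instance goes out of scope, its init image is freed with it, which is exactly the lifetime the global cache currently fails to provide.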
I have just spent some time addressing this same issue, and found this issue as I was about to post my own.
I'm running scarlet version 1.0.1+gd1bda26, which was current when I installed it about 3 weeks ago.
What I have done is to start with a set of images, weights, PSFs, etc., and then run through the steps of the deblending process, i.e. create the frame and the observation, then initialization, blending, and finally rendering the models. I do this N times for the same dataset, and track the resident memory after every iteration. The tests I have done stop after successive steps in the process; i.e. first only doing the frame step, then doing both the frame and observation steps, etc.
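The measurement loop described above can be sketched like this (`run_pipeline` is a placeholder for the frame/observation/initialization/blend/render steps; `ru_maxrss` reports the high-water resident size, which is enough to reveal monotone growth):

```python
import resource
import sys

# Sketch of the per-iteration memory tracking described above.

def rss_mb():
    # ru_maxrss is in KB on Linux and bytes on macOS.
    rss = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    return rss / 1024**2 if sys.platform == "darwin" else rss / 1024

def run_pipeline(data):
    pass  # placeholder: create frame + observation, initialize, fit, render

data = object()  # stand-in for the fixed set of images/weights/PSFs
usage = []
for trial in range(100):
    run_pipeline(data)
    usage.append(rss_mb())  # record resident memory after every iteration
```

Plotting `usage` against the trial number is what produces the curves below; a flat line means no leak, a linear ramp means per-trial growth.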
Below is a plot that shows the results. When reading the legend, e.g. 'initialization' means the process has been run up to and including the initialization step, but no further. Clearly the initialization step is the worst culprit. That said, there is a very small but real linear memory growth associated with the creation of the frames and observations.
The final curve shows the result of implementing the cache-clearing step suggested by Fred3m here. I don't at all understand the shape of this curve. Suffice it to say that clearing the cache resolves most, but not all, of the problem. It seems that (for my case at least) scarlet is carrying around ~0.5 GB of unnecessary memory. I guess this is on the edge of being worth caring about?
Thanks for reporting. The offending cache call is in fact still present in the current code base. The reason is that we want to re-use the same initialization image for all sources in the same observation. We could tie this image to the `Observation` instance, but that doesn't work if we want to initialize from multiple such instances, which arises for dithered observations or observations from different instruments. As I wrote above, the current code base doesn't have this functionality, so we should just make the change that actually fixes the current state.
As for the other 0.5 GB: scarlet needs a few somewhat large operators for images at various sizes. It builds them once it needs them for a source; that's why that line keeps growing. I wouldn't call that "unnecessary" memory.
The memory usage is growing linearly with calls to the deblender.
For example, processing 100x100 images, the memory usage starts quite small, at about 100-200 MB, and within a few thousand trials reaches 2.6 GB.
On the batch system at BNL the jobs are eventually getting killed by the automatic job killer. I don't have reliable stats on what the usage was, but I believe it must have exceeded 8 GB for this to occur. This was again using 100x100 images.