MPoL-dev / MPoL

A flexible Python platform for Regularized Maximum Likelihood imaging
https://mpol-dev.github.io/MPoL/
MIT License

Further savings to DirtyImager memory usage #236

Closed jeffjennings closed 9 months ago

jeffjennings commented 9 months ago

Is your feature request related to a problem or opportunity? Please describe.

The memory footprint of DirtyImager was reduced in #230 by making the data (u, v, ReV, ImV, weights) properties of the class. Within DirtyImager, these are used only by the private function _grid_visibilities and by get_dirty_image, which calls _grid_visibilities. The footprint could be reduced further, significantly so for large datasets, by calling _grid_visibilities once when DirtyImager is instantiated and storing the gridded data, rather than keeping the loose data as properties that _grid_visibilities has to pull into memory whenever get_dirty_image is called.

I don't think the loose data need to be kept in DirtyImager for any other reason, since no other routine internal to it uses them. Do they? If not, I could implement this.

iancze commented 9 months ago

I think it is just keeping a reference to the array, not copying it, since arrays are passed by reference. If I interpreted the memray results correctly, this understanding is consistent with the total memory used.

Before, the Hermitian concat operation not only doubled the size of the array but also produced a new array to keep track of.

One way to check this would be to create a copy or a throwaway branch and implement an operation in DirtyImager that modifies the values of the self.uu array. Once the routine exits, the modification should be present in the original array that was passed into init.
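A minimal sketch of that check, using a toy class in place of DirtyImager (ToyImager and scale_in_place are hypothetical names for illustration, not part of MPoL). It assumes the array is stored by plain attribute assignment and then modified in place:

```python
import numpy as np


class ToyImager:
    """Hypothetical stand-in for DirtyImager; not the MPoL class."""

    def __init__(self, uu):
        # Plain attribute assignment binds a second name to the same array;
        # no data are copied here.
        self.uu = uu

    def scale_in_place(self, factor):
        # In-place multiplication writes into the existing buffer.
        self.uu *= factor


uu = np.array([1.0, 2.0, 3.0])
imager = ToyImager(uu)
imager.scale_in_place(2.0)

# If only a reference is held, the caller's array sees the change.
same_object = imager.uu is uu
shares_buffer = np.shares_memory(imager.uu, uu)
print(same_object, shares_buffer, uu)
```

If the class had instead stored a copy (for example with uu.copy()), the original array would be left unchanged and the two would not share memory.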

jeffjennings commented 9 months ago

Yes, right. I guess it would also require rewriting some parts of GridderBase. Anyway, just a thought.

iancze commented 8 months ago

My interpretation of how things are working is that GridderBase already stores/d values by default, and therefore has/d a minimal memory footprint as far as storing the dataset is concerned (performance inherited by DataAverager).

The old version of DirtyImager was different because it had a concat operation that copied the original data arrays to a new array that contained the Hermitian pairs.
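To illustrate why a concat step costs memory, here is a small NumPy sketch. It is not the old MPoL code; the variable names are made up for the example. It shows only that np.concatenate allocates a new array, so appending the Hermitian pairs gives a second buffer twice the size of the original:

```python
import numpy as np

# Toy u coordinates standing in for a dataset column.
uu = np.linspace(-1.0, 1.0, 1000)

# Hermitian pairs sit at the negated coordinates, so the augmented
# array holds the original values followed by their negatives.
uu_full = np.concatenate([uu, -uu])

# np.concatenate always returns a newly allocated array.
copied = not np.shares_memory(uu_full, uu)
size_ratio = uu_full.nbytes / uu.nbytes
print(copied, size_ratio)
```

Both uu and uu_full stay alive as long as something refers to them, which matches the point above about having a new array to keep track of.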