Closed: rlwastro closed this 1 week ago
@snbianco @falkben Just tagging you to keep this fix on your radar. I added a bit of text to the description. I only recently realized this is not just an impact on speed. Cutouts returned in memory also apparently return a view into the full (uncompressed) array, meaning the memory usage of the underlying data can be much larger than the cutout. I have an example that extracts 250 cutouts of size 256x256 pixels (total size about 65 MB). The resulting list from the current code was going to take 39 GB (!), which overflowed the 15 GB memory limit on the machine I was using. The new code using `.section` works correctly and avoids the memory issues.

I think we should try to get this merged as soon as possible since it has some serious impacts on performance.
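For anyone reproducing the memory issue, here is a minimal sketch contrasting the two access patterns (the file name and pixel ranges are placeholders, not the actual test data; `.section` on tile-compressed HDUs needs a recent astropy):

```python
from astropy.io import fits

# Hedged sketch of the behavior described above, using placeholder values.
with fits.open("large_compressed_image.fits") as hdulist:
    hdu = hdulist[1]  # assume a tile-compressed image extension

    # hdu.data decompresses the entire image into memory; slicing it returns
    # a NumPy view, so every retained cutout keeps the full decompressed
    # array alive through its .base attribute.
    data_cutout = hdu.data[1000:1256, 2000:2256]
    print(data_cutout.base is not None)   # True: the slice shares the parent buffer
    print(data_cutout.base.nbytes)        # size of the *full* image, not the cutout

    # hdu.section reads only the tiles covering the requested region and
    # returns an array sized like the cutout itself, so holding many cutouts
    # costs only the cutout pixels.
    section_cutout = hdu.section[1000:1256, 2000:2256]
    print(section_cutout.nbytes)          # on the order of the cutout size only
```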
Attention: Patch coverage is 87.50000% with 1 line in your changes missing coverage. Please review.

Project coverage is 95.82%. Comparing base (11b0567) to head (667fd83). Report is 8 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| astrocut/cutouts.py | 87.50% | 1 Missing :warning: |
Very nice. Sorry it took a while for me to take a look at this. I wonder if we need a similar change (`.data` -> `.section`) for this part of the code:
https://github.com/spacetelescope/astrocut/blob/main/astrocut/cutout_processing.py#L463
and maybe that would speed up the moving target cutouts considerably?
@falkben That's a good question. I think the answer is yes, we should make the same change there. If this code that pieces together images from different skycells gets applied to Pan-STARRS images (which is on my list of things to work on), it will be much faster if `.data` is changed to `.section` in that code.

I believe these changes will not make much difference in speed for uncompressed images (e.g., TESS data cubes and HAP-MVM images). But I have not actually done timing tests on uncompressed images in the S3 bucket to see whether it has an effect on those images too. And it is possible that the `.section` approach will improve memory handling even for uncompressed images. I think it can't hurt in any case to make the change you suggest.
Agreed! I'll make this change on the same MR as the updated unit tests and do some benchmarking as well.
@rlwastro @falkben After looking into this a bit more, I don't think that changing line 463 in `cutout_processing.py` will work. `build_default_combine_function` initializes a function to combine an array of cutouts, and the entirety of each cutout has to be loaded into memory to check for NaN values and compute the pixel weights.
@snbianco I have not quite been able to figure out what the parameters are for `build_default_combine_function`. But assuming that they are already image cutouts that have been extracted from the larger images, I think you are correct. And in fact that line of code:

`img_arrs = np.array([hdu.data for hdu in template_hdu_arr])`

... only really makes sense if the `data` attributes are used rather than the `section` attributes. It implies that we are stacking up all the individual cutouts into a big cube.
So yes, I agree with you!
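As a purely hypothetical illustration (not the actual `build_default_combine_function`), a combine step of this kind has to inspect every pixel of every cutout to build its NaN-based weights, which is why lazy `.section` access would not save anything there:

```python
import numpy as np

# Illustrative sketch only; astrocut's real combine function may differ.
def combine_cutouts(img_arrs):
    """img_arrs: array of shape (n_cutouts, ny, nx) stacked from hdu.data."""
    weights = (~np.isnan(img_arrs)).astype(float)  # 0 where a cutout has no data
    norm = weights.sum(axis=0)
    norm[norm == 0] = 1                            # avoid division by zero in empty pixels
    return np.nansum(img_arrs, axis=0) / norm      # mean over the valid cutouts per pixel
```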
Accessing the `.data` attribute on a compressed image causes the entire image to be decompressed and read into memory. That defeats the ability of astrocut to reduce the IO requirements by reading only the necessary data.

Another benefit of the change is that it greatly reduces the memory requirements for the returned cutout. The current version, which slices the `.data` array, returns a subset that is a view into the full image array, which winds up dragging the entire array along with it. The `.section` approach does not include the entire array.

This change instead uses the `.section` attribute to access the data, both when reading the pixels and when checking to see which extensions have images. The only remaining use of the `.data` attribute is for access to the cutout pixels. The implementation of the `section` attribute makes it efficient for accessing data in AWS S3 buckets as well. The speedup ranges from a factor of 2 (for large cutouts) to at least a factor of 10 (for small cutouts).
This change also fixes a bug in the application of the `memory_only` flag. When true, it should not write any output files to disk. However, that was ignored when `single_outfile` is false. This fixes that bug. It was already working correctly when `single_outfile` is true.
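To make the flag interaction concrete, here is a hypothetical sketch of the corrected control flow; the function and helper names are placeholders, not astrocut's actual code:

```python
from astropy.io import fits


def combine_into_single_hdulist(cutout_hdulists):
    """Placeholder helper: concatenate cutout extensions behind one primary HDU."""
    hdus = [fits.PrimaryHDU()]
    for hdulist in cutout_hdulists:
        hdus.extend(hdu for hdu in hdulist if not isinstance(hdu, fits.PrimaryHDU))
    return fits.HDUList(hdus)


def finalize_cutouts(cutout_hdulists, output_paths, single_outfile, memory_only):
    """Hypothetical sketch of the corrected memory_only handling (not the real code)."""
    if single_outfile:
        combined = combine_into_single_hdulist(cutout_hdulists)
        if memory_only:
            return [combined]                 # this branch already behaved correctly
        combined.writeto(output_paths[0], overwrite=True)
        return output_paths[:1]

    if memory_only:                           # previously ignored: files were written anyway
        return cutout_hdulists
    for hdulist, path in zip(cutout_hdulists, output_paths):
        hdulist.writeto(path, overwrite=True)
    return output_paths
```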