Closed emolter closed 3 weeks ago
Attention: Patch coverage is 89.09396% with 65 lines in your changes missing coverage. Please review.
Project coverage is 60.77%. Comparing base (8381a26) to head (364cd60). Report is 2 commits behind head on master.
Latest round of regression tests here. These unfortunately include 50 failures that are all related to the delivery of new MIRI reference files and other MIRI changes on the INS side. None of the failures from the previous regtest run are still present, so I'm pretty confident this PR is passing all regtests now and is ready for review
I've finished viewing and commenting on all the files in this PR. Overall I think the changes look great and the comments are mostly minor. Thanks for taking on this work! I'll follow up with comments but please let me know if there's more I can do to help to move this PR along.
I ran a 972-member association (~100GB input data) through calwebb_image3 using this PR (an "on disk" library was used as input and a slightly older commit 26e543627d9ed8e28af6475f0f3fe7f8ab7862cf). The pipeline succeeded and the recorded memory usage (using memray) was as follows:
I will add this information and a link to the memory profile to the jira ticket. The peak memory usage was 50GB and this is largely due to the context array generated during resample. Importantly for this PR, at no point does the pipeline load all input data into memory.
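For readers who want to see why processing one model at a time matters for peak memory, here is a minimal sketch. It is not the profile from this run: it swaps in the standard library's tracemalloc instead of memray, and `make_image`, `load_all`, and `one_at_a_time` are hypothetical stand-ins for reading input data, not pipeline functions.

```python
import tracemalloc

N_IMAGES = 8
IMAGE_BYTES = 1_000_000


def make_image(n_bytes):
    # Hypothetical stand-in for reading one input file into memory.
    return bytearray(n_bytes)


def peak_bytes(func):
    # Run func under tracemalloc and return the peak traced allocation size.
    tracemalloc.start()
    try:
        func()
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return peak


def load_all():
    # Hold every image in memory at once.
    images = [make_image(IMAGE_BYTES) for _ in range(N_IMAGES)]
    return sum(len(image) for image in images)


def one_at_a_time():
    # Hold a single image at a time, releasing it before loading the next.
    total = 0
    for _ in range(N_IMAGES):
        image = make_image(IMAGE_BYTES)
        total += len(image)
        del image
    return total


if __name__ == "__main__":
    print("peak, all in memory:", peak_bytes(load_all))
    print("peak, one at a time:", peak_bytes(one_at_a_time))
```

In this toy case the one-at-a-time peak scales with a single image rather than with the number of inputs, which is the property the on-disk library is meant to preserve.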
testing just test_nircam_image.py here, because it takes forever on my local machine
Thanks again for all your work on this. I went through the files and left a new set of comments. It looks very close to done.
New round of regression tests started here after responding to comments from Brett's second review.
Edit: the file name issues recurred, so apparently the small bug setting pool and table names in ModelLibrary was not the cause. I need to put the manual setting of model.meta.filename back in. Starting another round here
Another round of regtests after all the nircam image and miri image failures are fixed is here. Fingers crossed that the only failures are the same ones that show up on the nightly runs
I believe that all the regression test failures are unrelated
Why is ModelContainer still in use? Is ModelLibrary not completely replacing ModelContainer? Is it only for backward compatibility to support ModelContainer inputs?
We decided to only replace container with library in the calwebb_image3 pipeline for now, where the memory performance gains are most important. There are other complications with replacing this for the spectroscopic modes too, e.g., the SourceModelContainer class that uses ModelContainer as a parent class needs to be replaced also
Another round of regression tests has been started here after fixes per Melanie's review.
Edit: lots of failing tests, but they match last night's nightly run
Resolves JP-3690
Resolves JP-3619
Resolves JP-3620
Resolves JP-3621 (see JP-3707 for additional work on resample memory usage beyond the scope of this ticket)
Resolves JP-3498
Work is under the epic JP-3602.
Fixes #8649
Fixes #8478
Fixes #8479
Fixes #8480
Fixes #8164
This PR is part of a larger effort to improve memory usage throughout the pipeline. Here, the AbstractModelLibrary class in stpipe is subclassed for JWST, and the pipeline steps that form the calwebb_image3 pipeline are updated to use ModelLibrary instead of ModelContainer. This helps ensure that each pipeline step runs the same way whether models are loaded into memory or saved to disk.
When ModelLibrary is used with on_disk=True, memory usage is lower, both for individual pipeline steps and for calwebb_image3 as a whole. An analysis of memory usage for each step can be found in the linked tickets JP-3619, JP-3620, and JP-3621. An analysis of the memory usage as a whole, when run on a large dataset, can be found in JP-3690.
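To illustrate the idea of step code that runs the same way whether models are in memory or on disk, here is a minimal, self-contained sketch. `ToyLibrary` is a hypothetical stand-in written for this example and is not the stpipe or jwst implementation. The `borrow`/`shelve` method names and the `modify` flag are assumptions chosen for illustration, and plain dicts stand in for datamodels.

```python
import pickle
import tempfile
from pathlib import Path


class ToyLibrary:
    """Hypothetical stand-in for a model library; not the stpipe/jwst code."""

    def __init__(self, models, on_disk=False):
        self._on_disk = on_disk
        self._n = len(models)
        self._borrowed = set()
        if on_disk:
            # Write each model to a temporary file; keep no in-memory copy.
            self._tmp = tempfile.TemporaryDirectory()
            self._paths = []
            for i, model in enumerate(models):
                path = Path(self._tmp.name) / f"model_{i}.pkl"
                path.write_bytes(pickle.dumps(model))
                self._paths.append(path)
        else:
            self._models = list(models)

    def __len__(self):
        return self._n

    def borrow(self, index):
        # Hand out one model; each index may be borrowed only once at a time.
        if index in self._borrowed:
            raise RuntimeError(f"model {index} is already borrowed")
        self._borrowed.add(index)
        if self._on_disk:
            return pickle.loads(self._paths[index].read_bytes())
        return self._models[index]

    def shelve(self, model, index, modify=True):
        # Return a model; with modify=True the stored copy is replaced.
        if index not in self._borrowed:
            raise RuntimeError(f"model {index} was not borrowed")
        self._borrowed.discard(index)
        if not modify:
            return
        if self._on_disk:
            self._paths[index].write_bytes(pickle.dumps(model))
        else:
            self._models[index] = model


def add_offset(library, offset):
    # A toy "step": identical code for both storage modes.
    for i in range(len(library)):
        model = library.borrow(i)
        model["data"] = [value + offset for value in model["data"]]
        library.shelve(model, i)


def collect(library):
    # Read-only pass over the library; nothing is written back.
    out = []
    for i in range(len(library)):
        model = library.borrow(i)
        out.append(model["data"])
        library.shelve(model, i, modify=False)
    return out
```

In this sketch the step function never checks the storage mode, and in on-disk mode only the borrowed model is held in memory at any time.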
Checklist for PR authors (skip items if you don't have permissions or they are not applicable)
CHANGES.rst within the relevant release section