spacetelescope / jwst

Python library for science observations from the James Webb Space Telescope
https://jwst-pipeline.readthedocs.io/en/latest/

Clean up unnecessary copies #8673

Open stscijgbot-jp opened 1 month ago

stscijgbot-jp commented 1 month ago

Issue JP-3695 was created on JIRA by Melanie Clarke:

While working on JP-3610, we noted that pipeline steps sometimes make unnecessary copies: for example, the input data is copied at the top of the step, and then copied again in the core algorithm when processing begins.

To improve performance, we should review all pipeline steps and make sure that only necessary copies are made.
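The pattern described above can be illustrated with a minimal sketch. Everything in it is hypothetical: `ToyModel`, `step_redundant`, `step_single_copy`, and the two `core_*` functions are stand-ins invented for illustration and are not part of the jwst package. The toy model simply counts how often it is deep-copied.

```python
import copy


class ToyModel:
    """Hypothetical stand-in for a pipeline data model (not the jwst API)."""

    copies_made = 0  # class-level counter of deep copies, for illustration

    def __init__(self, data):
        self.data = data

    def __deepcopy__(self, memo):
        ToyModel.copies_made += 1
        return ToyModel(list(self.data))


def core_with_copy(model):
    # The core algorithm defensively copies again: redundant if the
    # caller already handed over a private copy.
    result = copy.deepcopy(model)
    result.data = [value + 1 for value in result.data]
    return result


def core_in_place(model):
    # The core algorithm trusts the caller and works on what it is given.
    model.data = [value + 1 for value in model.data]
    return model


def step_redundant(input_model):
    working = copy.deepcopy(input_model)  # copy at the top of the step
    return core_with_copy(working)        # second copy inside the core


def step_single_copy(input_model):
    working = copy.deepcopy(input_model)  # the only copy
    return core_in_place(working)
```

Both variants leave the caller's input untouched and return the same result; the second does so with one copy instead of two, which is the kind of saving the review is meant to find.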

stscijgbot-jp commented 1 month ago

Comment by Maria Pena-Guerrero on JIRA:

Working on cleanup for steps in Detector1 in #8676

stscijgbot-jp commented 1 month ago

Comment by Maria Pena-Guerrero on JIRA:

The steps in Image3 will not be changed as part of this ticket.

stscijgbot-jp commented 1 week ago

Comment by Maria Pena-Guerrero on JIRA:

I ran a couple of tests with MIRI image files, since these are the most affected by the memory increase. One file is part of our regression tests, jw00001001001_01101_00001_mirimage_uncal.fits, and is about 1 GB in size. On both master and the branch, this file took about 3 min to finish and the maximum memory used was about 11 GB. However, the branch run is slightly faster both to finish and to reach its maximum memory usage. Plots:

branch

!memory_det1_branch_1GB_3min_max11GB.png!

master

!memory_det1_master_1GB_3min_max11GB.png!

The other file was jw01283001001_03101_00001_mirimage_uncal.fits, which has a size of 2.64 GB. On master, this file took 60 min to run with a maximum memory usage of about 25 GB. On the branch, the file took 25 min to run with a maximum memory usage of about 22 GB. Plots:

branch

!memory_det1_new_branch_2GB.png!

master

!memory_det1_new_master_2GB.png!
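For a rough sense of scale, the figures quoted above for the 2.64 GB file can be turned into ratios. The sketch below only does arithmetic on the approximate numbers in this comment (60 min vs. 25 min, about 25 GB vs. about 22 GB); the variable names are illustrative, and the results are only as precise as those rounded measurements.

```python
# Approximate values quoted in the comment above for the 2.64 GB file.
master_minutes, branch_minutes = 60, 25
master_peak_gb, branch_peak_gb = 25, 22

speedup = master_minutes / branch_minutes
time_saved_pct = 100 * (master_minutes - branch_minutes) / master_minutes
memory_saved_pct = 100 * (master_peak_gb - branch_peak_gb) / master_peak_gb

print(f"speedup: {speedup:.1f}x")                          # speedup: 2.4x
print(f"run time reduced by {time_saved_pct:.0f}%")        # run time reduced by 58%
print(f"peak memory reduced by {memory_saved_pct:.0f}%")   # peak memory reduced by 12%
```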

stscijgbot-jp commented 1 week ago

Comment by Maria Pena-Guerrero on JIRA:

For completeness, here is the pip freeze of my testing environment:

[^pip_freeze.txt]

stscijgbot-jp commented 5 days ago

Comment by Maria Pena-Guerrero on JIRA:

I ran another test with the file I am using to write a memory regression test: jw01024001001_04101_00001_mirimage_uncal.fits

The branch is still slightly faster.

branch !memory_det1_small_branch.png|thumbnail!

master !memory_det1_small_master.png|thumbnail!
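A memory regression test of the kind mentioned above could take roughly the following shape. This is a minimal sketch under stated assumptions, not the test being written for this ticket: `toy_step` is a hypothetical stand-in for a pipeline run, the budget is an arbitrary illustrative number, and the standard-library `tracemalloc` module only sees allocations made through Python's memory allocators, so it is not a measure of total process memory.

```python
import tracemalloc


def peak_traced_bytes(func, *args, **kwargs):
    """Return the peak number of bytes traced by tracemalloc while func runs."""
    tracemalloc.start()
    try:
        func(*args, **kwargs)
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return peak


def toy_step(n_bytes, extra_copies):
    """Hypothetical workload: one buffer plus a number of full copies of it."""
    data = bytearray(n_bytes)
    copies = [bytes(data) for _ in range(extra_copies)]
    return len(copies)


def test_peak_memory_within_budget():
    n_bytes = 1_000_000
    budget = 2.5 * n_bytes  # hypothetical budget: the buffer plus one copy, with headroom
    peak = peak_traced_bytes(toy_step, n_bytes, 1)
    assert peak <= budget, f"peak {peak} bytes exceeds budget {budget}"
```

With this budget, the toy workload passes when it makes one copy and would fail if a second, unnecessary copy crept back in, which is the regression such a test is meant to catch.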