penaguerrero closed this 4 months ago
I'm not sure why, but when I run with this branch it's causing a crash in the gain_scale step with 'datamodels' not defined. Doesn't make sense to me, as this PR didn't change that step, but it doesn't happen on the latest master branch.
Weirdly, the list of files changed on this PR doesn't include gain_scale_step.py, but when I check out the branch the code does change, removing 'from stdatamodels.jwst import datamodels'. Why doesn't this change show up in the list of files changed?
EDIT: I see why; the relevant branch is jp3592, not the speedupemicorr branch, which is a little confusing.
Attention: Patch coverage is 91.30435% with 2 lines in your changes missing coverage. Please review.
Project coverage is 59.56%. Comparing base (23fb5cd) to head (b0fced2).
Files with missing lines | Patch % | Lines |
---|---|---|
jwst/emicorr/emicorr.py | 91.30% | 2 Missing :warning: |
:umbrella: View full report in Codecov by Sentry.
Yes, sorry about that naming confusion. While working on this I made a workaround in order to test that large file you pointed me to (which was crashing Detector1 on my computer). The workaround saves memory usage across the whole Detector1 pipeline.
It looks like this PR currently reduces the runtime from 12 minutes to 7 minutes for one of the long example cases that I've been testing (jw01283001001_03101_00001_mirimage), and most of the remaining time is spent in the np.where call within the loop over nbins.
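A minimal sketch of the bottleneck described above, with made-up shapes and names rather than the actual emicorr variables: an `np.where` selection inside the loop over bins scans every integration plane on every iteration, while restricting the search once to only the populated planes does the same work on a much smaller array.

```python
import numpy as np

nbins, nints, npix = 100, 50, 10_000

# Illustrative stand-in for phaseall: only the first few integration
# planes carry data, the rest are NaN (not the real data layout).
phaseall = np.full((nints, npix), np.nan)
phaseall[:5] = np.random.default_rng(0).uniform(0, nbins, (5, npix))

def bin_means_full(arr):
    # Scans all nints planes on every one of the nbins iterations.
    return [arr[np.where((arr >= b) & (arr < b + 1))].mean()
            for b in range(nbins)]

def bin_means_restricted(arr):
    # Drop the unpopulated planes once, then loop over the small array.
    populated = arr[np.isfinite(arr).any(axis=1)]
    return [populated[(populated >= b) & (populated < b + 1)].mean()
            for b in range(nbins)]

# Both give the same per-bin means; comparisons with NaN are False,
# so the NaN planes never enter either selection.
assert np.allclose(bin_means_full(phaseall), bin_means_restricted(phaseall))
```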
I've just filed https://github.com/penaguerrero/jwst/pull/2 to help address this by restricting the array to only the integration planes for which information has been populated in phaseall. It also replaces use of the custom sigma-clipping routine with the astropy sigma_clipped_stats routine, avoiding the need for a search for finite values.
This further reduces the overall runtime to about 1 minute total, which means we should now be getting about a factor of ten improvement from where we were before.
Edit: This PR now also fixes a bug noticed by Eddie in which the correction wasn't being applied to anything other than the first integration. Fixing this increases the runtime again to about 2 minutes, which is still acceptable given the bug fix.
thank you @drlaw1558, I was working on a way around that np.where call too, but this is faster
LGTM now; over to you @hbushouse
Has a regtest run been done on this recently?
Started another regtest run after updating truth files for unrelated changes: https://plwishmaster.stsci.edu:8081/job/RT/job/JWST-Developers-Pull-Requests/1565
Latest results show failures/differences only in MIRI tests that include the Detector1 pipeline, and hence the emicorr step, as expected. Many of the differences are very small, but some are significant. We need to determine whether the larger differences are expected and acceptable.
@hbushouse I just spot checked the lrs_slitless and imaging failure cases by running them locally with/without these changes. The differences look ok to me; they are the expected changes from emicorr now actually removing the EMI signal from integrations 2+, which in turn has some impact on whether or not CRs are detected in the ramp, sometimes affecting the jump flagging and the final signal. There are some larger differences at intermediate stages in the vicinity of detector artifacts, where the emicorr adjustment is fractionally much larger. Note to self: I think the MIRI team may need to update the bad pixel map to mask out these artifacts more completely.
OK, if you're happy, I'm happy.
By the way, I don't see any explicit tests of the emicorr step itself in any of the regtests. The only way we see differences due to the step is via changes to downstream results (i.e., in later Detector1 steps). We really need to update some of the MIRI regtest modules to include checking of the emicorr step output (e.g., test_miri_image.py and test_miri_lrs_slitless.py). But that can be done later in a separate PR.
Resolves JP-3592
Closes #8412
This PR reduces the runtime of emicorr by introducing a new parameter, use_n_cycles, which the user can modify and which defaults to 3. The user can now provide either the number of integrations to phase directly, or the number of cycles from which to calculate the number of integrations needed for phasing.
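As an illustration of the idea, the hypothetical helper below (not the actual emicorr code, and the parameter names besides use_n_cycles are invented) shows one way a number of cycles could be translated into a number of integrations to phase:

```python
def n_integrations_for_cycles(use_n_cycles, frames_per_cycle,
                              frames_per_integration, nints):
    """Hypothetical helper: the smallest number of integrations whose
    frames cover `use_n_cycles` EMI cycles, capped at the number of
    integrations actually in the exposure."""
    frames_needed = use_n_cycles * frames_per_cycle
    n_phase = -(-frames_needed // frames_per_integration)  # ceiling division
    return min(n_phase, nints)

# e.g. 3 cycles of 25 frames need 75 frames; at 10 frames per
# integration that requires 8 integrations (if the exposure has them).
print(n_integrations_for_cycles(3, 25, 10, 40))  # → 8
```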
Checklist for PR authors (skip items if you don't have permissions or they are not applicable):

- added entry in CHANGES.rst within the relevant release section