spacetelescope / jwst

Python library for science observations from the James Webb Space Telescope
https://jwst-pipeline.readthedocs.io/en/latest/

JP-3570: Match NaNs and DO_NOT_USE flags #8557

Closed: melanieclarke closed this 3 weeks ago

melanieclarke commented 3 months ago

Resolves JP-3570

Closes #8345

Make a function that keeps a datamodel's NaNs and DO_NOT_USE flags in sync: it sets NaN in the data, error, and variance arrays wherever the DO_NOT_USE flag is set in the dq array, and sets DO_NOT_USE in the dq array wherever a NaN is found in the data, error, or variance arrays.
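For illustration, a minimal sketch of that sync logic (not the exact implementation in this PR; the array list and the shape guard against missing extensions are assumptions):

```python
# A minimal sketch of the NaN/DO_NOT_USE sync described above; not the
# exact implementation in this PR. The array list and the shape guard
# against missing or mismatched extensions are assumptions.
import numpy as np
from jwst.datamodels import dqflags

DO_NOT_USE = dqflags.pixel["DO_NOT_USE"]

def sync_nans_and_flags(model):
    # Collect whichever science/error/variance arrays exist on this model
    arrays = []
    for name in ("data", "err", "var_poisson", "var_rnoise", "var_flat"):
        array = getattr(model, name, None)
        if array is not None and array.shape == model.dq.shape:
            arrays.append(array)

    # DQ side: flag any pixel that is NaN in any of the arrays
    invalid = np.zeros(model.dq.shape, dtype=bool)
    for array in arrays:
        invalid |= np.isnan(array)
    model.dq[invalid] |= DO_NOT_USE

    # Data side: blank every array wherever DO_NOT_USE is set
    do_not_use = (model.dq & DO_NOT_USE).astype(bool)
    for array in arrays:
        array[do_not_use] = np.nan
```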

Call the function on output data at the end of several key steps where DO_NOT_USE might be set:

Also call it on input data at the beginning of a couple key steps where data are combined:


codecov[bot] commented 3 months ago

Codecov Report

Attention: Patch coverage is 88.00000% with 9 lines in your changes missing coverage. Please review.

Project coverage is 60.83%. Comparing base (8381a26) to head (918a428). Report is 4 commits behind head on master.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| jwst/pathloss/pathloss.py | 16.66% | 5 Missing :warning: |
| jwst/flatfield/flat_field.py | 80.00% | 1 Missing :warning: |
| jwst/lib/pipe_utils.py | 97.43% | 1 Missing :warning: |
| jwst/outlier_detection/ifu.py | 50.00% | 1 Missing :warning: |
| jwst/resample/resample_spec_step.py | 85.71% | 1 Missing :warning: |
Additional details and impacted files:

```diff
@@            Coverage Diff             @@
##           master    #8557      +/-   ##
==========================================
+ Coverage   60.63%   60.83%   +0.19%
==========================================
  Files         372      373       +1
  Lines       38372    38636     +264
==========================================
+ Hits        23267    23503     +236
- Misses      15105    15133      +28
```


melanieclarke commented 3 months ago

Started regression tests here: https://plwishmaster.stsci.edu:8081/job/RT/job/JWST-Developers-Pull-Requests/1552/

I predict many new NaNs!

melanieclarke commented 3 months ago

Now that we are past build 11.0, I think this is ready for discussion and testing.

Regression test results:

melanieclarke commented 2 months ago

Thanks for reviewing @emolter.

> is the idea that the NaNs in the data and variance arrays are always synced with the DO_NOT_USE flags in the DQ array? If so, should we consider making some kind of check for this generic to all pipeline steps? I guess it isn't obvious to me that the steps where you added this are the major/only places this would be helpful.

I think that's the idea, but I was reluctant to add it to every single pipeline step individually in the first pass. I added it to steps that I know either modify DO_NOT_USE or set NaNs in data arrays. I can add it to more if needed.

I didn't consider whether it could or should be done more generically, in a core function or class. It does need some knowledge of the output model composition, so that it does the updates at the right level. Do you have a better sense of where that might go?

> I'm glad you checked edge cases where certain extensions are not found - not sure I would have thought of that - but why is the preferred behavior in those cases to return a copy of the datamodel instead of failing?

I was thinking that it should be as minimally invasive as possible, so it can be called on all kinds of datamodels, without assuming that they have all the extensions at all times. As is, it could be used on partial copies of datamodels as well as on the final products that should have all the expected extensions.

emolter commented 2 months ago

> I didn't consider whether it could or should be done more generically, in a core function or class. It does need some knowledge of the output model composition, so that it does the updates at the right level. Do you have a better sense of where that might go?

I don't know if I have the expertise or experience to answer the *should* question, but I am sure it could be implemented. If so, I'd suggest putting it somewhere in jwst.stpipe, maybe jwst.stpipe.utils, and calling it from the core JWST Step class. It reminds me a bit of the query_step_status function we just implemented, in terms of needing knowledge of the model composition - is there anything more complicated here than what query_step_status does? That is, do something to the model if it's not a Sequence, and loop over all models in the sequence if it is?

melanieclarke commented 2 months ago

I looked into making this change more general today. It looks like it would be straightforward to add a call to match_nans_and_flags to the finalize_result function in the JwstStep class, with iteration handling for MultiSlitModel and ModelContainer results. This would replace all the places that this function is called in this PR, except that it should still be called on the inputs to the resampling and cube build steps.
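Schematically, the dispatch called from JwstStep.finalize_result would look something like this (a sketch only, not code from this PR; it assumes match_nans_and_flags takes a single datamodel and updates it in place):

```python
# A sketch of the generic hook discussed above, not code from this PR.
# Assumes match_nans_and_flags (jwst/lib/pipe_utils.py on this branch)
# accepts a single datamodel and updates it in place.
from jwst import datamodels
from jwst.lib.pipe_utils import match_nans_and_flags

def _sync_step_result(result):
    if isinstance(result, datamodels.ModelContainer):
        # Recurse so nested containers are handled too
        for model in result:
            _sync_step_result(model)
    elif isinstance(result, datamodels.MultiSlitModel):
        for slit in result.slits:
            match_nans_and_flags(slit)
    else:
        match_nans_and_flags(result)
```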

However, this opens an even bigger can of worms than the existing changes, since every step in every pipeline would call the function, and not all of the datamodel outputs have the data the current function expects. For example, RampModels have sci and err extensions, but not dq -- they have pixeldq and groupdq. And the output from the ramp_fit step has all the right extensions with matching shapes, but var_flat is not written to the output because it's all zero. With this change, var_flat is updated with NaNs to match the other extensions, and is then written out, even though the flat_field step never runs in detector1. Also, spectral data models have a different structure than Image/Cube/Slit models and would need separate handling if we wanted to meaningfully update them.

I'll bring this up in the ticket for guidance.

melanieclarke commented 2 months ago

Per @drlaw1558 - we should keep the implementation limited to these specific steps for now, to avoid scope creep and possible unintended consequences.

melanieclarke commented 1 month ago

Thanks, @drlaw1558 for coordinating the testing! I'll rebase and fix up the conflicts with later development and see if we can get this in.

melanieclarke commented 1 month ago

Outlier detection changed significantly after I initially wrote this PR, so I'm re-requesting review from @emolter and @braingram, if you have a chance.

Re-run for regression tests is here: https://github.com/spacetelescope/RegressionTests/actions/runs/10638921459

Re-re-run on Jenkins, since the number of failures overwhelmed the report in GitHub actions: https://plwishmaster.stsci.edu:8081/job/RT/job/JWST-Developers-Pull-Requests/1682/

melanieclarke commented 4 weeks ago

~~Regression test results are inconclusive. There are many new failures introduced by this branch, as expected, but it looks like there are also some unrelated failures that need okifying. It's hard to tell which of the 135 failures are which.~~

~~I will run again when things are more stable.~~

Edit: I'm looking at the right nightly results now and can see which changes are unrelated. Reviewing now.

melanieclarke commented 4 weeks ago

I think the regression test output is as expected, compared to the previous run.

One thing I noticed in addition to the differences noted above: i2d and crf output for image3 is sometimes different (e.g. test_miri_image3_i2d) because tweakreg is seeing a different set of sources to cross-match on. It might be worth looking into how the star finder is handling NaNs: I wouldn't have expected to see much difference here, since the DO_NOT_USE mask is already passed.
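As a starting point for that follow-up, a hypothetical check along these lines (assuming photutils' DAOStarFinder, one of the finders tweakreg can use; this is not the step's actual detection code, and the threshold/fwhm values are placeholders) would confirm that NaNs are folded into the detection mask:

```python
# Hypothetical check, not the pipeline's actual detection code. Assumes
# photutils' DAOStarFinder (one of the finders tweakreg can use) and the
# standard DO_NOT_USE bit; threshold/fwhm values are placeholders.
import numpy as np
from photutils.detection import DAOStarFinder
from jwst.datamodels import dqflags

DO_NOT_USE = dqflags.pixel["DO_NOT_USE"]

def find_sources(model, threshold=5.0, fwhm=2.5):
    # Fold NaNs into the mask alongside DO_NOT_USE pixels so the finder
    # never treats them as signal
    mask = (model.dq & DO_NOT_USE).astype(bool) | np.isnan(model.data)
    finder = DAOStarFinder(threshold=threshold, fwhm=fwhm)
    return finder(model.data, mask=mask)
```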

melanieclarke commented 4 weeks ago

> It looks like match_nans_and_flags will not get called during the outlier detection step if resample_data is False for imaging or spectroscopic modes. Is that true, and if so, is that the intended behavior?

It should still be called in that case, via flag_model_crs. I'll test.

drlaw1558 commented 4 weeks ago

Hm, taking a quick look at the MIRI imaging case you linked above. I'm getting results that are different from the Artifactory reference data, but not because of this PR (results I get with jwst master seem identical to what I get with this branch). MIRI just changed a lot of reference files, though, including all of the darks, and those may be the reason for the difference.

melanieclarke commented 4 weeks ago

@emolter - confirmed, it's definitely called for image and spec outlier detection when resample_data is False.

@drlaw1558 - the new ref files are definitely confusing things in comparing to the current truth files, but I think there is additionally a change in tweakreg due to this branch. I ran the regtest locally with the same CRDS context on the current main and PR branches and compared the log output.

On main, tweakreg reports:

```
Detected 171 sources in jw01024001001_04101_00002_mirimage_cal.fits.
Detected 175 sources in jw01024001001_04101_00003_mirimage_cal.fits.
Detected 167 sources in jw01024001001_04101_00004_mirimage_cal.fits.
Detected 150 sources in jw01024001001_04101_00001_mirimage_cal.fits.
...
Final solution based on 111 objects.
```

On this branch, tweakreg reports:

```
Detected 155 sources in jw01024001001_04101_00002_mirimage_cal.fits.
Detected 161 sources in jw01024001001_04101_00003_mirimage_cal.fits.
Detected 153 sources in jw01024001001_04101_00004_mirimage_cal.fits.
Detected 149 sources in jw01024001001_04101_00001_mirimage_cal.fits.
...
Final solution based on 109 objects.
```

It's not a big change, and I don't think it should hold up this PR, but we might want to follow up on it later to make sure the star finder isn't handling NaNs badly.

melanieclarke commented 3 weeks ago

Regression tests one more time, to test the changes with the ModelLibrary work: https://plwishmaster.stsci.edu:8081/job/RT/job/JWST-Developers-Pull-Requests/1689/

melanieclarke commented 3 weeks ago

Cross-matching the regression test results against the previous run, I don't see anything new or unexpected, so I think this is ready to go.

@tapastro, over to you!

melanieclarke commented 3 weeks ago

Okifying is done. Let me know if you see problems.