use resample many_to_many in outlier detection

braingram commented 4 months ago

This PR changes outlier detection to use many_to_many when resampling during median calculation. This PR produces 1 expected difference in regression test results. https://plwishmaster.stsci.edu:8081/job/RT/job/Roman-Developers-Pull-Requests/805/

With the current code on main the test_level3_mos_pipeline test runs an association with 3 models (each from a different exposure) through the mosaic pipeline. Setting a breakpoint just prior to the call to create_median: https://github.com/spacetelescope/romancal/blob/ed6187ffa790fc6a37dd96f7365ba1ef8261fa74/romancal/outlier_detection/outlier_detection.py#L108-L111 we can see that len(drizzled_models) == 1. This conflicts with the algorithm description in the docs, which states:

> Each dither position will result in a separate grouped mosaic, so only a single exposure ever contributes to each pixel in these mosaics.

Looking at the outlier detection unit tests I don't see one that both:

runs the step with resample_data=True
introduces outliers

The only unit test that reaches: https://github.com/spacetelescope/romancal/blob/e559890ee4679835b11e98e17f33ac819733176d/romancal/outlier_detection/outlier_detection.py#L86-L88 is test_skymatch_always_returns_modelcontainer_with_updated_datamodels which doesn't introduce any outliers and runs the step with all 0 data.

Changing test_outlier_do_detection_find_outliers (which introduces and detects outliers) to use resampling is insufficient to show the impact of this PR. Without this PR, the introduced CRs (value=1E5) are 'drizzled' together with empty pixels from the other image (value=0). This produces a single drizzled_model with a value of 5E4 for each CR. The median (N=1) produces the same value, and when it is blotted back to the input image WCS the 5E4 values are far enough below the 1E5 values of the CRs to allow them to be detected. This is true for any value used for the CR (since the input images are noiseless with all 0 error).
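The arithmetic above can be sketched with plain NumPy. This is a minimal illustration, not romancal code: it assumes the drizzle step behaves like an equal-weight mean of overlapping inputs (consistent with the 5E4 quoted above), and the variable names are made up for the example. The three-group case mirrors the updated unit test rather than the original two-image test.

```python
import numpy as np

cr_value = 1e5  # CR value used in the unit test discussed above

# Values seen at one CR pixel: image 0 has the CR, the others are empty.
pixel_stack = np.array([cr_value, 0.0, 0.0])

# many_to_one: the inputs are combined into a single mosaic first.
# ASSUMPTION: equal-weight mean of two overlapping inputs (real drizzle
# weighting is more involved).
combined = pixel_stack[:2].mean()
median_of_one = np.median([combined])  # a median over N=1 changes nothing

# many_to_many: one mosaic per group, then a median across the groups.
median_across_groups = np.median(pixel_stack)

print(median_of_one, median_across_groups)
```

In the first case the CR is only diluted to half its value in the "median" image; in the second it is rejected outright because the other two groups outvote it.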

Furthermore the test appears to be checking that all introduced CRs (even those introduced in img2) are flagged in img1. The check at https://github.com/spacetelescope/romancal/blob/e559890ee4679835b11e98e17f33ac819733176d/romancal/outlier_detection/tests/test_outlier_detection.py#L253 returns 10 flagged CRs:

(array([ 5, 15, 25, 35, 45, 65, 75, 85, 95, 99]), array([45, 25, 25,  5, 85, 65, 65,  5, 45,  5]))

whereas only 5 were introduced into img1: https://github.com/spacetelescope/romancal/blob/e559890ee4679835b11e98e17f33ac819733176d/romancal/outlier_detection/tests/test_outlier_detection.py#L204-L206

Switching single (so that many_to_many is now used) revealed a few other issues, including:

drizzled models overwriting each other due to filenames sharing the same base
drizzled models output including only the last used group
resample background correction check not matching the check in resample

The updated unit test in this PR uses 3 images with the first 2 having CRs, each having 1 "source" and checks that:

CRs added to image 0 are flagged in image 0
CRs added to image 1 are flagged in image 1
no CRs are flagged in image 2
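The three checks listed above can be illustrated with a toy version of the test. This is only a sketch under stated assumptions: `flag_outliers` is a hypothetical stand-in (a simple stack median plus threshold), not the romancal step, and the CR positions, source position, and threshold are invented for the example.

```python
import numpy as np


def flag_outliers(images, threshold):
    """Hypothetical stand-in: flag pixels far above the per-pixel stack median."""
    stack = np.stack(images)
    median = np.median(stack, axis=0)
    return (stack - median) > threshold


shape = (100, 100)
images = [np.zeros(shape) for _ in range(3)]

# One "source" at the same position in every image (assumed layout).
for img in images:
    img[50, 50] = 10.0

# CRs only in the first two images, at made-up positions.
crs = {0: [(5, 45), (15, 25)], 1: [(65, 65), (85, 5)]}
for idx, coords in crs.items():
    for y, x in coords:
        images[idx][y, x] = 1e5

flags = flag_outliers(images, threshold=1e3)

for idx in range(3):
    found = {(int(y), int(x)) for y, x in zip(*np.nonzero(flags[idx]))}
    # Each image should contain exactly the CRs that were added to it.
    assert found == set(crs.get(idx, []))
```

Checking each image against only its own CRs is the key difference from the earlier assertion, which compared every introduced CR against a single image.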

Checklist

[ ] added entry in CHANGES.rst under the corresponding subsection
[ ] updated relevant tests
[ ] updated relevant documentation
[ ] updated relevant milestone(s)
[ ] added relevant label(s)
[ ] ran regression tests, post a link to the Jenkins job below. How to run regression tests on a PR

codecov[bot] commented 4 months ago

Codecov Report

Attention: Patch coverage is 66.66667% with 2 lines in your changes missing coverage. Please review.

Project coverage is 79.30%. Comparing base (79d3a30) to head (b923030). Report is 193 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| romancal/resample/resample.py | 66.66% | 2 Missing :warning: |

Additional details and impacted files

```diff
@@            Coverage Diff             @@
##             main    #1260      +/-   ##
==========================================
+ Coverage   79.24%   79.30%   +0.06%
==========================================
  Files         117      117
  Lines        8075     8065      -10
==========================================
- Hits         6399     6396       -3
+ Misses       1676     1669       -7
```

| [Flag](https://app.codecov.io/gh/spacetelescope/romancal/pull/1260/flags?src=pr&el=flags&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=spacetelescope) | Coverage Δ | | *Carryforward flag |
|---|---|---|---|
| [nightly](https://app.codecov.io/gh/spacetelescope/romancal/pull/1260/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=spacetelescope) | `62.78% <ø> (ø)` | | Carriedforward from [79d3a30](https://app.codecov.io/gh/spacetelescope/romancal/commit/79d3a30f7a751e9b80721d269d3f733e28bcd5f0?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=spacetelescope) |

*This pull request uses carry forward flags. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=spacetelescope) to find out more.


schlafly commented 4 months ago

This looks fine to me, but @mairanteodoro and you should discuss, and I suspect that there are related unit tests that will need updating.

braingram commented 4 months ago

Thanks @schlafly!

@mairanteodoro and I have a meeting tomorrow to discuss the sky subtraction (which is partially handled in this PR and partially in https://github.com/spacetelescope/romancal/pull/1233). After updating the flag_outlier unit test for this PR I don't think the sky subtraction was the only issue so there are a number of changes in this PR.

I'll open this PR for review once the unit and regtests finish (I expect at least the mosaic regtest to fail and any others that use outlier detection).

schlafly commented 4 months ago

Presumably the regtests change meaningfully with this PR; would you mind including something like the new image, old image, and the difference? Thank you!

braingram commented 4 months ago

> Presumably the regtests change meaningfully with this PR; would you mind including something like the new image, old image, and the difference? Thank you!

I think the only regtest that is impacted is the mosaic pipeline test. This uses only 3 images and produces very minor differences in the number of detected outliers. With main:

2024-06-04 16:06:53,489 - stpipe.MosaicPipeline.outlier_detection - INFO - New pixels flagged as outliers: 6373 (0.04%)
2024-06-04 16:06:54,670 - stpipe.MosaicPipeline.outlier_detection - INFO - New pixels flagged as outliers: 11289 (0.07%)
2024-06-04 16:06:55,898 - stpipe.MosaicPipeline.outlier_detection - INFO - New pixels flagged as outliers: 6441 (0.04%)

with this PR:

2024-06-04 16:01:20,422 - stpipe.MosaicPipeline.outlier_detection - INFO - New pixels flagged as outliers: 6144 (0.04%)
2024-06-04 16:01:21,795 - stpipe.MosaicPipeline.outlier_detection - INFO - New pixels flagged as outliers: 15462 (0.09%)
2024-06-04 16:01:22,888 - stpipe.MosaicPipeline.outlier_detection - INFO - New pixels flagged as outliers: 6151 (0.04%)
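To put numbers on the two sets of log lines above, here is a small sketch that pulls the flagged-pixel counts out of the log text and compares them per image. The regular expression and variable names are just for this illustration.

```python
import re

main_log = """\
2024-06-04 16:06:53,489 - stpipe.MosaicPipeline.outlier_detection - INFO - New pixels flagged as outliers: 6373 (0.04%)
2024-06-04 16:06:54,670 - stpipe.MosaicPipeline.outlier_detection - INFO - New pixels flagged as outliers: 11289 (0.07%)
2024-06-04 16:06:55,898 - stpipe.MosaicPipeline.outlier_detection - INFO - New pixels flagged as outliers: 6441 (0.04%)
"""

pr_log = """\
2024-06-04 16:01:20,422 - stpipe.MosaicPipeline.outlier_detection - INFO - New pixels flagged as outliers: 6144 (0.04%)
2024-06-04 16:01:21,795 - stpipe.MosaicPipeline.outlier_detection - INFO - New pixels flagged as outliers: 15462 (0.09%)
2024-06-04 16:01:22,888 - stpipe.MosaicPipeline.outlier_detection - INFO - New pixels flagged as outliers: 6151 (0.04%)
"""

pattern = re.compile(r"New pixels flagged as outliers: (\d+)")


def flagged_counts(log_text):
    """Return the flagged-pixel count from each matching log line, in order."""
    return [int(n) for n in pattern.findall(log_text)]


main_counts = flagged_counts(main_log)
pr_counts = flagged_counts(pr_log)

# Per-image change (PR minus main) and the change in the total.
deltas = [pr - main for main, pr in zip(main_counts, pr_counts)]
total_delta = sum(pr_counts) - sum(main_counts)

print(deltas, total_delta)
```

The first and third images lose a couple hundred flagged pixels each, while the second gains a few thousand.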

Overall I don't think the output change is very meaningful given the current regression tests. Here is the stpreview by <fn> 16 16 output for the truth file (image: truth) and the new output with this PR (image: new).

schlafly commented 3 months ago

I agree that I can't see anything in the preview images; the stars are too small and just look like hot pixels. I don't actually think the L2 files are that problematic; some of the structure in the background is from the non-linearity reference files having issues. The spatial gradient is a little unexpected but is probably very low amplitude. By eye it doesn't look to me like we're masking all of the stars, for example, but have you actually looked at the generated outlier masks?

braingram commented 3 months ago

> I agree that I can't see anything in the preview images; the stars are too small and just look like hot pixels. I don't actually think the L2 files are that problematic; some of the structure in the background is from the non-linearity reference files having issues. The spatial gradient is a little unexpected but is probably very low amplitude. By eye it doesn't look to me like we're masking all of the stars, for example, but have you actually looked at the generated outlier masks?

I looked at the outlier masks and there aren't large differences. Since main has changed since I made those comparisons, they'll need to be updated. The mosaic test only uses 3 images, so the median across groups (many_to_many) and the drizzled combination of all groups (many_to_one) produce similar "median" data.

braingram commented 3 months ago

@schlafly Is there more you'd like to see before merging?

I re-ran the regtests here: https://plwishmaster.stsci.edu:8081/blue/organizations/jenkins/RT%2FRoman-Developers-Pull-Requests/detail/Roman-Developers-Pull-Requests/816/tests

Looking at the output image (a cropped region [200:800, 400:1000]) the truth file shows the following (log scaled):

With this PR the image is similar but with a few fewer CRs:

braingram commented 3 months ago

@schlafly Is there anything else you'd like to see for this PR?

schlafly commented 3 months ago

No, thanks, I thought I approved, thank you!

spacetelescope / romancal

use resample many_to_many in outlier detection #1260
