exposure pipeline stops processing association on first fully saturated input

braingram commented 3 days ago

If I generate 2 associations using 1 fully saturated and 1 non-saturated input in 2 different orders and run each through the ExposurePipeline, I get 2 different results:

for the association with the saturated model first, elp returns a list with a single model corresponding to the saturated input
for the association with the saturated model second, elp returns a list with 2 models corresponding to the unsaturated and then saturated input
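
The order dependence can be illustrated with a toy model. This is a minimal sketch, not romancal code: `run_with_early_exit` and the dict-based inputs are hypothetical stand-ins that only mimic returning as soon as a fully saturated input has been handled.

```python
# Toy stand-in for the behavior described above; NOT the real ExposurePipeline.
def run_with_early_exit(inputs):
    """Process inputs in order, stopping after the first fully saturated one."""
    results = []
    for item in inputs:
        results.append(item["name"])
        if item["fully_saturated"]:
            # Hypothetical early return: later inputs are never processed.
            return results
    return results


saturated = {"name": "saturated", "fully_saturated": True}
unsaturated = {"name": "unsaturated", "fully_saturated": False}

print(run_with_early_exit([saturated, unsaturated]))  # ['saturated']
print(run_with_early_exit([unsaturated, saturated]))  # ['unsaturated', 'saturated']
```

With the saturated input first the loop returns one result; with it second both inputs are returned, which matches the two outcomes listed above.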

I failed to find a description of this in the documentation.

Is it expected that the elp will stop processing inputs when it encounters the first saturated input?

schlafly commented 8 hours ago

Thanks Brett. @ddavis-stsci probably knows the history here best; I imagine that there is some requirement somewhere saying that we have to handle fully saturated inputs, and for those cases most of the pipeline will not do anything sane, so it makes sense to exit early.

IMO this should not happen very often; I feel like if a device is fully saturated we have done something very bad and are running the risk of damaging something.

Absent requirements, we should just take the simplest route to exit early for these exposures. I agree with you that it makes more sense to stop the individual fully saturated detector than to stop all of the detectors; I think we expect SDP to process one detector at a time anyway. I suspect that any tests we may have only operate on a single detector, and so this behavior could be changed without affecting existing tests.
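
A rough sketch of what "stop the individual detector, not all of them" could look like. Everything here is hypothetical (`process_each_input`, the dict inputs, and the step names are illustrative, not romancal API); it only shows the control flow of skipping the remaining steps for one input while continuing with the others.

```python
# Hypothetical sketch of a per-input early exit; names are illustrative only.
def process_each_input(inputs, steps):
    """Return one result per input, whatever the order of the inputs."""
    results = []
    for item in inputs:
        if item["fully_saturated"]:
            # Skip the remaining steps for this input only, then move on.
            results.append({"name": item["name"], "steps_run": []})
            continue
        results.append({"name": item["name"], "steps_run": list(steps)})
    return results


steps = ["later_step_a", "later_step_b"]  # placeholder step names
sat = {"name": "saturated", "fully_saturated": True}
unsat = {"name": "unsaturated", "fully_saturated": False}

first = process_each_input([sat, unsat], steps)
second = process_each_input([unsat, sat], steps)
print(len(first), len(second))  # 2 2
```

In this form every input yields a result, so passing a whole association and passing single files give the same per-input output.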

But because this is such a rare case we should focus on whatever the cleanest / easiest / lowest maintenance solution is, rather than trying to save many cycles here or something.

braingram commented 7 hours ago

Thanks. Skipping steps for a saturated input makes sense. The skipping of inputs was more surprising.

For processing one detector at a time, what will the input association look like (or will the elp only be given a single uncal file)? Skipping inputs might not always be a problem (if the expectation is that a single saturated uncal will always be followed by equally saturated uncals).

schlafly commented 7 hours ago

For the ELP I think the early steps are just run on the single uncal files in the current SDP implementation. I don't think we should code around that assumption---I'm mostly saying that that would be a sane implementation, and agreeing with you that it's awkward if that implementation gives different results than passing the whole association (except for tweakreg, where the behavior is conceptually different).

I am having trouble deciding when to expect fully saturated images, since as I mentioned, if this happens I worry it's instrument damaging. In my other experience this happens, e.g., when you accidentally open the shutter during the day, start doing twilight flats too early, or don't notice that there's a cloud illuminated by the setting sun during twilight making the twilight sky much brighter than usual.

In these kinds of instances it's likely that the whole array would be hit.

In real science observations we will end up putting very bright stars on the actual detectors and not their neighbors. It doesn't seem likely to me that that ever results in fully saturating the detectors. For example, when we pointed DECam at alpha Cen and blasted a big portion of one detector, much of that detector was fine. From a raw number of photons perspective Roman won't ever be hit more badly than that.

But ultimately I don't think we should assume very much about what fully saturated data will look like since it seems to me that if we ever see such data it will be a weird event. Maybe there's some kind of calibration data where they want to figure out what fully saturated is and they intentionally fully saturate the detector?? That wouldn't be science data we would want to reduce, though.

braingram commented 6 hours ago

Thanks. Perhaps INS has some input about if/when a fully saturated input might be expected (more on that below).

I traced some changes back to https://github.com/spacetelescope/romancal/pull/541 with a referenced ticket https://jira.stsci.edu/browse/RCAL-353 which mentions:

This was discussed with INS on July 7th. The decision is to stop processing right after the step that determines all pixels are saturated (saturation or ramp_fitting) and return a "cal.asdf" file with all zeros in the image array. It should be a level 2 ImageModel and the file should have the regular L2 "cal" suffix, in order to be ingested in the archive. All steps which are not executed should be marked as "SKIPPED" in the "cal_step" structure. There should be a descriptive log message indicating the reason processing was stopped.
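
The decision quoted above can be sketched as follows. This is an illustration under assumptions, not the romancal implementation: `ToyImageModel` and `make_zeroed_result` are hypothetical, the step names are placeholders, and the "COMPLETE" status for executed steps is assumed (the ticket text only specifies "SKIPPED").

```python
import logging
from dataclasses import dataclass, field

log = logging.getLogger("toy_elp")


@dataclass
class ToyImageModel:
    """Stand-in for a level 2 ImageModel; not the roman_datamodels class."""
    data: list
    cal_step: dict = field(default_factory=dict)
    filename: str = ""


def make_zeroed_result(n_rows, n_cols, completed, remaining, stem, stopped_after):
    """Build an all-zero image and mark every unexecuted step as SKIPPED."""
    data = [[0.0] * n_cols for _ in range(n_rows)]
    # "COMPLETE" is an assumed status for the steps that did run.
    cal_step = {name: "COMPLETE" for name in completed}
    cal_step.update({name: "SKIPPED" for name in remaining})
    # Descriptive log message giving the reason processing stopped.
    log.warning(
        "All pixels are saturated after the %s step; "
        "skipping remaining steps and returning a zeroed image.",
        stopped_after,
    )
    # Regular L2 "cal" suffix so the file can be ingested in the archive.
    return ToyImageModel(data=data, cal_step=cal_step, filename=f"{stem}_cal.asdf")


result = make_zeroed_result(
    2, 3, ["saturation"], ["later_step_a", "later_step_b"], "example", "saturation"
)
print(result.filename, result.cal_step)
```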

Looking at the truth file for the "all saturated" test https://bytesalad.stsci.edu/ui/repos/tree/General/roman-pipeline/dev/truth/WFI/image/r0000101001001001001_0001_wfi01_ALL_SATURATED_cal.asdf

the result is a RampModel stored in a "cal" suffix file and the arrays aren't zeroed.

> m = rdm.open("test_bigdata/roman-pipeline/dev/truth/WFI/image/r0000101001001001001_0001_wfi01_ALL_SATURATED_cal.asdf")
> type(m)
roman_datamodels.datamodels._datamodels.RampModel
> m.data[0, 0, 0]
65535.0

The association processing was added to elp at a later date https://github.com/spacetelescope/romancal/pull/802 which dropped the usage of create_fully_saturated_zeroed_image (which might explain why the result is no longer a level 2 ImageModel). That PR also introduced other changes to fully saturated processing and finally https://github.com/spacetelescope/romancal/pull/824 introduced the behavior described here where elp will stop processing inputs when it encounters the first fully saturated input.

spacetelescope / romancal

exposure pipeline stops processing association on first fully saturated input #1523