spacetelescope / jwst

Python library for science observations from the James Webb Space Telescope
https://jwst-pipeline.readthedocs.io/en/latest/
Other
561 stars 167 forks source link

JP-3631: remove direct setting of self.skip within calibration steps #8600

Closed emolter closed 2 months ago

emolter commented 3 months ago

Resolves JP-3631

Closes #8498

This PR removes all instances where a step sets its own .skip attribute, including the one in the record_step_status() helper function. That pattern causes potential issues; see, for example, this PR and therefore should be avoided. In a few instances where the calibration step was checking a step's .skip attribute after running the step, the step result's metadata is now being checked instead.

Checklist for PR authors (skip items if you don't have permissions or they are not applicable)

emolter commented 3 months ago

Regression tests started here

codecov[bot] commented 3 months ago

Codecov Report

Attention: Patch coverage is 66.00000% with 17 lines in your changes missing coverage. Please review.

Project coverage is 59.60%. Comparing base (064208c) to head (fd9b2f4). Report is 1 commits behind head on master.

Files Patch % Lines
jwst/tweakreg/tweakreg_step.py 12.50% 7 Missing :warning:
...st/master_background/master_background_mos_step.py 16.66% 5 Missing :warning:
jwst/master_background/master_background_step.py 60.00% 2 Missing :warning:
jwst/cube_build/cube_build_step.py 66.66% 1 Missing :warning:
jwst/pipeline/calwebb_spec3.py 50.00% 1 Missing :warning:
jwst/pixel_replace/pixel_replace_step.py 66.66% 1 Missing :warning:
Additional details and impacted files ```diff @@ Coverage Diff @@ ## master #8600 +/- ## ========================================== + Coverage 59.56% 59.60% +0.03% ========================================== Files 391 391 Lines 39285 39286 +1 ========================================== + Hits 23402 23418 +16 + Misses 15883 15868 -15 ```

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

emolter commented 3 months ago

new regtest round started here

emolter commented 3 months ago

In the regtests above, there was one failure (test_masterbkg_corrpars) related to master_background_mos_step that appears to have been caused by the slit.source_name attributes in the test data containing the string "BACKGROUND" when the function nirspec_utils.is_background_msa_slit() was expecting "BKG". It looks like those tests haven't been changed in a while, but this PR seems to have changed the naming convention.

The question is, why was this test passing before, and failing only now as of this PR? I think the answer is due to unexpected/bad behavior when setting self.skip - I need to check this, but if the first call to step.run failed for any reason, it would have set the step status to SKIPPED and then the second call would have also skipped, therefore making both of their outputs identical (and identical to the inputs).

In terms of how to fix that, should I allow nirspec_utils.is_background_msa_slit() to check for "BACKGROUND" or "BKG", like I'm doing here in the most recent commit, or should we change the test dataset?

melanieclarke commented 3 months ago

In terms of how to fix that, should I allow nirspec_utils.is_background_msa_slit() to check for "BACKGROUND" or "BKG", like I'm doing here in the most recent commit, or should we change the test dataset?

I think it wouldn't hurt to allow either, in case there are other older datasets around.

braingram commented 3 months ago

Did you notice if any steps are now saving results when skipped (when they weren't previously)? I ask because stpipe checks Step.skip when saving resulting: https://github.com/spacetelescope/stpipe/blob/915a8326432254bd5d03ee67586bc94c79d74327/src/stpipe/step.py#L536 I would expect that, for example, tweakreg would now save results if provided an input association with 1 group when configured to skip absolute alignment (to trigger the skip on line 208).

emolter commented 3 months ago

Putting some relevant points from a meeting Brett and I had here:

This line in stpipe is checking the step.skip parameter to decide whether files should be saved when save_results is True. With the changes in this PR, that means that if save_results==True but a step status is set to SKIPPED, output files will be created when they would not have before. We agreed that this should be handled in stpipe somehow, perhaps by making step.finalize_result() check the metadata of the result and returning a value that indicates whether the step had been skipped.

We discussed whether record_step_status and query_step_status should be put into stpipe, along with variables similar to the one suggested in this comment for the SKIPPED, COMPLETE, and NOTSET constants. In a perfect world it would be good to achieve similarity between Roman and JWST in the way their pipeline steps are set up and the way stpipe handles them, e.g. this open issue which is a problem for Roman but not Webb. What do other people think about moving these to stpipe?

If we do decide to do something in stpipe, it remains unclear whether this should hold up merging this PR. If the creation of files when save_results is True even when steps are skipped is not a serious problem, then I'd say we can go ahead and move toward merging it. If it is a serious problem, then we should coordinate this PR with an stpipe release that includes at minimum the aforementioned change to finalize_result. What do you think @nden @hbushouse?

melanieclarke commented 3 months ago

If the creation of files when save_results is True even when steps are skipped is not a serious problem, then I'd say we can go ahead and move toward merging it.

This actually sounds like the more desirable behavior to me -- if save_results is True, then it was either True by default, in which case it is a required output file, or else it was True because the user explicitly set it and is expecting an output file to be saved. Even if it's only different from the input file by a header keyword, it's sometimes useful to save an output file at an intermediate point in the pipeline, whether or not the step immediately prior actually did anything useful.

In either case, I don't think it's a serious enough problem to hold up the resolution for this ticket.

emolter commented 3 months ago

The style check failure seems to be due to some kind of version bump of ruff; does anyone know anything about that?

ruff==0.5.0
check-style: commands[0]> ruff .
error: `ruff <path>` has been removed. Use `ruff check <path>` instead.
braingram commented 3 months ago

The style check failure seems to be due to some kind of version bump of ruff; does anyone know anything about that?

ruff==0.5.0
check-style: commands[0]> ruff .
error: `ruff <path>` has been removed. Use `ruff check <path>` instead.

Wee woo wee woo, calling @zacharyburnett :) Have you seen this anywhere else? Thanks!

zacharyburnett commented 3 months ago

The style check failure seems to be due to some kind of version bump of ruff; does anyone know anything about that?

ruff==0.5.0
check-style: commands[0]> ruff .
error: `ruff <path>` has been removed. Use `ruff check <path>` instead.

Wee woo wee woo, calling @zacharyburnett :) Have you seen this anywhere else? Thanks!

oh yes! ruff uses ruff check <path> now (because they added ruff format <path> as well); I've updated several other style checks in other projects to use that, but I must have missed this one

The tool.ruff table in pyproject.toml is also now separated into tool.ruff.lint and tool.ruff.format to enable / disable rules and include / ignore paths.

emolter commented 3 months ago

regression tests again

emolter commented 3 months ago

even more regtests. I will merge this if these don't turn up any failures

emolter commented 3 months ago

This is ready to merge. Waiting because this should go into Build 11.1, and other bugfixes still need to go into 11.0