spacetelescope / jwst

Python library for science observations from the James Webb Space Telescope
https://jwst-pipeline.readthedocs.io/en/latest/
Other
562 stars 167 forks source link

Review jwst code for step.skip usage #8498

Closed stscijgbot-jp closed 2 months ago

stscijgbot-jp commented 4 months ago

Issue JP-3631 was created on JIRA by Melanie Clarke:

Steps defined in stpipe always have a 'skip' attribute; when this is set to True, the step is skipped.  This is available to users as a top-level parameter for each step.

The skip attribute should rarely be set by code within a pipeline step, since it may have unintended consequences.  For example, currently, when the outlier_detection step is run on NIRSpec MOS data, if skip is set for one slit, the step is then skipped for all subsequent slits.  This fix is addressed in the PR for JP-2922.  A similar fix for the tweakreg step is addressed in PR 8476 (https://github.com/spacetelescope/jwst/pull/8476).

We should review remaining steps for similar issues and make sure that the skip attribute is only set by the user or at the pipeline level, not internal to any step.  In particular, it looks like skip is set to True in the core class JwstStep when record_step_status is called with success=False.  This function should probably either be deprecated or updated to avoid setting skip directly.

stscijgbot-jp commented 4 months ago

Comment by Melanie Clarke on JIRA:

This issue was noted by Howard Bushouse in review for tweakreg and outlier_detection updates. Howard, please edit or comment to clarify as needed.

stscijgbot-jp commented 4 months ago

Comment by Howard Bushouse on JIRA:

I'm guessing that when record_step_status was originally written it was assumed that it would only be used once per high-level pipeline processing run, as opposed to the use case where a step gets called repetitively within a single pipeline run. The function itself is probably still useful, for convenience, in some instances, but I agree that we should rethink or eliminate setting self.skip=True.

jdavies-st commented 3 months ago

I can confirm record_step_status() was written to centralize updating the FITS header on files that had steps skipped in a consistent way. Generally it should not set self.skip, as it is not needed. So line

https://github.com/spacetelescope/jwst/blob/380dc5eb082c6c5d4e024a9337999f69cd5b8443/jwst/stpipe/core.py#L110

can be removed. It is unnecessary, but also it should not have any bad effect, as this function is called at the end of a step, and step instances should not be reused. This method updates the equivalent of the PRIMARY header, so it by definition should only be called once per datamodel (FITS file), though it should be able to be called on N times for steps that output N datamodels. self.skip should only have an effect if the step instance is reused by a pipeline, which is a big no-no for the stpipe Step infrastructure.

stscijgbot-jp commented 3 months ago

Comment by Ned Molter on JIRA:

Noting here the places where I find direct setting of self.skip within a step:

And here the places where I find record_step_status:

As a random aside, the record_step_status function seems a bit underutilized, and I'm sure there are instances where this could (should?) be used to e.g. set the metadata attributes for all the slits in a MultiSlitModel.  I guess that change would be low-priority and beyond the scope of this PR, but still

stscijgbot-jp commented 3 months ago

Comment by Ned Molter on JIRA:

In addition, the pipeline codes are checking whether the step.skip attribute was set to True after the step had been run for master_background and cube_build.  With the change that removes setting of that attribute within the step, these checks need to be changed to checks of the meta.step.skipped metadata of the result instead.

stscijgbot-jp commented 1 month ago

Comment by David Law on JIRA:

Melanie Clarke This ticket is current set to RTT, but I'm unclear what testing can be performed or what if any work remains?

stscijgbot-jp commented 1 month ago

Comment by Melanie Clarke on JIRA:

Fixed by #8600

stscijgbot-jp commented 1 month ago

Comment by Melanie Clarke on JIRA:

Thanks for pointing this out.  This was more of an internal refactoring, so no additional specific tests should be needed from INS.