spacetelescope / jwst

Python library for science observations from the James Webb Space Telescope
https://jwst-pipeline.readthedocs.io/en/latest/
Other
568 stars 167 forks source link

Step status flag should be set via a class method #3091

Open jdavies-st opened 5 years ago

jdavies-st commented 5 years ago

Lots of our pipeline steps set the keywords in meta.cal_steps to 'COMPLETE' or 'SKIPPED' depending on whether the data can be handled by the step. For steps which handle single or multiple inputs, this means a lot of boilerplate, often multiple times within a single step:

            try:
                result.meta.cal_step.master_back_sub = 'COMPLETE'
            except AttributeError:
                for model in result:
                    model.meta.cal_step.master_back_sub = 'COMPLETE'

If a step is skipped via

strun calwebb_image2.cfg --steps.skip.tweakreg=True

The meta.cal_step.tweakreg is not set to 'SKIPPED' automatically, and in fact we don't handle this properly currently. The cal_step does not get set at all.

So we should have a Step class method that can set the step's cal_step keyword to the correct value in these circumstances. It would have something like the form:

    def set_step_success(self, status=True):
        pass

and each step class would need to define a class attribute that records the name of the property in meta.cal_step where the step stores its status. This may have to be defined in an __init__ method for each step.

TODO:

Related to #2439.

jdavies-st commented 1 year ago

One thing that would make this easier to implement would if the meta.cal_step values for a step were the same as step alias. Currently that's the case for most, but not all steps:

from jwst.stpipe import integration
from stdatamodels.jwst import datamodels

# Get an instance of the core schema
model = datamodels.ImageModel()

# Get a set of the meta.cal_step names
cal_steps = set()
for cal_step in model.meta.cal_step._schema['properties'].keys():
    cal_steps.add(cal_step)

# Get a set of the Step aliases
aliases = set()
for _, alias, _ in integration.get_steps():
    aliases.add(alias)

The current results are

In [38]: aliases.intersection(cal_steps)
Out[38]: 
{'ami_analyze',
 'ami_average',
 'ami_normalize',
 'assign_mtwcs',
 'assign_wcs',
 'barshadow',
 'combine_1d',
 'cube_build',
 'dq_init',
 'extract_1d',
 'extract_2d',
 'firstframe',
 'flat_field',
 'fringe',
 'gain_scale',
 'group_scale',
 'guider_cds',
 'imprint',
 'ipc',
 'jump',
 'klip',
 'lastframe',
 'linearity',
 'master_background',
 'mrs_imatch',
 'msa_flagging',
 'outlier_detection',
 'pathloss',
 'persistence',
 'photom',
 'pixel_replace',
 'ramp_fit',
 'refpix',
 'resample',
 'reset',
 'residual_fringe',
 'rscd',
 'saturation',
 'skymatch',
 'source_catalog',
 'srctype',
 'straylight',
 'superbias',
 'tso_photometry',
 'tweakreg',
 'wavecorr',
 'white_light'}

As you can see, most step aliases are the same as the cal_step meta attribute.

But some are not, and some don't exist in the cal_step attributes. (Ignore the calwebb_ pipelines)

In [39]: aliases.difference(cal_steps)
Out[39]: 
{'align_refs',
 'background',
 'calwebb_ami3',
 'calwebb_coron3',
 'calwebb_dark',
 'calwebb_detector1',
 'calwebb_guider',
 'calwebb_image2',
 'calwebb_image3',
 'calwebb_spec2',
 'calwebb_spec3',
 'calwebb_tso3',
 'calwebb_wfs-image3',
 'cube_skymatch',
 'dark_current',
 'hlsp',
 'master_background_mos',
 'outlier_detection_scaled',
 'outlier_detection_stack',
 'resample_spec',
 'stack_refs',
 'undersampling_correction',
 'wfss_contam'}

In [40]: cal_steps.difference(aliases)
Out[40]: 
{'align_psfs',
 'back_sub',
 'dark_sub',
 'emission',
 'err_init',
 'stack_psfs',
 'wfs_combine'}
tapastro commented 1 year ago

This PR didn't do the complete job - it would probably help for readability to align some of these attribute:alias pairs, and it would clean up the steps for which the existing code will fail. However, some of those mismatches appear to be caused by some steps having multiple aliases depending on datatype, i.e. master_background + master_background_mos - unless we add a cal_step entry for each step alias, we would appear to need an additional bit of machinery.