jdavies-st closed 7 months ago
Also, if a pipeline is run, the configuration is printed first for the pipeline and all its constituent steps, and then each step prints its configuration again when it runs, which is redundant. It would be best to print the configuration only once, at the beginning.
The redundancy is not strictly true: Parameter references for a specific step can also be specified, which are not retrieved until the step itself is executed. These can override what is in pipeline-level parameter references.
To remove redundancy, the logging could be limited to just the parameters relevant to the pipeline/step being executed. This would primarily mean that the logging at the pipeline-level would not show all the substep parameters.
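As a rough sketch of what that could look like: the pipeline-level dump shown in this thread nests every substep's parameters under a 'steps' key, so the pipeline-level log could leave that key out and let each step report its own values. The helper name own_parameters and the logger name below are hypothetical and not part of stpipe; the parameter dict is an abbreviated excerpt.

```python
import logging

logger = logging.getLogger("example.pipeline")  # hypothetical logger name


def own_parameters(params):
    """Return a shallow copy of params without the nested 'steps' entry."""
    return {key: value for key, value in params.items() if key != "steps"}


# Abbreviated pipeline-level parameters, shaped like the dump in this thread.
pipeline_params = {
    "output_ext": ".fits",
    "save_results": True,
    "skip": True,
    "steps": {
        "tweakreg": {"kernel_fwhm": 2.535, "fitgeometry": "shift"},
        "source_catalog": {"kernel_fwhm": 2.535, "snr_threshold": 3.0},
    },
}

# Only the pipeline's own parameters are logged; substeps would log theirs.
logger.info("Step Image3Pipeline parameters are: %s", own_parameters(pipeline_params))
```

The original dict is left untouched, so the full nested parameters would still be available to anything that needs them.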
> Parameter references for a specific step can also be specified, which are not retrieved until the step itself is executed.
Parameter reference files for specific steps within a pipeline are retrieved from CRDS and applied right at the beginning when the pipeline itself is instantiated. An example:
$ strun calwebb_image3 jw02234-o001_20230627t013403_image3_00001_asn.json --skip=true --save-parameters params_nircam.asdf
2024-02-09 16:24:34,715 - stpipe - INFO - PARS-TWEAKREGSTEP parameters found: /data/beegfs/astro-storage/groups/jwst/common/crds_cache/jwst_ops/references/jwst/nircam/jwst_nircam_pars-tweakregstep_0058.asdf
2024-02-09 16:24:35,329 - stpipe - INFO - PARS-SOURCECATALOGSTEP parameters found: /data/beegfs/astro-storage/groups/jwst/common/crds_cache/jwst_ops/references/jwst/nircam/jwst_nircam_pars-sourcecatalogstep_0023.asdf
2024-02-09 16:24:35,450 - stpipe.Image3Pipeline - INFO - Image3Pipeline instance created.
2024-02-09 16:24:35,451 - stpipe.Image3Pipeline.assign_mtwcs - INFO - AssignMTWcsStep instance created.
2024-02-09 16:24:35,453 - stpipe.Image3Pipeline.tweakreg - INFO - TweakRegStep instance created.
2024-02-09 16:24:35,455 - stpipe.Image3Pipeline.skymatch - INFO - SkyMatchStep instance created.
2024-02-09 16:24:35,456 - stpipe.Image3Pipeline.outlier_detection - INFO - OutlierDetectionStep instance created.
2024-02-09 16:24:35,458 - stpipe.Image3Pipeline.resample - INFO - ResampleStep instance created.
2024-02-09 16:24:35,459 - stpipe.Image3Pipeline.source_catalog - INFO - SourceCatalogStep instance created.
2024-02-09 16:24:35,917 - stpipe.Image3Pipeline - INFO - Step Image3Pipeline running with args ('jw02234-o001_20230627t013403_image3_00001_asn.json',).
2024-02-09 16:24:35,925 - stpipe.Image3Pipeline - INFO - Step Image3Pipeline parameters are: {'pre_hooks': [], 'post_hooks': [], 'output_file': None, 'output_dir': None, 'output_ext': '.fits', 'output_use_model': False, 'output_use_index': True, 'save_results': True, 'skip': True, 'suffix': None, 'search_output_file': True, 'input_dir': '', 'steps': {'assign_mtwcs': {'pre_hooks': [], 'post_hooks': [], 'output_file': None, 'output_dir': None, 'output_ext': '.fits', 'output_use_model': True, 'output_use_index': True, 'save_results': False, 'skip': False, 'suffix': 'assign_mtwcs', 'search_output_file': True, 'input_dir': ''}, 'tweakreg': {'pre_hooks': [], 'post_hooks': [], 'output_file': None, 'output_dir': None, 'output_ext': '.fits', 'output_use_model': True, 'output_use_index': True, 'save_results': False, 'skip': False, 'suffix': None, 'search_output_file': True, 'input_dir': '', 'save_catalogs': False, 'use_custom_catalogs': False, 'catalog_format': 'ecsv', 'catfile': '', 'kernel_fwhm': 2.535, 'snr_threshold': 10.0, 'sharplo': 0.2, 'sharphi': 1.0, 'roundlo': -1.0, 'roundhi': 1.0, 'brightest': 200, 'peakmax': None, 'bkg_boxsize': 400, 'enforce_user_order': False, 'expand_refcat': False, 'minobj': 15, 'searchrad': 2.0, 'use2dhist': True, 'separation': 2.0, 'tolerance': 1.0, 'xoffset': 0.0, 'yoffset': 0.0, 'fitgeometry': 'shift', 'nclip': 3, 'sigma': 3.0, 'abs_refcat': '', 'save_abs_catalog': False, 'abs_minobj': 15, 'abs_searchrad': 6.0, 'abs_use2dhist': True, 'abs_separation': 0.1, 'abs_tolerance': 0.7, 'abs_fitgeometry': 'rshift', 'abs_nclip': 3, 'abs_sigma': 3.0}, 'skymatch': {'pre_hooks': [], 'post_hooks': [], 'output_file': None, 'output_dir': None, 'output_ext': '.fits', 'output_use_model': False, 'output_use_index': True, 'save_results': False, 'skip': False, 'suffix': None, 'search_output_file': True, 'input_dir': '', 'skymethod': 'match', 'match_down': True, 'subtract': False, 'stepsize': None, 'skystat': 'mode', 'dqbits': '~DO_NOT_USE+NON_SCIENCE', 
'lower': None, 'upper': None, 'nclip': 5, 'lsigma': 4.0, 'usigma': 4.0, 'binwidth': 0.1}, 'outlier_detection': {'pre_hooks': [], 'post_hooks': [], 'output_file': None, 'output_dir': None, 'output_ext': '.fits', 'output_use_model': False, 'output_use_index': True, 'save_results': False, 'skip': False, 'suffix': None, 'search_output_file': False, 'input_dir': '', 'weight_type': 'ivm', 'pixfrac': 1.0, 'kernel': 'square', 'fillval': 'INDEF', 'nlow': 0, 'nhigh': 0, 'maskpt': 0.7, 'grow': 1, 'snr': '5.0 4.0', 'scale': '1.2 0.7', 'backg': 0.0, 'kernel_size': '7 7', 'threshold_percent': 99.8, 'ifu_second_check': False, 'save_intermediate_results': False, 'resample_data': True, 'good_bits': '~DO_NOT_USE', 'scale_detection': False, 'allowed_memory': None, 'in_memory': False}, 'resample': {'pre_hooks': [], 'post_hooks': [], 'output_file': None, 'output_dir': None, 'output_ext': '.fits', 'output_use_model': False, 'output_use_index': True, 'save_results': False, 'skip': False, 'suffix': None, 'search_output_file': True, 'input_dir': '', 'pixfrac': 1.0, 'kernel': 'square', 'fillval': 'INDEF', 'weight_type': 'ivm', 'output_shape': None, 'crpix': None, 'crval': None, 'rotation': None, 'pixel_scale_ratio': 1.0, 'pixel_scale': None, 'output_wcs': '', 'single': False, 'blendheaders': True, 'allowed_memory': None, 'in_memory': True}, 'source_catalog': {'pre_hooks': [], 'post_hooks': [], 'output_file': None, 'output_dir': None, 'output_ext': '.fits', 'output_use_model': False, 'output_use_index': True, 'save_results': False, 'skip': False, 'suffix': 'cat', 'search_output_file': True, 'input_dir': '', 'bkg_boxsize': 1000, 'kernel_fwhm': 2.535, 'snr_threshold': 3.0, 'npixels': 25, 'deblend': False, 'aperture_ee1': 30, 'aperture_ee2': 50, 'aperture_ee3': 70, 'ci1_star_threshold': 2.0, 'ci2_star_threshold': 1.8}}}
2024-02-09 16:24:35,925 - stpipe.Image3Pipeline - INFO - Step skipped.
2024-02-09 16:24:35,925 - stpipe.Image3Pipeline - INFO - Step Image3Pipeline done
And the reported parameters of all steps (before any steps are run) include the differences from the pars-tweakregstep and pars-sourcecatalogstep reffiles from CRDS when compared to the code defaults for that step:
$ diff params_nodata.asdf params_nircam.asdf -y --suppress-common-lines
date: '2024-02-09T15:22:03' | date: '2024-02-09T15:24:35'
save_results: false | save_results: true
skip: false | skip: true
fitgeometry: rshift | fitgeometry: shift
kernel_fwhm: 2.5 | kernel_fwhm: 2.535
separation: 1.0 | separation: 2.0
tolerance: 0.7 | tolerance: 1.0
kernel_fwhm: 2.0 | kernel_fwhm: 2.535
I've also tested with inputting a custom pipeline-level pars- reffile and a custom step-level one, and it's all merged and correct before any process() methods are run.
So I don't think there's any merging that happens after the run() starts on the pipeline.
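To picture this, here is a minimal sketch of a layered merge that finishes before anything runs. It is not stpipe code: merge_layers is a hypothetical helper, and the assumed rule (later layers override earlier ones, with a user-supplied layer last) is for illustration only. The values are the first four parameter lines of the diff above, read as the tweakreg ones, with code defaults on the left and CRDS values on the right.

```python
def merge_layers(*layers):
    """Merge parameter dicts in order; later layers override earlier ones."""
    merged = {}
    for layer in layers:
        merged.update(layer)
    return merged


# Values from the diff: code defaults vs. the CRDS pars-tweakregstep reffile.
code_defaults = {"fitgeometry": "rshift", "kernel_fwhm": 2.5,
                 "separation": 1.0, "tolerance": 0.7}
crds_pars = {"fitgeometry": "shift", "kernel_fwhm": 2.535,
             "separation": 2.0, "tolerance": 1.0}

# Assumption for illustration: user-supplied parameters are applied last.
user_pars = {}

# The merge happens once, up front, so the logged values are already final.
tweakreg_params = merge_layers(code_defaults, crds_pars, user_pars)
print(tweakreg_params)
```

If the merge really is complete at instantiation, as the test above suggests, then a single parameter dump at the start of the pipeline would already reflect the final values.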
I think Jonathan's example would look more like
strun calwebb_spec2 jw01128012001_03102_00002_nrs1_rate.fits --steps.pixel_replace.config_file='pars-pixrep.asdf'
But I did just test this, and with my step-pars file specifying a value for pixel_replace.n_adjacent_cols of 11 (over the default 3), I see in the initial pipeline-level parameter dump:
...
, 'pixel_replace': {'pre_hooks': [], 'post_hooks': [], 'output_file': None, 'output_dir': None, 'output_ext': '.fits',
'output_use_model': False, 'output_use_index': True, 'save_results': False, 'skip': False, 'suffix': None,
'search_output_file': True, 'input_dir': '', 'algorithm': 'fit_profile', 'n_adjacent_cols': 11},
...
So I think the step-level pars files are incorporated at the time of pipeline instantiation? I remember attempting to fix a bug with the overlapping levels of parameter specification in the past; perhaps this was changed in the past year or two.
Sorry, bad memory on my part then. I withdraw the original comment.
Cool. I'll do an implementation of this fix then. Hopefully it will make the logging a bit easier to follow for users, which these days includes me. =)
Currently, when a step runs, it logs the step parameters or, if it is a pipeline, the pipeline and constituent step parameters. But the output is basically unreadable, as one can see, for example, for jwst.pipeline.Detector1Pipeline.
It would be much better to have formatted output. Here are three options:
Use yaml.dump()
Use json.dumps()
Use pprint.pformat()
@hbushouse and @nden, 👍 or 👎? And if so, which of the 3 is preferable?
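For a quick side-by-side, here is a minimal sketch of the three options applied to a small excerpt of a parameter dict. The excerpt and variable names are illustrative only. json and pprint are in the standard library, while yaml.dump() needs the third-party PyYAML package, so that branch is skipped if PyYAML is not installed.

```python
import json
import pprint

# Small illustrative excerpt of a pipeline parameter dict.
params = {
    "skip": False,
    "suffix": None,
    "steps": {"tweakreg": {"kernel_fwhm": 2.535, "fitgeometry": "shift"}},
}

# Option 1: YAML block style (requires PyYAML; skipped when unavailable).
try:
    import yaml
except ImportError:
    yaml_text = None
else:
    yaml_text = yaml.dump(params, default_flow_style=False)

# Option 2: indented JSON. Python None and False are written as null and false.
json_text = json.dumps(params, indent=4)

# Option 3: pprint keeps Python literals and wraps lines at the given width.
pprint_text = pprint.pformat(params, width=60)

for label, text in (("yaml", yaml_text), ("json", json_text), ("pprint", pprint_text)):
    if text is not None:
        print(f"--- {label} ---")
        print(text)
```

One practical difference: the JSON and YAML forms translate Python values such as None, whereas pprint output stays closest to what the log prints today.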