Open bhilbert4 opened 6 years ago
Input will be a table containing (for example):
index reffile_type filename required_pipeline_steps
1 gain file_a_uncal.fits bpm,sat
2 readnoise file_b_rate.fits bpm,sat,superbias
3 readnoise file_a_uncal.fits bpm,sat,superbias
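As a starting point, here is a minimal sketch of reading a table like the one above into a list of row dictionaries. It assumes the table is whitespace-delimited text with a header row; `parse_table` and `EXAMPLE` are hypothetical names used only for illustration, and the real input may well arrive in another form (e.g. an astropy table).

```python
def parse_table(text):
    """Parse a whitespace-delimited table (with header row) into a list of dicts."""
    lines = [line for line in text.strip().splitlines() if line.strip()]
    header = lines[0].split()
    rows = []
    for line in lines[1:]:
        row = dict(zip(header, line.split()))
        row["index"] = int(row["index"])
        # Split the comma-separated step list so later code can treat it as a list.
        row["required_pipeline_steps"] = row["required_pipeline_steps"].split(",")
        rows.append(row)
    return rows


# The example table from above, assumed here to be plain text.
EXAMPLE = """
index reffile_type filename required_pipeline_steps
1 gain file_a_uncal.fits bpm,sat
2 readnoise file_b_rate.fits bpm,sat,superbias
3 readnoise file_a_uncal.fits bpm,sat,superbias
"""

rows = parse_table(EXAMPLE)
```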
For each row in the table, we need to:
1) Search the given directory for all flavors of the file. For example, for the top row above, we need to find all versions of file_a, including uncal, rate, dq_init, etc.
2) For each file found, determine which pipeline steps have already been run.
3) Create a list of the pipeline steps that still need to be run to bring the file up to the required state given in column 4.
4) Based on the lists from 3), choose the file closest to the desired end state.
5) Create an output file name for the file as it will be after the remaining required pipeline steps have been run, and add this name in a new column of the input table.
6) Create an strun command to bring the file up to the desired state.
7) Return the updated table and the list of strun commands.
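Steps 1 through 5 could be sketched roughly as below. This is only an illustration under several assumptions: that the filename suffix tells us the last step that was run, that `PIPELINE_ORDER` and `SUFFIX_TO_LAST_STEP` (both hypothetical, using the shorthand step names from the table) describe the processing order, and that a file which has already gone past the required state (e.g. a rate file when only superbias is wanted) cannot be used. All function names are made up for this sketch.

```python
from pathlib import Path

# Assumed processing order, using the shorthand step names from the table.
PIPELINE_ORDER = ["bpm", "sat", "superbias", "refpix", "linearity", "dark", "jump", "rate"]
# Assumed mapping from filename suffix to the last step already run on that file.
SUFFIX_TO_LAST_STEP = {"uncal": None, "dq_init": "bpm", "sat": "sat",
                       "superbias": "superbias", "rate": "rate"}


def split_name(filename):
    """Split 'file_a_uncal.fits' into ('file_a', 'uncal')."""
    stem = Path(filename).stem
    # Try longer suffixes first so multi-word suffixes like 'dq_init' match whole.
    for suffix in sorted(SUFFIX_TO_LAST_STEP, key=len, reverse=True):
        if stem.endswith("_" + suffix):
            return stem[: -(len(suffix) + 1)], suffix
    raise ValueError(f"unrecognized suffix in {filename!r}")


def find_flavors(directory, filename):
    """Step 1: list all recognized versions of the file in the directory."""
    base, _ = split_name(filename)
    found = []
    for path in Path(directory).glob(f"{base}_*.fits"):
        try:
            if split_name(path.name)[0] == base:
                found.append(path.name)
        except ValueError:
            pass  # suffix not in our table; ignore the file
    return sorted(found)


def steps_done(suffix):
    """Step 2: steps already run, inferred from the suffix."""
    last = SUFFIX_TO_LAST_STEP[suffix]
    return [] if last is None else PIPELINE_ORDER[: PIPELINE_ORDER.index(last) + 1]


def steps_remaining(suffix, required):
    """Step 3: required steps that have not been run yet."""
    done = set(steps_done(suffix))
    return [step for step in required if step not in done]


def choose_best(candidates, required):
    """Step 4: pick the usable file with the fewest remaining steps."""
    usable = []
    for name in candidates:
        _, suffix = split_name(name)
        # Skip files that have already been processed past the required state.
        if set(steps_done(suffix)) <= set(required):
            usable.append((len(steps_remaining(suffix, required)), name))
    if not usable:
        raise ValueError("no usable input file found")
    return min(usable)[1]


def output_name(filename, required):
    """Step 5: name the output after the latest required step."""
    base, _ = split_name(filename)
    last = max(required, key=PIPELINE_ORDER.index)
    return f"{base}_{last}.fits"
```

For example, given `file_b_uncal.fits`, `file_b_sat.fits` and `file_b_rate.fits` with required steps `bpm,sat,superbias`, `choose_best` picks `file_b_sat.fits`, and `output_name` gives `file_b_superbias.fits`.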
**Complicating factors:** The user may specify that any partially processed files found should be ignored and that only uncal files should be used. The user may specify reference files to use as overrides in the strun commands. The same filename may appear in multiple rows of the input table. In this case, we need to be smart enough to run the pipeline on this file only once and save the multiple outputs corresponding to the multiple rows in the input table.
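One possible way to handle the repeated-filename case is to group the rows by input file and record every end state that needs to be saved, so that each file gets a single run. The sketch below assumes rows are dictionaries with `index`, `filename` and a list under `required_pipeline_steps`; `plan_runs` and `PIPELINE_ORDER` are hypothetical names, and the step order is an assumption based on the shorthand in the table.

```python
from collections import defaultdict

# Assumed processing order, using the shorthand step names from the table.
PIPELINE_ORDER = ["bpm", "sat", "superbias", "refpix", "linearity", "dark", "jump", "rate"]


def plan_runs(rows):
    """Group rows by input file: one run per file, one saved output per end state."""
    by_file = defaultdict(list)
    for row in rows:
        by_file[row["filename"]].append(row)

    plans = {}
    for filename, group in by_file.items():
        # The end state of a row is its latest required step in pipeline order.
        end_states = {
            max(row["required_pipeline_steps"], key=PIPELINE_ORDER.index)
            for row in group
        }
        plans[filename] = {
            "rows": [row["index"] for row in group],
            "save_after": sorted(end_states, key=PIPELINE_ORDER.index),
        }
    return plans


# The three rows of the example input table.
rows = [
    {"index": 1, "filename": "file_a_uncal.fits",
     "required_pipeline_steps": ["bpm", "sat"]},
    {"index": 2, "filename": "file_b_rate.fits",
     "required_pipeline_steps": ["bpm", "sat", "superbias"]},
    {"index": 3, "filename": "file_a_uncal.fits",
     "required_pipeline_steps": ["bpm", "sat", "superbias"]},
]
plans = plan_runs(rows)
```

With the example rows, `file_a_uncal.fits` ends up with one plan covering rows 1 and 3 and outputs to save after `sat` and `superbias`.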
Output table example:
index reffile_type filename required_pipeline_steps output_filename
1 gain file_a_uncal.fits bpm,sat file_a_sat.fits
2 readnoise file_b_sat.fits bpm,sat,superbias file_b_superbias.fits
3 readnoise file_a_uncal.fits bpm,sat,superbias file_a_superbias.fits
Output strun command examples:
strun calwebb_detector1.cfg file_a_uncal.fits --steps.dark_current.skip=True --steps.linearity.skip=True --steps.saturation.override_saturation='my_satfile.fits' --steps.saturation.output_file='file_a_sat.fits' --steps.superbias.output_file='file_a_superbias.fits'
strun calwebb_detector1.cfg file_b_sat.fits --steps.dark_current.skip=True --steps.linearity.skip=True --steps.superbias.output_file='file_b_superbias.fits'
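Commands of this shape could be assembled with something like the sketch below. It only strings together the three kinds of flags used in the examples above (`skip`, `override_<reftype>` and `output_file`); `build_strun` and its arguments are hypothetical, and the caller is assumed to pass step names exactly as the pipeline spells them.

```python
def build_strun(input_file, skip_steps, outputs, overrides=None,
                config="calwebb_detector1.cfg"):
    """Assemble an strun command string.

    skip_steps: step names to skip
    outputs:    {step name: output filename} for each output to save
    overrides:  {step name: (reference type, reference filename)}
    """
    parts = ["strun", config, input_file]
    for step in skip_steps:
        parts.append(f"--steps.{step}.skip=True")
    for step, (reftype, reffile) in (overrides or {}).items():
        parts.append(f"--steps.{step}.override_{reftype}='{reffile}'")
    for step, outfile in outputs.items():
        parts.append(f"--steps.{step}.output_file='{outfile}'")
    return " ".join(parts)


# One run on file_a_uncal.fits that saves both the sat and superbias outputs.
command = build_strun(
    "file_a_uncal.fits",
    skip_steps=["dark_current", "linearity"],
    outputs={"saturation": "file_a_sat.fits", "superbias": "file_a_superbias.fits"},
    overrides={"saturation": ("saturation", "my_satfile.fits")},
)
```

Returning the command as a string keeps it easy to print or log; a list of arguments would be the safer choice if the commands are later passed to `subprocess`.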
When would be the best time to look for repeated input filenames? Right after reading in the table, or as the last step before generating the strun commands? Saving it for last might make it easier to update all rows of the input table.
Create the code that will take a list of file base names and required SSB calibration pipeline steps, and generate strun commands to convert the input base files to the proper data reduction state.