desihub / desispec

DESI spectral pipeline

Add script to update processing table column layout #2376

Closed akremin closed 1 month ago

akremin commented 2 months ago

This adds a command-line script, desi_reformat_proctables, to update the processing table column layout. So far there have been only two versions of the processing tables. The old tables use CAMWORD, rather than the PROCCAMWORD of the modern tables, to represent the cameras used in the processing. The old tables also lack BADAMPS, LASTSTEP, and EXPFLAG.

The script takes a list of nights, a range of nights, or "all" to run on all processing tables.

For each identified table, the script renames CAMWORD to PROCCAMWORD if it is found. If BADAMPS, LASTSTEP, EXPFLAG, or any other column defined in desispec.workflow.proctable.get_processing_table_column_defs() is missing, the code adds that column, with every row set to the default for that column type. Any column in the existing table that isn't in the current column list returned by desispec.workflow.proctable.get_processing_table_column_defs() is removed.
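
The three steps above (rename, fill in missing columns, drop unknown columns) can be sketched roughly as follows. This is only an illustration using plain Python dicts, not the code in this PR: the function name `reformat_rows` and the contents of `CURRENT_DEFAULTS` are hypothetical stand-ins for whatever desispec.workflow.proctable.get_processing_table_column_defs() actually returns.

```python
# Hypothetical column names and defaults, for illustration only. The real
# names, order, and defaults come from
# desispec.workflow.proctable.get_processing_table_column_defs().
CURRENT_DEFAULTS = {
    "EXPID": -99,
    "PROCCAMWORD": "a0123456789",
    "BADAMPS": "",
    "LASTSTEP": "all",
    "EXPFLAG": "",
}


def reformat_rows(rows, defaults=CURRENT_DEFAULTS):
    """Return new rows that follow the column layout given by `defaults`."""
    new_rows = []
    for row in rows:
        row = dict(row)  # copy so the caller's data is left untouched
        # Step 1: rename the legacy column if it is present.
        if "CAMWORD" in row and "PROCCAMWORD" not in row:
            row["PROCCAMWORD"] = row.pop("CAMWORD")
        # Steps 2 and 3: keep only the current columns, in the current order,
        # filling any missing column with its default value.
        new_rows.append(
            {name: row.get(name, default) for name, default in defaults.items()}
        )
    return new_rows


# An "old format" row: CAMWORD instead of PROCCAMWORD, plus an unknown column.
old_rows = [{"EXPID": 1234, "CAMWORD": "a0123", "OBSOLETE": "x"}]
new_rows = reformat_rows(old_rows)
print(new_rows[0])
```

Building each output row from the list of current columns handles the "add missing" and "remove unknown" cases in one pass and also fixes the column order.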

Tests

I ran this at NERSC on a copy of the daily processing tables. The processing tables and a notebook verifying that the output files have all the expected columns can be found here: /global/cfs/cdirs/desi/users/kremin/PRs/modernize_proctables/daily/

The old files that are replaced are renamed with *.replaced-{TIMESTAMP}.* added to the file name. The new files have the expected name for a processing table of a specprod on a given night.
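
As a rough illustration of that naming convention, a helper along these lines would produce a `*.replaced-{TIMESTAMP}.*` name. The helper name `replaced_name`, the timestamp format, and the example file name are all assumptions made for this sketch and are not taken from the script itself.

```python
from datetime import datetime
from pathlib import Path


def replaced_name(path, timestamp=None):
    """Return `path` with '.replaced-{TIMESTAMP}' inserted before the extension.

    The timestamp format is an assumption; the script may use a different one.
    """
    path = Path(path)
    if timestamp is None:
        timestamp = datetime.now().strftime("%Y%m%dT%H%M%S")
    return path.with_name(f"{path.stem}.replaced-{timestamp}{path.suffix}")


# Hypothetical file name, used only to show the pattern.
example = replaced_name("processing_table_daily-20210208.csv", "20240101T000000")
print(example.name)
```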

coveralls commented 2 months ago

coverage: 30.091% (-0.07%) from 30.164% when pulling 43cd53fdafe217fc57f25a3775fd40d0c784028d on modernize_proctables into 97b174838a2e951a32aa719f19da07e184585c39 on main.

akremin commented 2 months ago

I have made all of the requested changes. Unfortunately, now there is an issue with a docstring somewhere. I will resolve that tomorrow morning.

All nights on or before 20210208 require this reformatting. I will also investigate whether 20210328 needs some reformatting that this script does not yet handle.

I also failed to mention that I ran tests at NERSC on a copy of the daily processing tables. The processing tables and a notebook verifying that the output files have all the expected columns can be found here: /global/cfs/cdirs/desi/users/kremin/PRs/modernize_proctables/daily/

The old files that are replaced are renamed with *.replaced-{TIMESTAMP}.* added to the file name. The new files have the expected name for a processing table of a specprod on a given night.

akremin commented 2 months ago

@sbailey I have fixed the docstring issue. I also double-checked why 20210328 did not need updating: the crash on that night is unrelated to the processing table format and instead comes from updating results from the queue. The solution for 20210328 might require a further code change, but I'd like #2351 and #2376 merged first if you're happy with both. That way we can get all of the tables into the same format, and I can base the queue-checking changes on #2351, which also modifies that code.