Closed akremin closed 5 months ago
Thanks! I have run workflow.calibration_selection.find_best_arc_flat_sets
on all nights and have started sorting them into various cases. I just added unit tests for two non-standard cases that I think we should try to support. If they are tricky to support right now, we could also flag tests as expected failures and address them in a future PR.
UPDATE: these are fixed in the logic of commit #21f5641 if we further increase the allowable arcflattimediff
(current default 20 minutes). TBD.
We don't have to support every non-standard case in this PR, but FWIW 20211207 is an interesting case. Something went wrong during the desi-calib-03 flats, but the observers were able to retake just that station immediately afterward with minor delay:
EXPID OBSTYPE SEQNUM SEQTOT EXPTIME LASTSTEP BADCAMWORD PROGRAM DATETIME
------ ------- ------ ------ -------- -------- ---------- ----------------------------- -----------------------
112753 flat 1 3 120.0274 all calib desi-calib-00 leds only 2021-12-07 22:53:27.566
112754 flat 2 3 120.0253 all calib desi-calib-00 leds only 2021-12-07 22:56:35.603
112755 flat 3 3 120.0282 all calib desi-calib-00 leds only 2021-12-07 22:59:43.737
112758 flat 1 3 120.0284 all calib desi-calib-01 leds only 2021-12-07 23:03:18.145
112759 flat 2 3 120.0279 all calib desi-calib-01 leds only 2021-12-07 23:06:26.175
112760 flat 3 3 120.0292 all calib desi-calib-01 leds only 2021-12-07 23:09:33.804
112763 flat 1 3 120.0272 all calib desi-calib-02 leds only 2021-12-07 23:13:09.235
112764 flat 2 3 120.0297 all calib desi-calib-02 leds only 2021-12-07 23:16:17.282
112765 flat 3 3 120.0293 all calib desi-calib-02 leds only 2021-12-07 23:19:25.341
112768 flat 1 3 120.0298 ignore calib desi-calib-03 leds only 2021-12-07 23:23:01.245
112769 flat 2 3 120.0295 ignore calib desi-calib-03 leds only 2021-12-07 23:26:09.327
112770 flat 3 3 120.0267 ignore calib desi-calib-03 leds only 2021-12-07 23:29:16.974
112771 flat 1 1 1.002 ignore led03 flat for cte check 2021-12-07 23:32:24.615
112777 flat 1 3 120.0296 all calib desi-calib-03 leds only 2021-12-07 23:48:45.527
112778 flat 2 3 120.0297 all calib desi-calib-03 leds only 2021-12-07 23:51:53.601
112779 flat 3 3 120.0278 all calib desi-calib-03 leds only 2021-12-07 23:55:02.063
112780 flat 1 1 1.002 all led03 flat for cte check 2021-12-07 23:58:10.332
If the parsing started by removing LASTSTEP='ignore' cases, it probably would have identified this as a normal set of flats. Instead it only includes the EXPID= 112780 one second flat. But perhaps in other case it is helpful for it to distinguish "exposure exists but should be ignored" from "exposure doesn't exist".
UPDATE: this is fixed in the next commit "Re-implement the flat selection..."
Recoverable by adjusting find_best_arc_flat_sets(..., arcflattimediff=20.)
threshold,
but need to pick what that should be.
UPDATE: arcflattimediff updated to 80 minutes, recovering all of these.
UPDATE: entire night is flagged as LASTSTEP=ignore, so we don't need to process it at all.
NIGHT | ZERO | DARK | ARC | FLAT | Comment |
---|---|---|---|---|---|
20211128 | no | no | ok | ok | A: link bias,badcol but not psf,fiberflat |
20220923 | ok | ok | no | no | B: link psf,fiberflat, but not bias,badcol |
20221222 | ok | ok | no | no | B: link psf,fiberflat, but not bias,badcol |
20221228 | ok | ok | no | no | B: link psf,fiberflat, but not bias,badcol |
20231216 | ok | ok | no | no | B: link psf,fiberflat, but not bias,badcol |
20210130 | ok | ok | ok | no | C: link fiberflat, but not bias,badcol,psf |
20210218 | ok | ok | ok | no | C: link fiberflat, but not bias,badcol,psf |
20211009 | ok | ok | ok | no | C: link fiberflat, but not bias,badcol,psf |
20221120 | ok | ok | ok | no | C: link fiberflat, but not bias,badcol,psf |
20231018 | ok | ok | ok | no | C: link fiberflat, but not bias,badcol,psf |
IIUC, override include_prefixes and exclude_prefixes are not yet supported. I was surprised to find that these "partial cals" were the dominant case, not "link everything".
Caveat: 20211212 is a case with some cameras flagged as bad butdetermine_calibrations_to_proc
still returned a complete set of exposures, which is valid for b and z but not r. There might be other cases with missing cameras that I haven't caught yet.
It's ok to punt some of these to future PRs as long as the current PR design doesn't make that needlessly hard. Anything that we do punt to future PRs should be added as new focused tickets when we close this one.
Time gaps between arcs and flats: the core issue is that changes in the humidity and/or temperature between the arcs and the flats can cause trace shifts that invalidate the ability to extract the flats with psfs made from the arcs. Shifts in y cannot be directly measured because the flats don't have sharp features to calibrate to, thus the need to take arcs and flats back to back before anything changes.
I compared the temperature and humidity differences between the last arc and the first flat for ~80 nights (blue), and compared those to the dT and dH for the 4 nights with larger than normal time differences (orange). The humidity difference is a little on the large side, but not too unusual and it is fine to accept these as normal arc/flat pairs. I'll update the arcflattimediff
parameter to 80 minutes to capture all of these cases as normal.
Looks good! It is especially helpful to have good unit tests for this workflow logic. Anthony also ran a night processing daily data with real job submissions, and other non-unittest dry run tests also pass. Merging now.
Overview
The main purpose of this code is to reduce redundancy and improve maintainability by having one script that can submit data as it comes in (like
desi_daily_proc_manager
) and data already acquired (likedesi_run_night
). This resolves issue #2175. Because those are functioning scripts and useful for testing against the new code, they have been retained in this branch and a new script has been added that can replace both of them. After vetting and with time, those other scripts can be removed in a future update todesispec
.The new script is called
desi_proc_night
(meant to intentionally paralleldesi_proc_tilenight
anddesi_proc
). The script has many of the same options as the old scripts, but with some removed that I no longer thought were necessary and a few new ones added. The major new flag to the script is--daily
. That is a meta-flag in that when set it sets several other values to the expected default for daily processing. The code is smart enough to know whether it is actually running during the night in question or not. If it is, it will keep updating the exposure tables, won't run unless cte flats exist, and won't submit the last science tile observed. Once morning has been reached, it is run in daily mode after the night has ended, or it isn't in daily mode it will process the last science tile and process whatever calibrations it can find.This also adds the machinery to create and submit batch a script for desi_link_calibnight introduced in PR #2165 . This resolved issue #2174 . It also creates a new "JOBDESC" in the processing tables called "linkcal" that indicates the linking job for tracking and dependencies. The information is stored in per-night files parallel to the
exposure_table_NIGHT.csv
files in SVN. The naming scheme isoverride-NIGHT.yaml
. If the linking information isn't defined in an override file and there aren't sufficient flats for calibrations, then the code will raise an error.Here is a minimal example of the override file I have been using for testing (since I don't provide an example in desispec). Note that this format is likely to change and evolve as the files are used for more things:
There is also a new script to update the exposure table as a standalone script since the function was necessary for the refactor and a script wrapper seemed useful. To create the function and script I had to rename an existing set of code that reformatted the exposure_tables to add/remove columns. That old code is now called
reformat_exptables
/desi_reformat_exposure_tables
.This also renames
desispec.workflow.procfuncs
todesispec.workflow.processing
for more clarity and addsexposure_table_NIGHT.csv
,processing_table_SPECPROD-NIGHT.csv
, andunprocessed_table_SPECPROD-NIGHT.csv
to thedesispec.io.meta.findfile()
.Tests
1) Using
dry_run_level=3
showed it can run with--daily
flag set for a complete night. 1) Usingdry_run_level=3
showed it can run without--daily
flag set for a complete night given an exposure_table is available. 1) Usingdry_run_level=1
showed that it can run and produce scripts for a nominal reprocessing night. 1) Usingdry_run_level=1
showed that it can run and produce scripts using--daily
. 1) Modified exposure table to be missing flats and usingdry_run_level=1
showed that it will submit up to arcs but then raise an error because there are no flats and not in--daily
mode. 1) Usingdry_run_level=1
and--daily
mode showed that it will submit what is available before exiting without submitting an incomplete set of flats. 1) Usingdry_run_level=1
, a modified version of--daily
mode after the actual date, and a modified exposure table missing some flats and showed that it will submit what is available before raising an error 1) Modified exposure table to be missing flats, added an override file to link to a different night's calibrations, and usingdry_run_level=1
showed that it will link to the previous night and not submit any other calibrations for processing. 1) Modified exposure table to be missing flats, added an override file to link to a different night's calibrations, and allowed the linking job to be submitted and run. The new script successfully ran and produced the desired links. 1) Usingdry_run_level=1
showed that it can run and produce scripts using--daily
.There are still more tests to be performed, primarily to run the code without
dry_run_level
to ensure that outputs are equivalent to those indaily
oriron
. I will update this PR as those tests are run.