Closed sbailey closed 1 month ago
I meant to post this here instead of #2263; reposting here to keep the comment with the ticket that will be open until we fix the underlying problem.
Belt-and-suspenders-and-duct-tape: zproc knows that tile-qa needs the coadd and redrock files as input:
INFO:util.py:128:runcmd: RUNNING: desispec.scripts.tileqa.main(['-g', 'cumulative', '-n', '20220404', '-t', '23551'])
Inputs
/global/cfs/cdirs/desi/spectro/redux/jura/tiles/cumulative/23551/20220404/coadd-0-23551-thru20220404.fits
/global/cfs/cdirs/desi/spectro/redux/jura/tiles/cumulative/23551/20220404/redrock-0-23551-thru20220404.fits
...
but ideally it should also know that it needs the exposure-qa files, so that it would stop with an informative error message about which inputs are missing before even trying. Having tile-qa generate the exposure-qa was primarily useful for daily, when it needed to "catch up" on old exposures from before exposure-qa was generated automatically. By now, exposure-qa should really be generated by the pipeline, and missing it should be an error condition.
I suggest that we still fix the night vs. exposure_night bug and leave the auto-generation in place, but not rely upon it for normal operations.
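For illustration, the "stop with an informative error before even trying" behavior could look roughly like the sketch below. This is not the zproc or desispec code; `find_missing_inputs` and `require_inputs` are hypothetical names, and the only assumption is that the job already has its full list of input paths (coadd, redrock, and exposure-qa).

```python
import os


def find_missing_inputs(paths):
    """Return the paths that do not exist on disk, in input order."""
    return [p for p in paths if not os.path.exists(p)]


def require_inputs(jobname, paths):
    """Raise FileNotFoundError naming every missing input (hypothetical helper)."""
    missing = find_missing_inputs(paths)
    if missing:
        listing = "\n  ".join(missing)
        raise FileNotFoundError(
            f"{jobname}: {len(missing)} missing input(s):\n  {listing}"
        )
```

Calling something like `require_inputs("tile-qa", inputs)` before launching the job would report all missing files at once rather than failing on the first one.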
To which @akremin replied
+1 on that last point. It appears that there was an oversight in omitting exposure-qa from the list of inputs. We absolutely should have it there, which should resolve issues such as those encountered in Jura.
Although thinking about this more -- that will make it impossible for tile-qa to create the exposure-qa, since daily uses this same code. So it might not have been an oversight but rather an explicit choice to allow daily to run effectively... So including exposure-qa in the inputs may not be as clear of a "win" as I originally thought.
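One way to picture the trade-off is a check that separates inputs the job can rebuild itself (exposure-qa) from inputs it cannot (coadd, redrock), with a switch for how strictly to treat the former. This is only a sketch of the idea, not what PR #2306 or desispec does; `check_inputs` and its `strict` flag are hypothetical.

```python
import os


def check_inputs(required, regenerable, strict):
    """Classify missing inputs (hypothetical helper).

    required:    paths that must always exist (e.g. coadd, redrock)
    regenerable: paths the job could rebuild itself (e.g. exposure-qa)
    strict:      if True, a missing regenerable file is also an error

    Returns (errors, to_regenerate) as two lists of paths.
    """
    missing_required = [p for p in required if not os.path.exists(p)]
    missing_regen = [p for p in regenerable if not os.path.exists(p)]
    if strict:
        # Reprocessing-style run: nothing should be missing.
        return missing_required + missing_regen, []
    # Daily-style run: allow the job to catch up on regenerable files.
    return missing_required, missing_regen
```

With `strict=False` the daily catch-up path keeps working, while `strict=True` would turn a missing exposure-qa file into an error for a production run.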
Fixed in PR #2306; closing.
Followup to #2263:
ztile jobs require the cframe files from all nights/expids to exist to make the combined spectra files, and also the exposure-qa files from all nights/expids to make the tile-qa file. We do not track cross-night dependencies for ztile -> tilenight jobs; instead we just let them run and crash if they don't find the cframes they need, and then resubmit them. However, this procedure doesn't work if the cframes exist but the exposure-qa does not. An example is the Jura processing of tile 23551, which was observed on 20220403 and 20220404:
night vs. exposure_night but it failed to re-generate the exposure-qa and treated that exposure as QA_EFFTIME=0, causing the tile to not pass QA.
Action items:
night vs. exposure_night bug identified in #2263; and/or
For the purposes of Jura, I'm going to remove the tile-qa files and rerun tile-qa for the impacted tiles so that they will pick up the previously-missing-but-now-existing exposure-qa files.
Adding this to the Kilimanjaro dashboard.