desihub / desispec

DESI spectral pipeline
BSD 3-Clause "New" or "Revised" License

dashboard: highlighting failures vs. not yet run #1090

Open sbailey opened 3 years ago

sbailey commented 3 years ago

In the dashboard exposure rows, find a way to distinguish "has not yet run" from "ran and crashed and didn't produce the right output". This may require updates to the $DESI_SPECTRO_REDUX/$SPECPROD/processing_tables or other hints from the pipeline so that the dashboard knows what to do.

zkdtc commented 3 years ago

In the processing table, there is a STATUS flag indicating whether a job is finished, running, or suspended. If a task is finished but its output files are incomplete, then it "ran and crashed". If the flag is running or suspended, then it "has not yet run". How should the dashboard distinguish the two statuses? Two font colors? Bold?
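A minimal sketch of that rule, for discussion only. The function name `classify_row`, the returned labels, and the lowercase status strings are assumptions made for illustration; the actual STATUS values in the processing tables may be spelled differently, and the real dashboard code is not shown here.

```python
def classify_row(status, expected_files, existing_files):
    """Classify one dashboard row from its STATUS flag and output files.

    status         -- STATUS string from the processing table
                      (assumed here: 'finished', 'running', 'suspended')
    expected_files -- filenames the job should have produced
    existing_files -- filenames actually found on disk
    """
    missing = set(expected_files) - set(existing_files)
    status = status.strip().lower()

    if status == "finished":
        # Job ended: any missing output means it ran and crashed.
        return "ran_and_crashed" if missing else "complete"
    if status in ("running", "suspended"):
        # Job has not ended, so missing files are not (yet) a failure.
        return "not_yet_run"
    # Anything else is left for a human to look at.
    return "unknown"
```

The dashboard could then map each label to a font color or weight, so that only "ran_and_crashed" rows are highlighted as failures.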

sbailey commented 3 years ago

Suggestion for consideration: define a "known-missing.txt" file that could be put in any production directory, listing filenames and reasons for output files that are known to be missing but aren't otherwise flagged as bad by the exposures table. The dashboard could use this to stop flagging missing files that we know won't be recovered in this production, and the file could also serve as simple human-readable documentation of "why is file XXX missing?". We would use this for:

  1. problems that are not going to be recovered in this prod, but we're not yet willing to completely discard the camera / petal / exposure for future prods
  2. handling the case where some petals can get through some steps but other petals can get further, which isn't supported by the current exposures table format. e.g. missing standards on some petals but not others. Longer term we should find a way to propagate this knowledge to future prods; this "known-missing.txt" suggestion isn't trying to address that.

e.g.

cframe-b0-00012345.fits  No stdstars on petal 0
cframe-r0-00012345.fits  No stdstars on petal 0
cframe-z0-00012345.fits  No stdstars on petal 0
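A rough sketch of how the dashboard might read such a file, assuming each line is a filename followed by whitespace and a free-text reason, as in the example above. The function names are hypothetical, and treating blank lines and `#` lines as ignorable is an extra assumption not stated in the proposal.

```python
def parse_known_missing(text):
    """Parse known-missing.txt content into {filename: reason}."""
    known = {}
    for line in text.splitlines():
        line = line.strip()
        # Assumption: blank lines and '#' comment lines are skipped.
        if not line or line.startswith("#"):
            continue
        # First whitespace-separated token is the filename;
        # everything after it is the reason.
        parts = line.split(None, 1)
        filename = parts[0]
        reason = parts[1].strip() if len(parts) > 1 else ""
        known[filename] = reason
    return known


def unexplained_missing(missing_files, known):
    """Return the missing files that are not listed as known-missing."""
    return [f for f in missing_files if f not in known]
```

For example, given the three cframe lines above, `unexplained_missing(["cframe-b0-00012345.fits", "cframe-b1-00012345.fits"], known)` would leave only the petal 1 file to be flagged.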

@akremin @zkdtc and others: thoughts?