desihub / desispec

DESI spectral pipeline
BSD 3-Clause "New" or "Revised" License

pipeline script/ directory organization #800

Open sbailey opened 5 years ago

sbailey commented 5 years ago

I find it confusing that desi_pipe go creates jobs grouped by night, but the output directories and filenames are named after the time the jobs were submitted rather than the night they are processing. Additionally, it creates a set of scripts/night/YEARMMDD directories for the nights that it is processing, but those are empty.

I realize that jobs in general can span nights, but the current structure makes it difficult to track down which job and logs correspond to failures for data from a given NIGHT/EXPID. Although a job table would help somewhat, it would still seem (to me) unnecessarily complicated to have to query the DB or grep a bunch of *.tasks files to find which job went with which night.
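To make the "grep a bunch of *.tasks files" workaround concrete, here is a minimal sketch that builds a night-to-job index from the script directories. It rests on assumptions not confirmed in this thread: that each job directory holds one or more `*.tasks` files, and that every task name listed in them embeds the night as an 8-digit YEARMMDD token. The function name `nights_by_job` is hypothetical and is not part of desispec.

```python
import re
from collections import defaultdict
from pathlib import Path

# Assumption: nights appear in task names as standalone 8-digit tokens
# starting with "20" (YEARMMDD), e.g. "..._20190712_...".
NIGHT = re.compile(r"(?<!\d)(20\d{6})(?!\d)")


def nights_by_job(scripts_dir):
    """Map each night found in any */*.tasks file to the job directories that list it.

    Hypothetical helper: the layout and contents of the .tasks files are assumed.
    """
    index = defaultdict(set)
    for tasks_file in Path(scripts_dir).glob("*/*.tasks"):
        job = tasks_file.parent.name
        for night in NIGHT.findall(tasks_file.read_text()):
            index[night].add(job)
    # Sort the job names so the result is deterministic.
    return {night: sorted(jobs) for night, jobs in index.items()}
```

Even if the assumptions hold, this has to reread every tasks file on each lookup, which is the overhead described above.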

If we name the jobs after the night they are processing, this particular case becomes much easier, but we lose flexibility in how tasks can be grouped into jobs. Maybe that's ok. Any other ideas?

Example structure created by desi_pipe go for a week of data:

fiberflat-cframe_20190719-222451/
fiberflat-cframe_20190719-222527/
fiberflat-cframe_20190719-222602/
fiberflat-cframe_20190719-222638/
fiberflat-cframe_20190719-222654/
fiberflat-cframe_20190719-222711/
fiberflat-fiberflatnight_20190719-222617/
night/
preproc-psfnight_20190719-222437/
preproc-psfnight_20190719-222502/
preproc-psfnight_20190719-222538/
preproc-psfnight_20190719-222607/
preproc-psfnight_20190719-222623/
preproc-psfnight_20190719-222643/
preproc-psfnight_20190719-222659/
spectra-redshift_20190719-222728/
traceshift-extract_20190719-222447/
traceshift-extract_20190719-222520/
traceshift-extract_20190719-222544/
traceshift-extract_20190719-222613/
traceshift-extract_20190719-222628/
traceshift-extract_20190719-222650/
traceshift-extract_20190719-222706/

The night/ subdirectory has seven empty YEARMMDD directories.
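As an illustration of what the directory names in the listing above encode, the sketch below parses one into its step range and submission time. The pattern `<first step>-<last step>_<YYYYMMDD>-<HHMMSS>` is inferred only from this listing and may not cover every name desi_pipe produces; `parse_job_dir` is a hypothetical helper, not a desispec function.

```python
import re
from datetime import datetime

# Pattern inferred from the listing: <first>-<last>_<YYYYMMDD>-<HHMMSS>, optional trailing "/".
JOB_DIR = re.compile(r"^(?P<first>[a-z]+)-(?P<last>[a-z]+)_(?P<stamp>\d{8}-\d{6})/?$")


def parse_job_dir(name):
    """Return (first_step, last_step, submit_time), or None if the name does not match."""
    m = JOB_DIR.match(name)
    if m is None:
        return None
    submitted = datetime.strptime(m.group("stamp"), "%Y%m%d-%H%M%S")
    return m.group("first"), m.group("last"), submitted


print(parse_job_dir("fiberflat-cframe_20190719-222451/"))
print(parse_job_dir("night/"))  # does not match the job-directory pattern
```

Every job directory in the listing carries the same submission date, 20190719, so nothing in the name distinguishes the nights being processed.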

tskisner commented 5 years ago

Several comments:

  1. The directory run/scripts/night is reserved for use by the desi_night script. It places all slurm scripts and job logs for each night in the corresponding night directory.

  2. In addition to the standard nightly jobs (which are likely run by DTS triggers or other automation), a user might run arbitrary jobs spanning arbitrary pipeline steps and combinations of nights. The slurm scripts and auxiliary files for each of these manual pipeline jobs are placed in date-stamped directories in run/scripts.

I think it is very important to separate the job logs of the nightly processing from the other bucket of arbitrary jobs a user might run. The reason for not including the nights in these manual job scripts is that the list of nights does not need to be contiguous (unlike the range of pipeline steps packed into a job, which is contiguous). The desi_pipe go command is only one way to run pipeline jobs and it happens to break up the jobs per night in the same way as desi_night. However, this will not be true for users running desi_pipe chain to manually do some stuff.

I think a better solution to "where is the job log for the job that did this task?" is a combination of restored functionality of desi_pipe status (to "drill down" into tasks using the DB) and the job table, which is coming.