Open sbailey opened 5 years ago
Several comments:
The directory run/scripts/night is reserved for use by the desi_night script. It places all slurm scripts and job logs for each night in the corresponding night directory.
In addition to the standard nightly jobs (which are likely run by DTS triggers or other automation), a user might run arbitrary jobs spanning arbitrary pipeline steps and combinations of nights. The slurm scripts and auxiliary files for each of these manual pipeline jobs are placed in date-stamped directories in run/scripts.
I think it is very important to separate the job logs of the nightly processing from the other bucket of arbitrary jobs a user might run. The reason for not including the nights in these manual job scripts is that the list of nights does not need to be contiguous (unlike the range of pipeline steps packed into a job, which is contiguous). The desi_pipe go command is only one way to run pipeline jobs, and it happens to break up the jobs per night in the same way as desi_night. However, this will not be true for users running desi_pipe chain to run steps manually.
I think a better solution to "where is the job log for the job that did this task?" is a combination of restored functionality in desi_pipe status (to "drill down" into tasks using the DB) and the job table, which is coming.
I find it confusing that desi_pipe go creates jobs grouped by night, but the output directory structures and filenames are named after the time they were submitted rather than the night they are processing. Additionally, it creates a set of scripts/night/YEARMMDD directories for the nights that it is processing, but those are blank.
I realize that jobs in general can span nights, but the current structure makes it difficult to track down which job and logs correspond to failures for data from a given NIGHT/EXPID. Although a job table would help somewhat, it would still seem (to me) unnecessarily complicated to have to query the DB or grep a bunch of *.tasks files to find which job went with which night.
If we name the jobs after the night they are processing, this particular case becomes much easier, but we lose flexibility in how tasks can be grouped into jobs. Maybe that's ok. Any other ideas?
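To make the current workaround concrete, the "grep a bunch of *.tasks files" lookup might look roughly like the sketch below. This is only an illustration under stated assumptions: that the *.tasks files are plain text, that task names embed the night as a YEARMMDD string, and that the files live somewhere under run/scripts. None of that is confirmed here, and find_jobs_for_night is a hypothetical helper, not part of desi_pipe.

```python
from pathlib import Path


def find_jobs_for_night(scripts_dir, night):
    """Map each job directory to the *.tasks files that mention `night`.

    Assumptions (hypothetical, not taken from the pipeline itself):
    - *.tasks files are plain text
    - task names contain the night as a YEARMMDD string
    """
    matches = {}
    # Walk the whole scripts tree, since manual jobs live in
    # date-stamped subdirectories rather than per-night ones.
    for tasks_file in sorted(Path(scripts_dir).rglob("*.tasks")):
        text = tasks_file.read_text(errors="replace")
        # Plain substring match, i.e. the equivalent of `grep -l NIGHT`.
        if night in text:
            job_dir = str(tasks_file.parent)
            matches.setdefault(job_dir, []).append(tasks_file.name)
    return matches
```

Calling find_jobs_for_night("run/scripts", night) with a YEARMMDD string would return the date-stamped job directories whose task lists mention that night. Because it is a plain substring match, it can over-match if the same digits appear elsewhere in a task name, which is part of why this lookup feels more awkward than reading the night off the directory name.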
Example structure created by desi_pipe go for a week of data:
The night/ subdirectory has seven blank YEARMMDD directories.