desihub / desispec

DESI spectral pipeline
BSD 3-Clause "New" or "Revised" License

add reservation parsing for cpu vs. gpu #2235

Closed sbailey closed 2 months ago

sbailey commented 2 months ago

This PR adds support for specifying separate reservation names for CPU and GPU jobs. Either one can be None, e.g. if we have a CPU reservation but no GPU reservation.

Examples in $CFS/desi/users/sjbailey/spectro/redux/j1

desi_proc_night -n 20240409 --dry-run-level 3 --reservation blat &> submit-20240409-singlereservation.log
desi_proc_night -n 20240409 --dry-run-level 3 --reservation blat_cpu,foo_gpu &> submit-20240409-dualreservations.log
desi_proc_night -n 20240409 --dry-run-level 3 --reservation blat_cpu,None &> submit-20240409-cpuonlyreservation.log

To check which reservation each job was submitted with, grep for "sbatch" in each of those log files.

This also works for desi_resubmit_queue_failures since that uses the same submit_batch_script function. Yay! I tested this with the ongoing j1 production:

desi_resubmit_queue_failures -n 20240403 --dry-run --reservation blat_cpu,foo_gpu > $SCRATCH/resubmit-test.log
grep sbatch /pscratch/sd/d/desi/resubmit-test.log

--> The 4 ztile jobs to resubmit had --reservation=foo_gpu

Implementation note: I opted to not enforce _cpu/_gpu endings to reservation names, which gives us more flexibility for reservation names in the future and makes it easier to support having a reservation for one but not the other.
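To illustrate the parsing behavior described above, here is a minimal sketch, not the code in this PR. The function name `parse_reservation` is hypothetical. It assumes that a single name applies to both CPU and GPU jobs, that the comma-separated order is CPU first and GPU second (as in the examples above), and that "None" is matched case-insensitively.

```python
def parse_reservation(reservation):
    """Split a --reservation value into (cpu_name, gpu_name).

    Hypothetical helper for illustration; not the desispec implementation.
    """
    if reservation is None:
        return None, None

    parts = [part.strip() for part in reservation.split(",")]
    if len(parts) == 1:
        # Assumption: a single name is used for both CPU and GPU jobs.
        parts = parts * 2
    elif len(parts) != 2:
        raise ValueError(
            f"expected 'name' or 'cpu_name,gpu_name', got {reservation!r}"
        )

    # No _cpu/_gpu suffix is enforced; only position decides which is which.
    # Assumption: the literal string "None" (any case) means "no reservation".
    cpu, gpu = (None if part.lower() == "none" else part for part in parts)
    return cpu, gpu


print(parse_reservation("blat"))              # single reservation
print(parse_reservation("blat_cpu,foo_gpu"))  # separate CPU and GPU names
print(parse_reservation("blat_cpu,None"))     # CPU reservation only
```

Because the two names are distinguished only by position, reservation names need no particular suffix, and a missing reservation on either side is just None.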

@akremin please check

sbailey commented 2 months ago

@akremin please also update this to move nightlyflat jobs to GPU nodes, and then merge when ready. Let's test this on j1 while we still have the juratest2_cpu and juratest2_gpu reservations. (EDITED to correct reservation names)