desihub / desispec

DESI spectral pipeline
BSD 3-Clause "New" or "Revised" License

optionally parallelize group_spectra #2291

sbailey closed this 1 month ago

sbailey commented 1 month ago

This PR adds optional MPI parallelization to desi_group_spectra. I have not yet integrated this into desi_zproc because that will require more plumbing on the zproc side, but at minimum this provides a standalone way to generate large healpix if needed. I'm therefore opening this as an independent PR and will do the zproc integration in a followup PR.

e.g. one of our largest and most problematic healpix in jura was special/other 27258, with over 4000 input frame files. This branch can group those spectra in 11 minutes with

HPIX=27258
time srun -n 64 -c 4 desi_group_spectra --mpi --healpix $HPIX \
  --expfile $DESI_SPECTRO_REDUX/$SPECPROD/healpix/special/other/272/$HPIX/hpixexp-special-other-$HPIX.csv \
  -o $SCRATCH/temp/spectra-special-other-$HPIX.fits -c $SCRATCH/temp/coadd-special-other-$HPIX.fits

The equivalent took hours in jura with the serial code in main. FYI, srun -n 128 -c 2 ... also works without blowing memory, but isn't actually any faster (likely saturating node I/O).
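For context, the "optional MPI" structure described here can be sketched as below. This is an illustrative pattern, not the actual desispec code: the function names and the doubling "work" are stand-ins. The key idea is that comm=None means serial mode (running without --mpi), and only rank 0 ends up holding the combined result.

```python
# Illustrative sketch of an optionally-MPI-parallel function:
# comm=None runs serially; with a communicator, work is split across
# ranks and gathered back to rank 0.

def group_tasks(tasks, comm=None):
    """Process tasks, optionally distributed over an MPI communicator."""
    rank = comm.rank if comm is not None else 0
    size = comm.size if comm is not None else 1

    # Round-robin split of the work across ranks (serial mode: all tasks).
    mytasks = tasks[rank::size]
    myresults = [t * 2 for t in mytasks]  # stand-in for real per-frame work

    if comm is not None:
        # Gather per-rank results; only rank 0 receives the full list.
        allresults = comm.gather(myresults, root=0)
        if rank == 0:
            myresults = [r for chunk in allresults for r in chunk]
        else:
            myresults = None

    return myresults

# Serial usage, equivalent to running without --mpi:
print(group_tasks([1, 2, 3]))  # [2, 4, 6]
```

The same entry point then works under both srun with MPI and a plain serial invocation.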

On smaller healpix (e.g. sv1 dark 7015) I also verified that this still works in non-MPI mode and that the outputs are data-identical in both cases (header timestamps and dependency versions differ, which is expected).

Most of the changes are MPI boilerplate ("if rank == 0" sort of stuff), plus factoring out the frame file reading and filtering into a separate function so that it can be parallelized using

from mpi4py.futures import MPICommExecutor

with MPICommExecutor(comm, root=0) as pool:
    if pool is not None:  # non-root ranks get None from MPICommExecutor
        frames = list(pool.starmap(_read_framefile, read_args))

This pattern could also be used to parallelize with multiprocessing if needed, but I didn't add that option.