Open ickc opened 7 months ago
The provided wrapper script for launching multi-node OpenMPI jobs, /opt/simonsobservatory/cbatch_openmpi, does not handle MPI aborts correctly.
Symptom: the job hangs after an MPI abort, leaving the node idle in the queue.
TODO: