Running using SLURM - Githubissues

UltraNest version: 3.4.4
Python version: 3.8.10
Operating System: running in a Singularity container

Description

I am trying to run UltraNest on an HPC cluster using SLURM. I submit an sbatch script that requests the allocation, and the actual invocation is done by a line in that script: srun singularity .... python run_ultranest.py ...

The job crashes with this log:

*** An error occurred in MPI_Init_thread
*** on a NULL communicator
*** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
***    and potentially your MPI job)
[hc201:95150] Local abort before MPI_INIT completed completed successfully, but am not able to aggregate error messages, and not able to guarantee that all other processes were killed!
srun: error: hc201: task 0: Exited with exit code 1
srun: launch/slurm: _step_signal: Terminating StepId=664978.0

I am able to run the script on my own computer, both inside and outside the container, and also with mpirun.
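Since the failure happens in MPI_Init_thread under srun but not under mpirun, one way to narrow it down is to compare what each launcher hands to the process. The following is a minimal diagnostic sketch, not a fix: the function name describe_launch_environment and the selection of environment variables are illustrative assumptions, and which variables actually appear depends on how SLURM and MPI are configured on the cluster. The MPI check assumes mpi4py is the MPI binding in use.

```python
import os

# Environment variables commonly set by launchers. Which ones appear depends
# on the site's SLURM and MPI setup (an assumption, not a guarantee).
LAUNCHER_VARS = {
    "slurm": ["SLURM_PROCID", "SLURM_NTASKS", "SLURM_JOB_ID"],
    "openmpi_mpirun": ["OMPI_COMM_WORLD_RANK", "OMPI_COMM_WORLD_SIZE"],
    "pmi": ["PMI_RANK", "PMI_SIZE"],
    "pmix": ["PMIX_RANK", "PMIX_NAMESPACE"],
}


def describe_launch_environment(environ=None):
    """Return {group: {var: value}} for the launcher variables that are set."""
    environ = os.environ if environ is None else environ
    found = {}
    for group, names in LAUNCHER_VARS.items():
        present = {name: environ[name] for name in names if name in environ}
        if present:
            found[group] = present
    return found


if __name__ == "__main__":
    report = describe_launch_environment()
    if not report:
        print("no launcher variables found")
    for group, values in report.items():
        print(group, values)

    # Importing mpi4py.MPI initializes MPI by default. If initialization
    # fails, the MPI library aborts the process itself, so the lines above
    # are printed first and no Python exception is raised.
    try:
        from mpi4py import MPI
        comm = MPI.COMM_WORLD
        print("MPI rank", comm.Get_rank(), "of", comm.Get_size())
    except ImportError:
        print("mpi4py is not installed in this environment")
```

Running this in place of run_ultranest.py, once through the same srun line and once through mpirun, shows whether the process started by srun receives any PMI or PMIx variables at all. If only the SLURM group shows up under srun, the MPI library inside the container may have no way to connect to the launcher, which would be consistent with the abort in the log above.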

What I Did

Not sure what I can do.

Thanks in advance.

JohannesBuchner / UltraNest

Running using SLURM #65
