Open raffaelepotami opened 4 years ago
Update: starting the batch job with
mpiexec -np 1 R --no-save --file=foo.R
instead of R CMD BATCH or Rscript seems to work.
The execution still ends with an OMPI error, since the task just dies at that point, but at least it does run hostname on the distributed system.
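For reference, the workaround above could be wrapped in a batch script along these lines. This is only a sketch: the resource options are copied from the salloc example in this thread, and the file name run_foo.sbatch is a made-up placeholder, not the script actually used here.

```shell
# Write a hypothetical batch script (run_foo.sbatch is an assumed name).
cat > run_foo.sbatch <<'EOF'
#!/bin/bash
#SBATCH -p mpi
#SBATCH -N 2
#SBATCH -n 4
#SBATCH -t 1:00:00

# Start a single primary R process under mpiexec, as in the interactive case,
# instead of using R CMD BATCH or Rscript.
mpiexec -np 1 R --no-save --file=foo.R
EOF

# Submit it on the cluster with:
#   sbatch run_foo.sbatch
```

The only difference from a plain R CMD BATCH or Rscript job is that R itself is launched through mpiexec, which is the variant reported to work above.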
Can you try the BiocParallel::BatchToolsParam() interface on your SLURM cluster?
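A rough sketch of what such a test might look like, assuming the default SLURM template shipped with BiocParallel fits the cluster. The file name bt_test.R and the worker count of 4 are placeholders. With BatchToolsParam the R session should submit the SLURM jobs itself, so it would be started directly on a node where sbatch is available rather than through mpiexec.

```shell
# Write a small, hypothetical R test script (bt_test.R is an assumed name).
cat > bt_test.R <<'EOF'
library(BiocParallel)

# Assumption: cluster = "slurm" with the default template is suitable here;
# site-specific settings (partition, walltime) may need a custom template.
param <- BatchToolsParam(workers = 4, cluster = "slurm")

# Ask each worker for the name of the node it runs on.
res <- bplapply(seq_len(4), function(i) Sys.info()[["nodename"]],
                BPPARAM = param)
print(unlist(res))
EOF

# Run it from a login or submit node with:
#   Rscript bt_test.R
```

If the workers report different node names, the jobs are being distributed by the scheduler without going through MPI at all.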
Hello everyone, we are having trouble running BiocParallel within our SLURM cluster environment.
The foo.R script we are trying to run is
If we request an interactive job allocation, for example with
salloc -p mpi -N 2 -n 4 -t 1:00:00
and then start R with:
mpiexec -np 1 R --no-save
and run the above script from this interactive shell, we get the expected result:
However, if we try to run the same R script from within an sbatch job with:
The execution hangs for several seconds and eventually fails with the MPI error:
Does anyone have any idea of why the primary R process is failing to start the other tasks?
Thank you, Raffaele