Sbte closed this 1 year ago
It depends. The N+1 option is safe if you can do it. Oversubscribing is also an option, but as you say it could be bad for performance: you need to make sure the MPI is not using any CPU while idling. How to do that depends on the MPI; Open MPI, for example, has --mca mpi_yield_when_idle 1. With that set it should be OK.
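For the oversubscribe route, here is a rough sketch of how the launch line could be assembled. It assumes Open MPI's mpirun, where --oversubscribe is the flag that allows more processes than allocated slots. The function name build_launch_command and the script name run_pop.py are made up for illustration, so adapt them to your setup.

```python
import shlex


def build_launch_command(script, oversubscribe=True, yield_when_idle=True):
    """Return an mpirun argument list for starting the parent script.

    Assumes Open MPI; other MPI implementations use different options.
    """
    # Only the parent script is started here; the workers are spawned later.
    cmd = ["mpirun", "-n", "1"]
    if oversubscribe:
        # Let the spawned workers share a slot with the parent process.
        cmd.append("--oversubscribe")
    if yield_when_idle:
        # Ask idle ranks to yield the CPU instead of busy-waiting.
        cmd += ["--mca", "mpi_yield_when_idle", "1"]
    cmd += ["python", script]
    return cmd


if __name__ == "__main__":
    # run_pop.py is a hypothetical script name.
    print(shlex.join(build_launch_command("run_pop.py")))
```

Whether the idle parent really stays out of the way is something to check on your own cluster, for example by comparing timings against an N+1 run.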
If I want to run a code (in my case POP from OMUSE) on N cores (number_of_workers), say 128, it seems like I need to request N+1 tasks in Slurm, so 129. AMUSE will call MPI.Spawn 128 times, but the original process already uses 1 task, which makes the total 129.
What's the proper way to solve this? Is it really not possible to reuse the original process for a worker?
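To make the counting concrete, here is a minimal sketch of the N+1 arithmetic and the Slurm request that follows from it. The helper names slurm_tasks_needed and sbatch_header are hypothetical, and the sketch assumes exactly one parent process next to the spawned workers.

```python
def slurm_tasks_needed(number_of_workers, parent_processes=1):
    """Total MPI tasks: the spawned workers plus the original process."""
    if number_of_workers < 1:
        raise ValueError("number_of_workers must be at least 1")
    return number_of_workers + parent_processes


def sbatch_header(number_of_workers):
    """Format the matching #SBATCH line for a job script."""
    return f"#SBATCH --ntasks={slurm_tasks_needed(number_of_workers)}"


if __name__ == "__main__":
    # 128 workers plus the one parent process.
    print(sbatch_header(128))
```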