NicoMittenzwey opened 4 months ago
Might be related to Cgroup v2. This has been supported by hwloc since 2.2 but OMPI 4.1 seems to still have hwloc 2.0.
It's unlikely that we'll update the hwloc in Open MPI v4.1.x.
Your workaround is fine (use the system hwloc). You might also want to try bumping up to Open MPI v5.0.x, which will use the system-provided hwloc -- if available -- by default.
Thanks. Yes, we actually also installed Open MPI v5.0.2 in parallel. However, some applications run significantly faster using HCOLL, and with Open MPI v5 we ran into #10718.
We also try to stick with vendor-optimized environments for support reasons, and Nvidia HPC-X ships Open MPI 4.1 with the internal hwloc. So this issue also serves as documentation of our findings, in the hope that search engines will index it and others won't have to investigate for hours to find the root cause.
Not sure if it's exactly the same issue, but I'm hitting the same error on our cluster with this configuration:
If there are other jobs running on the node I end up on, I get:
```
--------------------------------------------------------------------------
Open MPI tried to bind a new process, but something went wrong. The
process was killed without launching the target application. Your job
will now abort.

  Local host:        <redacted>
  Application name:  /usr/bin/hostname
  Error message:     hwloc_set_cpubind returned "Error" for bitmap "0"
  Location:          rtc_hwloc.c:382
--------------------------------------------------------------------------
```
This happens even if I use `--ntasks-per-node`.

A workaround is to use `srun` instead of `mpirun`, or to use `mpirun --bind-to none`.
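For anyone triaging a similar setup, it may help to check up front whether the node is on cgroup v2 and which hwloc version the system provides. A small sketch, assuming a standard Linux layout (`/sys/fs/cgroup` mount point; `hwloc-info` from the hwloc package may not be installed):

```shell
# cgroup v2 mounts a single unified hierarchy at /sys/fs/cgroup:
# "cgroup2fs" indicates cgroup v2, "tmpfs" the legacy v1 layout.
stat -fc %T /sys/fs/cgroup

# Report the system hwloc version, if the hwloc tools are installed
# (hwloc >= 2.2 is needed for full cgroup v2 support).
command -v hwloc-info >/dev/null 2>&1 && hwloc-info --version || true
```

If the first command prints `cgroup2fs` while Open MPI carries the bundled hwloc 2.0, that matches the failure mode discussed above.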
System

- AlmaLinux 9.3
- OpenMPI 4.1.7 from HPC-X 2.18.0
- Nvidia InfiniBand NDR
- Slurm 23.11
Issue

We are running Slurm 23.11 on AlmaLinux 9.3 with `TaskPlugin=task/affinity,task/cgroup` and OpenMPI 4.1.7 from Mellanox / Nvidia HPC-X 2.18.0. When starting jobs with fewer than the maximum number of processes per node and NOT defining `--ntasks-per-node`, OpenMPI 4.1.7 will crash as it tries to bind processes to cores which are not available to it.

Workaround
Recompiling OpenMPI and forcing it to use the system hwloc resolves this issue (you might need `dnf install hwloc-devel` first):

```
./configure [...] --with-hwloc=/usr/ && make && make install
```
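After rebuilding, it is worth confirming that Open MPI actually picked up the external hwloc. A hedged sketch using `ompi_info`; in my understanding of the 4.1 series, the bundled copy appears as the `hwloc201` component and the system library as `external`, but the exact component names may differ by release:

```shell
# Show which hwloc component Open MPI was built with:
#   "MCA hwloc: external"  -> system hwloc is used
#   "MCA hwloc: hwloc201"  -> bundled internal copy (hwloc 2.0.x)
# Guarded so the snippet is a no-op where Open MPI is not on PATH.
if command -v ompi_info >/dev/null 2>&1; then
  ompi_info | grep -i "MCA hwloc"
fi
```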