ComputeCanada / software-stack-config

Disable ofi (libfabric) with Open MPI 4.0 on IB #31

Closed bartoldeman closed 2 years ago

bartoldeman commented 2 years ago

This was already done for 4.1 in commit b7cb7687. It avoids strange warnings when using the 4.0 mpirun with executables compiled against Open MPI 4.1 (which is allowed).

mboisson commented 2 years ago

That will change the behavior of OMPI 4.0 in all cases though, no?

bartoldeman commented 2 years ago

Yes, but only on InfiniBand, not Omni-Path/Ethernet. On IB clusters the ucx PML (point-to-point management layer) is normally used, so the MTLs (used with the cm PML) and the BTLs (used with the ob1 PML) are ignored. However, the plugins (e.g. $EBROOTOPENMPI/lib/openmpi/mca_mtl_ofi.so) are still loaded into memory and run some auto-detection during initialization that sometimes produces warnings. Explicitly disabling them avoids that and also slightly improves startup time.

The ofi btl doesn't exist in Open MPI 4.0, but disabling it anyway doesn't hurt: Open MPI doesn't complain about its non-existence.
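
For illustration, here is a minimal sketch of what "disabling ofi on IB only" could look like when expressed as Open MPI MCA environment variables. It is not the content of the actual commit. It relies on two Open MPI conventions: an MCA parameter can be set through an `OMPI_MCA_<name>` environment variable, and a leading `^` turns a component list into an exclusion list. The function names and the `interconnect` argument are hypothetical; how the software stack really detects the fabric is not shown in this thread.

```python
def exclude_component(env, framework, component):
    """Return a copy of env with `component` excluded from an MCA framework.

    Hypothetical helper: it only edits a dict of environment variables.
    """
    key = f"OMPI_MCA_{framework}"
    new_env = dict(env)
    current = new_env.get(key, "")
    if not current:
        # Nothing set yet: exclude just this component.
        new_env[key] = f"^{component}"
    elif current.startswith("^"):
        # Existing exclusion list: append the component if it is missing.
        excluded = current[1:].split(",")
        if component not in excluded:
            excluded.append(component)
        new_env[key] = "^" + ",".join(excluded)
    # Otherwise the value is an explicit include list. Open MPI does not
    # allow mixing include and exclude entries, so it is left untouched.
    return new_env


def disable_ofi(env, interconnect):
    """Exclude the ofi mtl and btl, but only on InfiniBand.

    `interconnect` is an assumed, caller-supplied string.
    """
    if interconnect != "infiniband":
        return dict(env)
    for framework in ("mtl", "btl"):
        env = exclude_component(env, framework, "ofi")
    return env


if __name__ == "__main__":
    for name, value in sorted(disable_ofi({}, "infiniband").items()):
        print(f"export {name}='{value}'")
```

An explicit include list is left alone on purpose: if it does not name ofi, the component is already unselected, and if it does, the user asked for it.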

mboisson commented 2 years ago

ok. LGTM then. Feel free to merge.