[Open] migueldiascosta opened 1 year ago
@Micket @bartoldeman thoughts on this?
@migueldiascosta there is no included HPL.dat here, I think?
@Micket I meant the HPL.dat included in the HPL installation, but it doesn't really matter; I just mentioned HPL as a ready-to-use MPI program that could be used to debug the btl selection with -mca btl_base_verbose 100 (and not as a benchmark).
Now, one reason this may not be so widespread is that the btls are only used with the ob1 pml; they are bypassed completely when using the ucx pml (e.g., slide 33 and onward of https://www.open-mpi.org/video/general/easybuild_tech_talks_01_OpenMPI_part2_20200708.pdf, these "easybuild tech talks" are really useful :) ). So, in order to check whether smcuda is being selected when ob1 is used, it may be necessary to also pass -mca pml ob1.
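Something like the following sketch, with ./my_mpi_app standing in for whatever MPI binary is at hand:

```
# force the ob1 pml (so the btls are actually used) and turn up btl verbosity
# to see which btl is selected for intra-node communication
mpirun -np 4 --mca pml ob1 --mca btl_base_verbose 100 ./my_mpi_app 2>&1 | grep Using
```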
When the ob1 pml is used and btl_base_verbose is set, with foss >= 2021a I see

mca: bml: Using smcuda btl for send to ... on node ...

but with foss/2020b I see

mca: bml: Using vader btl for send to ... on node ...

so it does seem this is a side-effect of building with --with-cuda=internal (?)
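As a cross-check, ompi_info can report whether a given build was configured with CUDA support (a sketch; the parameter name below is the one used by recent Open MPI releases):

```
# prints "true" for a CUDA-enabled build, "false" otherwise
ompi_info --parsable --all | grep mpi_built_with_cuda_support:value
```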
@bartoldeman Any thoughts on this? Should we try to prevent smcuda from being used on non-GPU systems?
With the OpenMPI included in foss/2022a, I'm seeing the smcuda btl being selected on non-GPU nodes, with an adverse impact on performance compared to vader.

I initially saw this when running perf top, but it can also be checked by e.g. passing -mca btl_base_verbose 100 to mpirun (or setting the corresponding environment variable) and looking for "Using smcuda".

Does anyone else see this behaviour?
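(The environment variable equivalent of that mpirun flag, for anyone who wants to check the same way, would be roughly the sketch below, with ./my_mpi_app as a placeholder binary:)

```
# equivalent to passing --mca btl_base_verbose 100 on the mpirun command line
export OMPI_MCA_btl_base_verbose=100
mpirun -np 4 ./my_mpi_app 2>&1 | grep Using
```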
Edit: seen this on a few different non-GPU systems now; the simplest way to check is probably to load HPL and, with the included HPL.dat file, run

mpirun -np 4 --mca btl_base_verbose 100 xhpl 2>&1 | grep Using
Edit 2: I'm working around the issue by setting the environment variable OMPI_MCA_btl=^smcuda on non-GPU nodes, but if this happens to other people we need a better solution.
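For reference, a sketch of the workaround as I'm applying it (per job script / user environment), plus a possible site-wide variant via Open MPI's default MCA parameter file under the installation prefix; the latter is untested here and would affect every node that uses that installation:

```
# per-job / per-user workaround: exclude the smcuda btl so vader is used instead
export OMPI_MCA_btl=^smcuda

# possible site-wide variant (untested sketch): set the same default in the MCA
# parameter file shipped with the Open MPI installation, e.g.
#   echo "btl = ^smcuda" >> $EBROOTOPENMPI/etc/openmpi-mca-params.conf
```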