Closed simonpintarelli closed 8 months ago
FYI, @msimberg noticed performance degradation for DLA-Future on LUMI compared to Alps and narrowed down the issue to LUMI using XPMEM for intra-node communication.
The workaround is to either avoid linking with xpmem (in which case MPI will use CMA automatically) or explicitly request the fallback CMA mode with `MPICH_SMP_SINGLE_COPY_MODE=CMA`.
He might have some additional details.
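For anyone who wants to try the runtime workaround, a minimal sketch follows. Only the variable name and the `CMA` value come from this thread; the launch line in the comment is a hypothetical placeholder, not a command anyone here reported running.

```shell
# Request the CMA single-copy path for everything launched from this shell.
# Variable name and value are the ones quoted in the comment above.
export MPICH_SMP_SINGLE_COPY_MODE=CMA

# Hypothetical launch line: substitute your own launcher, rank count, and binary.
# srun -N 1 -n 2 ./my_mpi_app

# Confirm what child processes will inherit.
echo "MPICH_SMP_SINGLE_COPY_MODE=${MPICH_SMP_SINGLE_COPY_MODE}"
```

Setting it in the job script rather than in the recipe keeps the choice per run, which is what makes the runtime switch useful for testing.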
Would it make sense to make this a variant? I'd be 100% ok even with having it on by default, but it'd give us the option to disable it already in the spack recipe if we find that it causes a degradation also on clariden. I think this PR is also important to get people to test what xpmem does to their application performance.
That is to say: no objections at all to merging this.
HPE and LUMI engineers are aware that we see a degradation with DLA-Future, and have offered to look into the problem.
@msimberg thanks for the input. I didn't know about the LUMI case. CPE also links xpmem; we missed this previously. The mode can be selected at runtime using `MPICH_SMP_SINGLE_COPY_MODE`, so I don't think it's necessary to make it a variant.
On clariden (I tested on eiger) it's the reverse: performance for osu_bw is much better if xpmem is linked (and used by default).
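Since the effect seems to differ between systems and applications, a side-by-side run is probably the quickest check. The helper below is only a sketch: `compare_single_copy_modes` is a made-up name, and `XPMEM` as an explicit value is my assumption about what Cray MPICH accepts (only `CMA` is mentioned in this thread), so check the MPI man page on your system.

```shell
# Run the same command once per single-copy mode so the results can be compared.
# Function name is hypothetical; the XPMEM value is an assumption, CMA is from the thread.
compare_single_copy_modes() {
    for mode in XPMEM CMA; do
        echo "== MPICH_SMP_SINGLE_COPY_MODE=${mode} =="
        # env sets the variable only for this one invocation of the command.
        env MPICH_SMP_SINGLE_COPY_MODE="${mode}" "$@"
    done
}
```

A hypothetical invocation would be `compare_single_copy_modes srun -N 1 -n 2 ./osu_bw`, with the launcher arguments and benchmark path adjusted to the local setup.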
Use `patchelf --add-needed` to link libxpmem.so to the MPI libraries.
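As a rough illustration of what that step amounts to when done by hand, here is a small sketch. The function name `add_xpmem_needed` and the example path are hypothetical, and the exact library name to add may differ on a given system (for instance a versioned soname), so treat `libxpmem.so` as the name used in this PR rather than a guaranteed value.

```shell
# Add libxpmem.so as a DT_NEEDED entry of an MPI shared library, then list the result.
# Function name is hypothetical; the library path is supplied by the caller.
add_xpmem_needed() {
    lib="$1"
    # --add-needed appends a dependency to the ELF dynamic section of the file.
    patchelf --add-needed libxpmem.so "$lib" || return 1
    # --print-needed shows the dependency list so the change can be verified.
    patchelf --print-needed "$lib"
}
```

For example, `add_xpmem_needed /path/to/libmpi.so` (placeholder path) should list `libxpmem.so` among the printed dependencies if the patch succeeded.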