intel / mpi-benchmarks

Hangs in some IMB-RMA tests when I_MPI_ROOT is set and a non-oneAPI-provided libfabric is used #48

Open charlesshereda opened 11 months ago

charlesshereda commented 11 months ago

The IMB-RMA tests Accumulate, Get_accumulate, Fetch_and_op, and Compare_and_swap all hang for us if we use a libfabric version other than the one provided with oneAPI.

If, however, we unset the env var I_MPI_ROOT, all the tests complete normally.

We discovered this because the script we source for oneAPI MPI sets that env var:

source /opt/intel/oneapi/mpi/latest/env/vars.sh release
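
For anyone trying to reproduce the workaround described above, here is a minimal sketch, not a confirmed fix. The helper name `run_without_impi_root` is hypothetical, and the process count and `./IMB-RMA` path in the usage comment are assumptions. It runs a single command with I_MPI_ROOT removed from that command's environment only, so the rest of the oneAPI setup in the calling shell stays as sourced.

```shell
# Hypothetical helper (not part of IMB or oneAPI): run a command in a
# subshell with I_MPI_ROOT unset, leaving the calling shell untouched.
run_without_impi_root() {
    (
        unset I_MPI_ROOT
        "$@"
    )
}

# Example usage (process count and binary path are assumptions):
#   source /opt/intel/oneapi/mpi/latest/env/vars.sh release
#   run_without_impi_root mpirun -n 2 ./IMB-RMA Accumulate Get_accumulate Fetch_and_op Compare_and_swap
```

Running the same four benchmarks with and without the wrapper should show whether I_MPI_ROOT is the only variable that matters in a given setup.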

JuliaRS commented 5 months ago

@charlesshereda Which libfabric and oneAPI versions did you use?

charlesshereda commented 1 week ago

@JuliaRS Sorry, I never saw the notification for your comment and wasn't actively checking this issue. The libfabric version was a custom build of ours; we're developing the OPX provider, so you should be able to see it from the latest release of libfabric (e.g., 1.21).

When we discovered this, we were testing with IMPI 2021.9.0, so whichever oneAPI release includes that IMPI version. We haven't tried with the latest IMPI, which I understand just came out in June.