nv-legate / legate.core

The Foundation for All Legate Libraries
https://docs.nvidia.com/legate/24.06/
Apache License 2.0

Allow plugging in to user's MPI install #953

Open manopapad opened 4 weeks ago

manopapad commented 4 weeks ago

Opening this issue on behalf of @tylerjereddy

Tyler would like to swap in the openmpi provided by his supercomputer's module system. Because the Legate packages use conda's openmpi, he is seeing a conflict between the UCX version provided by the conda env and the one the module expects:

[1723679300.245558] [cn1:3244158:0]     ucp_context.c:2190 UCX  WARN  UCP API version is incompatible: required >= 1.18, actual 1.17.0 (loaded from /path/to/libucp.so.0)
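
For anyone triaging similar logs, here is a minimal sketch that pulls the required version, the actual version, and the library path out of a warning like the one above. The helper name `parse_ucx_mismatch` is hypothetical (it is not part of Legate or UCX), and the regex assumes the message wording matches the line quoted in this issue.

```python
import re

# Assumption: the warning text follows the wording quoted in this issue.
UCX_MISMATCH = re.compile(
    r"UCP API version is incompatible: "
    r"required >= (?P<required>\d+(?:\.\d+)*), "
    r"actual (?P<actual>\d+(?:\.\d+)*) "
    r"\(loaded from (?P<path>[^)]+)\)"
)


def _version_tuple(text):
    """Turn '1.17.0' into (1, 17, 0) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def parse_ucx_mismatch(line):
    """Hypothetical helper: return (required, actual, path), or None if no match."""
    match = UCX_MISMATCH.search(line)
    if match is None:
        return None
    return (
        _version_tuple(match["required"]),
        _version_tuple(match["actual"]),
        match["path"],
    )


line = (
    "[1723679300.245558] [cn1:3244158:0]     ucp_context.c:2190 UCX  WARN  "
    "UCP API version is incompatible: required >= 1.18, actual 1.17.0 "
    "(loaded from /path/to/libucp.so.0)"
)
required, actual, path = parse_ucx_mismatch(line)
print(f"required >= {required}, actual {actual}, loaded from {path}")
```

The "loaded from" path is usually the most useful field here, since it shows which `libucp.so.0` the process actually picked up at run time.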

In general, we are trying to remove all uses of MPI from the Legate codebase, but we will likely not be able to avoid it for at least GASNet bootstrapping (necessary for Slingshot11).

We are working on a shim that users can compile on their own machine to plug in to their local MPI install. This may be necessary on HPC clusters.