open-mpi / ompi

Open MPI main development repository
https://www.open-mpi.org

Expand CUDA support and fix documentation to account for all cuda dependent components. #12279

Open christgau opened 9 months ago

christgau commented 9 months ago

Background information

What version of Open MPI are you using?

v5.0.1

Describe how Open MPI was installed

Open MPI was installed from the GitHub release tarball. It was configured with this command line:

../configure \
        --prefix="${prefix_dir}" \
        --without-psm2 \
        --without-ofi \
        --with-lustre \
        --with-slurm \
        --with-pmix \
        --with-ucx="${UCX_DIR}" \
        --with-cuda="${CUDA_ROOT}" \
        --with-cuda-libdir="${CUDA_ROOT}/lib64/stubs" \
        --enable-mca-dso=btl-smcuda,rcache-rgpusm,rcache-gpusm,accelerator-cuda,coll-cuda

Note that I added coll-cuda to the list of MCA DSOs. I'm not sure whether it is intentionally missing from the documentation. I first tried without coll-cuda, with the same outcome.

CUDA Toolkit version 12.3 was installed in CUDA_ROOT. UCX was built against that CUDA toolkit. On cluster nodes with the drivers installed, ucx_info -d reports the relevant CUDA and gdrcopy transports.

Remark: The host used for compilation has the CUDA toolkit and runtime installed, but not the driver, so using the stubs appears to be the way to go in that case (see #12264).

Please describe the system on which you are running


Details of the problem

With Open MPI 4.1.4, I was able to build it such that one could compile and run binaries without having the CUDA toolkit, runtime, and drivers available on the node in use. However, with 5.0.1 configured as shown above, the linker warns about a missing libcudart when building a binary (even a basic MPI_Init/MPI_Finalize program):

#include <stdio.h>
#include <stdlib.h>

#include "mpi.h"

int main(int argc, char* argv[])
{
        MPI_Init(&argc, &argv);
        MPI_Finalize();

        return EXIT_SUCCESS;
}

$ mpicc -show hw.c -o hw
gcc hw.c -o hw -I/path/to/openmpi/include -pthread -L/path/to/openmpi/lib -Wl,-rpath -Wl,/path/to/openmpi/lib -Wl,--enable-new-dtags -lmpi
$ mpicc hw.c -o hw
/usr/bin/ld: warning: libcudart.so.12, needed by /path/to/openmpi/lib/libmpi.so, not found (try using -rpath or -rpath-link)
$ mpirun -n1 ./hw
./hw: error while loading shared libraries: libcudart.so.12: cannot open shared object file: No such file or directory
$ ldd hw
        linux-vdso.so.1 (0x00007ffc747da000)
        libmpi.so.40 => /path/to/openmpi/lib/libmpi.so.40 (0x000014ae23df9000)
        [...]
        libcudart.so.12 => not found

With 4.1.4, I am able to compile and launch without those warnings/errors while still having a CUDA-aware MPI. In 4.1.4, libmpi did not depend on libcudart, even though it was configured using --with-cuda=....

If I understood the SC'23 BoF slides correctly, with 5.x Open MPI intends to integrate (link?) plugins directly into libmpi. With the --enable-mca-dso configure option, I tried to put all CUDA-related components into DSOs and thus keep them out of libmpi. Nevertheless, libmpi has libcudart as a shared library dependency (see above). I also checked the symbols that libmpi needs, and it does not appear to require anything from libcudart:

$ nm -D /path/to/openmpi/lib/libmpi.so.40 | grep -i cuda
000000000029cdb0 T mca_pml_ob1_rdma_cuda_btls
00000000002c7e20 T MPIX_Query_cuda_support
                 U opal_built_with_cuda_support
                 U opal_cuda_support

So it appears to me that libmpi unnecessarily depends on libcudart. Is there a bug in the configure/compilation process, or is it no longer possible to build the Open MPI libraries such that one can compile applications without the CUDA runtime libraries being available? Given libmpi's dependency on libcudart, the statement from the documentation

Open MPI supports building with CUDA libraries and running on systems without CUDA libraries or hardware.

does not appear to apply here. Or is there something wrong on my side?
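To confirm which libraries libmpi itself requests (ldd also lists dependencies pulled in by other libraries), the DT_NEEDED entries of the ELF dynamic section can be inspected. The following is a minimal sketch, assuming `readelf` from binutils is available; the helper name `needed_libs` is hypothetical, and the path in the usage comment is the placeholder from the report above.

```shell
# Print the shared libraries named in DT_NEEDED entries.
# Reads the output of `readelf -d <file>` on stdin.
# needed_libs is a hypothetical helper name, not part of Open MPI.
needed_libs() {
    sed -n 's/.*(NEEDED).*\[\(.*\)\].*/\1/p'
}

# Usage (placeholder path from the report above):
#   readelf -d /path/to/openmpi/lib/libmpi.so.40 | needed_libs | grep -i cuda
```

If the last command prints nothing, libmpi does not directly request any CUDA library.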

Btw: The test program from the documentation may also deserve a call to MPI_Init in case one follows the DSO approach. Otherwise, it reports that there is no CUDA support (using OMPI v5.0.1 with CUDA toolkit 12.3 available for compilation/execution):

$ ./check  # with MPI_Init
Compile time check:
This MPI library has CUDA-aware support.
Run time check:
This MPI library has CUDA-aware support.
$ ./check-no-init # without MPI_Init
Compile time check:
This MPI library has CUDA-aware support.
Run time check:
This MPI library does not have CUDA-aware support.

janjust commented 9 months ago

@christgau I think you're slurping in another component when configuring/building OMPI. In my case, I also had to build the io-romio341 component as a DSO.

--enable-mca-dso=btl-smcuda,rcache-rgpusm,rcache-gpusm,accelerator-cuda,coll-cuda,io-romio341

Can you try it, please?

christgau commented 9 months ago

@janjust Thanks for your input. I can confirm that adding io-romio341 to the list of MCA DSOs removes the dependency on libcudart from libmpi.

I'm not sure how obvious this is to others, so I suggest adding the full list (see above) to the documentation.
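As a rough aid for assembling such a list, DSO file names of the form mca_<framework>_<component>.so can be mapped to the <framework>-<component> entries that --enable-mca-dso expects. This is only a sketch: it assumes a separate build in which components are installed as mca_*.so files (commonly under ${prefix_dir}/lib/openmpi), that framework names contain no underscores, and that `readelf` is available. The helper name `dso_list` is hypothetical.

```shell
# Convert MCA DSO file names (one per line on stdin) into a
# comma-separated list usable with --enable-mca-dso.
# Assumes names of the form mca_<framework>_<component>.so and
# framework names without underscores.
# dso_list is a hypothetical helper, not part of Open MPI.
dso_list() {
    sed -n 's/^mca_\([^_]*\)_\(.*\)\.so$/\1-\2/p' | paste -sd, -
}

# Usage sketch: list components whose DT_NEEDED entries mention CUDA.
#   cd "${prefix_dir}/lib/openmpi"
#   for f in mca_*.so; do
#       readelf -d "$f" | grep -qi 'NEEDED.*cuda' && echo "$f"
#   done | dso_list
```

The result is only a candidate list; it still needs to be checked against the rebuilt libmpi.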

Besides that, with Open MPI built like that, the check code falsely reports that CUDA support is missing when MPI_Init is not called, even on a node with the CUDA runtime/driver installed. With the MPI_Init call added, everything works as expected:

non-gpu-node $ ./check-with-init
Compile time check:
This MPI library has CUDA-aware support.
[non-gpu-node:2426494] mca_base_component_repository_open: unable to open mca_accelerator_cuda: libcuda.so.1: cannot open shared object file: No such file or directory (ignored)
[non-gpu-node:2426494] mca_base_component_repository_open: unable to open mca_rcache_rgpusm: libcuda.so.1: cannot open shared object file: No such file or directory (ignored)
[non-gpu-node:2426494] mca_base_component_repository_open: unable to open mca_rcache_gpusm: libcuda.so.1: cannot open shared object file: No such file or directory (ignored)
[non-gpu-node:2426494] mca_base_component_repository_open: unable to open mca_btl_smcuda: libcuda.so.1: cannot open shared object file: No such file or directory (ignored)
Run time check:
This MPI library does not have CUDA-aware support.

gpu-node $ ./check-with-init
Compile time check:
This MPI library has CUDA-aware support.
Run time check:
This MPI library has CUDA-aware support.

gpu-node $ ./check-no-init
Compile time check:
This MPI library has CUDA-aware support.
Run time check:
This MPI library does not have CUDA-aware support.

Maybe the docs should be updated accordingly as well.

janjust commented 9 months ago

I agree. In the meantime, I'll turn this into a feature request issue.