Open jordiblasco opened 5 years ago
@akesandgren Thoughts on this one?
The CUDA problem above occurs because you haven't installed the runtime libraries for CUDA; they come from the NVIDIA packages installed through the OS. On Ubuntu the package is called "libcuda1-418", or whatever number matches the NVIDIA driver currently in use.
In other words, to run the tests when building FFTW with gompic, you either need the CUDA runtime libraries installed or you have to live with the above warning message from OpenMPI. Alternatively, explicitly set OMPI_MCA_mpi_cuda_support=0 in the environment before building.
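A minimal sketch of the environment workaround, assuming a POSIX shell and that the build is started from the same session. The `ldconfig` check is only an illustrative way to see whether a `libcuda` library is known to the dynamic linker; it is not something the easyconfig does.

```shell
# Tell Open MPI not to initialise its CUDA support in this session.
export OMPI_MCA_mpi_cuda_support=0

# Optional check (an assumption, not part of the build): is libcuda
# registered with the dynamic linker on this host?
if command -v ldconfig >/dev/null 2>&1 && ldconfig -p 2>/dev/null | grep -q 'libcuda\.so'; then
    echo "libcuda found by the dynamic linker"
else
    echo "libcuda not found; CUDA support stays disabled via OMPI_MCA_mpi_cuda_support"
fi
```

Start the FFTW build from this shell afterwards so the variable is inherited by the test step.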
As for the OpenMPI problem: yes, if you use the internal PMIx bundled with OpenMPI, you may run into that bug.
We (at HPC2N) always use an external PMIx to have better control of which version is in use, the same goes for UCX.
The tests of FFTW-3.3.8-gompic-2018b.eb cannot find libcuda when FFTW is built with a CUDA-aware OpenMPI. Tested on CentOS Linux release 7.6.1810.
It seems related to the following two issues:
I guess that mpirun needs -x LD_LIBRARY_PATH, because without it it tries to find the libraries in /usr/lib64.
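For illustration, forwarding the variable with `-x` could look like the sketch below. The binary name `./my_mpi_test` and the process count are hypothetical placeholders, not the names used by the FFTW test suite; the command is only executed if both `mpirun` and the binary are present.

```shell
# Hypothetical test program; substitute the real MPI test binary.
TEST_BIN="${TEST_BIN:-./my_mpi_test}"

# -x exports the named environment variable to the launched processes.
MPI_CMD="mpirun -np 2 -x LD_LIBRARY_PATH $TEST_BIN"

if command -v mpirun >/dev/null 2>&1 && [ -x "$TEST_BIN" ]; then
    $MPI_CMD
else
    echo "would run: $MPI_CMD"
fi
```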
Also, this particular release of OpenMPI is affected by this bug: https://github.com/open-mpi/ompi/issues/5336