When Cray compiler wrappers are used, CMake delegates MPI inclusion to the wrapper and does not propagate explicit include paths or flags to the compilers. This is a problem for nvcc, which is not invoked through the wrapper and so never receives the MPI header include paths. The result is a fatal error when compiling CUDA objects.
The solution we came up with in GauXC was to insulate CUDA kernels from MPI headers. This is generally possible because CUDA API functions / GPU-direct MPI bindings are accessible via the CUDA::cudart target and can be used from pure C++ code without being compiled by nvcc.
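As a rough illustration of that insulation, the CMake fragment below keeps the nvcc-compiled kernels in one target that never sees MPI, and puts all MPI and CUDA runtime calls in a separate C++ target. This is a minimal sketch, not GauXC's actual build: the project name, target names (device_kernels, host_driver), and file paths are hypothetical, and it assumes CMake 3.17 or newer so that FindCUDAToolkit provides CUDA::cudart.

```cmake
cmake_minimum_required(VERSION 3.17)  # FindCUDAToolkit (CUDA::cudart) requires 3.17+
project(mpi_cuda_insulation LANGUAGES CXX CUDA)  # hypothetical project name

find_package(MPI REQUIRED COMPONENTS CXX)
find_package(CUDAToolkit REQUIRED)

# Kernels only: compiled by nvcc. These sources must not include <mpi.h>,
# and this target does not link any MPI target.
add_library(device_kernels STATIC src/kernels.cu)  # hypothetical file
target_include_directories(device_kernels PUBLIC include)

# Host code: compiled by the C++ compiler (e.g. the Cray wrapper), so it may
# include both <mpi.h> and <cuda_runtime.h> without involving nvcc.
add_library(host_driver STATIC src/driver.cxx)  # hypothetical file
target_link_libraries(host_driver
  PUBLIC device_kernels MPI::MPI_CXX CUDA::cudart)
```

The key design point is that MPI is attached only to the C++ target, so whether or not CMake propagates MPI include paths, nvcc never needs them. Headers shared between the two targets would likewise have to avoid including MPI headers.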
Unfortunately, there is no robust way to convince CMake to pass the MPI includes / flags in this situation (MPI_ASSUME_NO_BUILTIN_MPI and MPI_SKIP_COMPILER_WRAPPER don't always work as expected).