Open robert-mijakovic opened 2 years ago
Yes, NVSHMEM's UCX transport is not supported in QUDA yet. QUDA uses CMake and NVSHMEM doesn't, so propagation of usage requirements is limited.
You should be able to specify additional linker flags using `CMAKE_EXE_LINKER_FLAGS`.
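As a rough sketch of that workaround, something like the following should pass the UCX libraries to the linker. The `/opt/ucx` default for `UCX_HOME` is only a placeholder, and the rest of the configure line (your usual QUDA options and the source directory) is left for you to fill in. The script only prints the command so it can be reviewed before running.

```shell
#!/bin/sh
# Assumption: UCX_HOME is the UCX install prefix; /opt/ucx is a hypothetical default.
UCX_HOME="${UCX_HOME:-/opt/ucx}"

# The same libraries the linker reports as missing (ucp_* symbols come from libucp).
UCX_LINK_FLAGS="-L${UCX_HOME}/lib -lucs -lucp"

# Print the configure command instead of running it; add your usual QUDA
# options and the path to the QUDA source tree in place of the placeholders.
printf 'cmake -DCMAKE_EXE_LINKER_FLAGS="%s" <other QUDA options> <quda-source-dir>\n' "${UCX_LINK_FLAGS}"
```

If QUDA is built as a shared library, the same flags may also be needed in `CMAKE_SHARED_LINKER_FLAGS`; I have not verified that for this setup.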
Thank you for the workaround. I have tested it and it worked well.
I'm compiling QUDA 1.1.0 using GCC 10.3.0, CUDA 11.3.1, OpenMPI, (external) Eigen 3.3.9, and (external) NVSHMEM 2.4.1 on CentOS 8.4. The build is configured with:
The build fails in the linking phase with undefined references to ucp_* symbols. NVSHMEM is compiled with the UCX transport layer.
NVSHMEM is built with:
The issue is that QUDA doesn't link against UCX, i.e., `-L$(UCX_HOME)/lib -lucs -lucp`. Looking into NVSHMEM's `common.mk`, I see that NVIDIA's intention is that codes using NVSHMEM should link against UCX themselves, i.e., they expect QUDA to link against it. I would add the flags myself, but QUDA's CMakeLists.txt doesn't provide such an option.