LLNL / RAJA

RAJA Performance Portability Layer (C++)
BSD 3-Clause "New" or "Revised" License

OpenMP+CUDA leads to "nvlink fatal : unexpected object after cudadevrt" #672

Closed. jeffhammond closed this issue 3 years ago

jeffhammond commented 5 years ago

I'm trying to build RAJA on a POWER9 + V100 system with OpenMP host and target plus CUDA enabled at the same time.

It works fine when I disable OpenMP host and target.

I am using the develop branch updated this morning.

Compiler Versions

NVCC

$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Wed_Apr_24_19:12:21_PDT_2019
Cuda compilation tools, release 10.1, V10.1.168

XLC

$ xlc++_r -qversion=verbose
IBM XL C/C++ for Linux, V16.1.1 (Community Edition)
Version: 16.01.0001.0003
Driver Version: 16.1.1(C/C++) Level: 190404 ID: _RXbulUR3EemRhIlXaqgrRQ
C/C++ Front End Version: 16.1.1(C/C++) Level: 190404 ID: _5zS_JFEJEemRhYlXaqgrRQ
High-Level Optimizer Version: 16.1.1(C/C++) and 16.1.1(Fortran) Level: 190404 ID: _8qy2aFcMEemRholXaqgrRQ
Low-Level Optimizer Version: 16.1.1(C/C++) and 16.1.1(Fortran) Level: 190404 ID: _AO0twiBaEemAt6l22ZCkwQ
Intermediate Language Splitter Version: 16.1.1(C/C++) and 16.1.1(Fortran) Level 190404 ID: _YjjuMKTuEeitLMuu6VxByg
W-Code to LLVM-IR Translator: 16.1.1(C/C++) and 16.1.1(Fortran) Level 190404 ID: _k40U8k93EemRhYlXaqgrRQ
NVVM-IR to PTX Translator: 16.1.1(C/C++) and 16.1.1(Fortran) Level 190404 ID: _aZtcAtk6EeiLR71RxbOxBQ
/opt/ibm/xlC/16.1.1/bin/.orig/xlc++_r: note: XL C/C++ Community Edition is a no-charge product and does not include official IBM support. You can provide feedback at the XL on POWER C/C++ Community Edition forum (http://ibm.biz/xlcpp-linux-ce). For information about a fully supported XL C/C++ compiler, visit XL C/C++ for Linux (http://ibm.biz/xlcpp-linux).

CMake Invocation

$ cmake .. -DRAJA_ENABLE_CUDA=True  -DCMAKE_INSTALL_PREFIX=/home/jrhammon/RAJA/install-cuda  -DCMAKE_CXX_COMPILER=xlc++_r -DCMAKE_C_COMPILER=xlc_r -DENABLE_OPENMP=On -DENABLE_TARGET_OPENMP=On -DOpenMP_CXX_FLAGS="-qsmp -qoffload" -DENABLE_CUDA=On -DCUDA_ARCH=sm_70

NVCC Link Error

[  5%] Linking CUDA executable ../../../tests/blt_cuda_openmp_smoke
cd /home/jrhammon/RAJA/git/build-cmake/blt/tests/smoke && /opt/crtdc/cmake/3.14.5-xlc/bin/cmake -E cmake_link_script CMakeFiles/blt_cuda_openmp_smoke.dir/link.txt --verbose=1
/usr/bin/xlc++_r -qsmp -qoffload CMakeFiles/blt_cuda_openmp_smoke.dir/blt_cuda_openmp_smoke.cpp.o CMakeFiles/blt_cuda_openmp_smoke.dir/cmake_device_link.o -o ../../../tests/blt_cuda_openmp_smoke /usr/local/cuda-10.1/lib64/libcudart_static.a -ldl /usr/lib64/librt.so  -L"/usr/local/cuda-10.1/targets/ppc64le-linux/lib/stubs" -L"/usr/local/cuda-10.1/targets/ppc64le-linux/lib" -lcudadevrt -lcudart_static -lrt -lpthread -ldl
nvlink fatal   : unexpected object after cudadevrt
make[2]: *** [tests/blt_cuda_openmp_smoke] Error 255

jeffhammond commented 5 years ago

If CUDA and OpenMP are not mutually supported with some toolchain, CMake should detect the conflict and inform the user.
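
A check like the one requested here would normally live in the project's CMake files. As a rough illustration only, the sketch below does the same test in Python on a cmake command line. It is not RAJA or BLT code: the helper names `cache_definitions` and `cuda_with_target_openmp` are hypothetical, and treating `ENABLE_CUDA` plus `ENABLE_TARGET_OPENMP` as a conflict is an assumption, since the thread does not establish that the combination is unsupported.

```python
import shlex

# Named constants that CMake's if() treats as true (case-insensitive).
# Other non-zero numbers are also true in CMake but are ignored here.
CMAKE_TRUE = {"1", "ON", "YES", "TRUE", "Y"}


def cache_definitions(cmake_command):
    """Collect -DNAME=VALUE (or -DNAME:TYPE=VALUE) options from a command line."""
    definitions = {}
    for arg in shlex.split(cmake_command):
        if arg.startswith("-D") and "=" in arg:
            name, value = arg[2:].split("=", 1)
            definitions[name.split(":", 1)[0]] = value
    return definitions


def cuda_with_target_openmp(cmake_command):
    """Hypothetical check: True when both options are switched on."""
    defs = cache_definitions(cmake_command)
    return all(
        defs.get(name, "").upper() in CMAKE_TRUE
        for name in ("ENABLE_CUDA", "ENABLE_TARGET_OPENMP")
    )


# Shortened form of the invocation from the original report.
command = (
    "cmake .. -DCMAKE_CXX_COMPILER=xlc++_r -DENABLE_OPENMP=On "
    '-DENABLE_TARGET_OPENMP=On -DOpenMP_CXX_FLAGS="-qsmp -qoffload" '
    "-DENABLE_CUDA=On -DCUDA_ARCH=sm_70"
)
if cuda_with_target_openmp(command):
    print("warning: CUDA and OpenMP target offload are both enabled")
```

Whether this should be a warning or a hard error, and for which toolchains, is a decision for the maintainers; the sketch only shows that the combination is easy to detect at configure time.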

jeffhammond commented 4 years ago

I'm still seeing this on LASSEN.

mclarsen commented 4 years ago

Figure this out?

jeffhammond commented 4 years ago

Not that I remember but October 2019 was a century ago.

mclarsen commented 4 years ago

I think I identified the issue. The link line had /usr/tce/packages/cuda/cuda-10.1.243/lib64/libcudadevrt.a on it, and it appears that nvcc is complaining about several more -lsomethings after it. Since this was a make-based build system, it was fairly easy to hack myself to victory. Your mileage may vary.

This was also on Lassen.
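
The diagnosis above can be checked mechanically. The following is a minimal Python sketch, not part of RAJA or BLT, that lists whatever follows the first cudadevrt entry on a link line. The function name `args_after_cudadevrt` is hypothetical, and matching on the substring "cudadevrt" is an assumption chosen to cover both `-lcudadevrt` and a full path to `libcudadevrt.a`.

```python
import shlex


def args_after_cudadevrt(link_line):
    """Return the link-line arguments that follow the first cudadevrt entry.

    Hypothetical helper: it matches any argument containing "cudadevrt",
    which covers both -lcudadevrt and a full path to libcudadevrt.a.
    An empty list means cudadevrt is absent or already last.
    """
    args = shlex.split(link_line)
    for index, arg in enumerate(args):
        if "cudadevrt" in arg:
            return args[index + 1:]
    return []


# Tail of the failing link line quoted earlier in this issue.
link_line = (
    '-L"/usr/local/cuda-10.1/targets/ppc64le-linux/lib" '
    "-lcudadevrt -lcudart_static -lrt -lpthread -ldl"
)
print(args_after_cudadevrt(link_line))
```

For the link line in the original report this lists the four libraries that follow `-lcudadevrt`. The sketch only reports the order; whether reordering is the right fix depends on the build system in use.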

DavidPoliakoff commented 4 years ago

Clearly what Jeff is suggesting is that he wants RAJA to have an nvcc_wrapper, he really loves portability models that add those

Thanks,
David

jeffhammond commented 4 years ago

have an nvcc_wrapper, he really loves portability models that add those

I would rather have a CMake option to compile CUDA Clang from source so I could use a real CUDA compiler that respects the value of human dignity.

mclarsen commented 4 years ago

nvlink fatal   : we can't handle any libraries after cudadevrt. Please use nvlink better.