This PR fixes two issues.

When using cuda_dep = dependency('cuda') with another language (such as C or C++), the dependency lacked the correct -L<CUDA_ROOT>/lib flag, which led to link errors such as:
/usr/lib/gcc/x86_64-pc-linux-gnu/12/../../../../x86_64-pc-linux-gnu/bin/ld: cannot find -lcudart_static: No such file or directory
/usr/lib/gcc/x86_64-pc-linux-gnu/12/../../../../x86_64-pc-linux-gnu/bin/ld: cannot find -lcublas: No such file or directory
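For reference, a minimal sketch of the kind of meson.build that hits this case. The project name, target name, and main.cpp are hypothetical, and the cublas module is assumed only because it appears in the log above:

```meson
# Hypothetical C++-only project that links against the CUDA toolkit.
project('cuda-dep-demo', 'cpp')

# The CUDA dependency used from a non-CUDA language; 'cublas' is an assumed module.
cuda_dep = dependency('cuda', modules : ['cublas'])

# The C++ linker needs -L<CUDA_ROOT>/lib to resolve the CUDA runtime and -lcublas.
executable('demo', 'main.cpp', dependencies : cuda_dep)
```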
Currently, Meson cannot produce valid CUDA binaries when multiple .cu files rely on separate compilation and device linking. Device linking means that the final SASS/machine code is generated at link time, because no single .cu file contains all the code needed to build the final kernel.

The fix is somewhat painful, because we cannot pass static libraries to nvcc using -Xlinker, and avoiding that requires more conditional code. Passing a library as -Xlinker=libfoo.a would hand libfoo.a straight to the host linker, so nvcc would never get the chance to device-link it. Unfortunately, removing -Xlinker= triggers the bug mentioned in https://github.com/mesonbuild/meson/issues/9479#issuecomment-953485040, which in turn requires conditionally disabling thin archives.
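As a rough illustration of the layout this affects, here is a sketch of a CUDA project in which device code is split between a static library and an executable. All file and target names are hypothetical, and passing -rdc=true through cuda_args is an assumption about how a project would request relocatable device code; other flags may be needed depending on the Meson and CUDA versions:

```meson
# Hypothetical project with device code spread over several .cu files.
project('rdc-demo', 'cuda')

# Assumed flag: ask nvcc for relocatable device code (separate compilation).
rdc_args = ['-rdc=true']

# Device functions live in a static library ...
helper = static_library('helper', 'helper.cu', cuda_args : rdc_args)

# ... and are called from a kernel in the executable, so nvcc has to see
# libhelper.a itself at link time in order to device-link it.
executable('demo', 'kernel.cu', link_with : helper, cuda_args : rdc_args)
```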