iree-org / iree

A retargetable MLIR-based machine learning compiler and runtime toolkit.
http://iree.dev/
Apache License 2.0

Building `samples/custom_dispatch/cuda/kernels` failed with 'error: cannot find CUDA installation' #17815

Open qzylalala opened 2 months ago

qzylalala commented 2 months ago

What happened?

Hi, I'm new to IREE and want to build it from source following the build documentation. However, when I build with `cmake --build ../iree-build/`, I get an error:

[6/602] Compiling ukernel.cu to cuda_ukernel_ukernel.cu.bc
FAILED: samples/custom_dispatch/cuda/kernels/cuda_ukernel_ukernel.cu.bc /work_space/iree-build/samples/custom_dispatch/cuda/kernels/cuda_ukernel_ukernel.cu.bc 
cd /work_space/iree-build/samples/custom_dispatch/cuda/kernels && /work_space/iree-build/llvm-project/bin/clang-19 -x cuda --cuda-gpu-arch=sm_60 --cuda-path=/usr/local/cuda-12.3/bin -Wno-unknown-cuda-version -nocudalib --cuda-device-only -D_ALLOW_COMPILER_AND_STL_VERSION_MISMATCH -O3 -c -emit-llvm /work_space/iree/samples/custom_dispatch/cuda/kernels/ukernel.cu -o cuda_ukernel_ukernel.cu.bc
clang-19: error: cannot find CUDA installation; provide its path via '--cuda-path', or pass '-nocudainc' to build without CUDA includes
[95/602] Linking CXX static library lib/libiree_compiler_plugins_input_Torch_torch-mlir_TorchDialectIR.a

Strangely, the error says `cannot find CUDA installation; provide its path via '--cuda-path'`, but as you can see, the command already passes `--cuda-path=/usr/local/cuda-12.3/bin`. Any help is welcome, thanks!

Steps to reproduce your issue

Just follow the guide to build from source. Everything works until the build step, `cmake --build ../iree-build/`.

What component(s) does this issue relate to?

Other

Version information

IREE: main branch

Additional context

No response

ScottTodd commented 2 months ago

Haven't seen that error before. Could be some version incompatibility in the CUDA SDK or the local clang build. The logic in https://github.com/iree-org/iree/blob/main/samples/custom_dispatch/cuda/kernels/CMakeLists.txt is also rather complex so there might be bugs in there.

If you want to work past that error, you can either:

benvanik commented 2 months ago

--cuda-path= should not point to bin/, but the root cuda SDK directory

see https://llvm.org/docs/CompileCudaWithLLVM.html

When compiling, you may also need to pass --cuda-path=/path/to/cuda if you didn’t install the CUDA SDK into /usr/local/cuda or /usr/local/cuda-X.Y.

(if all that looks right on your side, you may as scott points out be hitting a version incompatibility between clang and your installed SDK - or not have a full SDK install)
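
A quick way to sanity-check a candidate `--cuda-path` value before rebuilding is to confirm that it looks like an SDK root rather than a subdirectory of one. The sketch below is a minimal shell helper. The function name `looks_like_cuda_root` is hypothetical, and the three subdirectories it checks (`bin`, `include`, `nvvm/libdevice`) are an assumption about what a full SDK install contains, not a copy of clang's actual detection logic.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if "$1" looks like a CUDA SDK root.
# Assumption: a full SDK install has bin/, include/ and nvvm/libdevice/
# directly under its root directory.
looks_like_cuda_root() {
  root="$1"
  for sub in bin include nvvm/libdevice; do
    if [ ! -d "$root/$sub" ]; then
      # Report the first missing piece on stderr and fail.
      echo "missing: $root/$sub" >&2
      return 1
    fi
  done
  return 0
}

# Example (paths taken from the failing command in this issue):
#   looks_like_cuda_root /usr/local/cuda-12.3      # expected to pass on a full install
#   looks_like_cuda_root /usr/local/cuda-12.3/bin  # fails, since bin/bin does not exist
```

Under that assumption, passing the `bin/` directory fails immediately, which matches the symptom in the build log above.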

qzylalala commented 2 months ago

> --cuda-path= should not point to bin/, but the root cuda SDK directory
>
> see https://llvm.org/docs/CompileCudaWithLLVM.html
>
> When compiling, you may also need to pass --cuda-path=/path/to/cuda if you didn’t install the CUDA SDK into /usr/local/cuda or /usr/local/cuda-X.Y.
>
> (if all that looks right on your side, you may as scott points out be hitting a version incompatibility between clang and your installed SDK - or not have a full SDK install)

Hi, thanks for your reply, it helps a lot. I found that the cause of this strange behavior is the CMake version. The CUDA toolkit is configured here: https://github.com/iree-org/iree/blob/3f6bf8c2e8f3c14a229d1a631e2f0f7e6b25cf15/build_tools/third_party/cuda/CMakeLists.txt#L45-L49 I'm using cmake 3.22.1, and I added one `message` command to see what goes wrong.

if(CUDAToolkit_FOUND)
      # Found on the system somewhere, no need to install our own copy.
      message(STATUS "Using found CUDAToolkit_BIN_DIR: ${CUDAToolkit_BIN_DIR}")
      cmake_path(GET CUDAToolkit_BIN_DIR PARENT_PATH CUDAToolkit_ROOT)
      set(CUDAToolkit_ROOT "${CUDAToolkit_ROOT}" PARENT_SCOPE)
      message(STATUS "Using found CUDA toolkit: ${CUDAToolkit_ROOT}")

The output shows that `cmake_path()` doesn't work as expected: the computed root is still the `bin` directory.

-- Using found CUDAToolkit_BIN_DIR: /usr/local/cuda-12.3/bin
-- Using found CUDA toolkit: /usr/local/cuda-12.3/bin

I tried different versions of CMake, and 3.23.3 or newer works. But I don't know why 3.22.1 doesn't, since `cmake_path` with the `PARENT_PATH` option has been supported since CMake 3.20 😂.
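
Until the root cause is understood, a build script could guard against the affected CMake versions before configuring. The following is a minimal sketch. The helper name `version_ge` is hypothetical, the `3.23.3` threshold is only the empirical observation from this thread and not a documented requirement, and `sort -V` is an extension (available in GNU coreutils) rather than POSIX.

```shell
#!/bin/sh
# Hypothetical helper: version_ge A B succeeds when dotted version A >= B.
# Relies on `sort -V` (version sort), which is a non-POSIX extension.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Example usage (assumes `cmake --version` prints "cmake version X.Y.Z"
# on its first line):
#   cmake_version="$(cmake --version | head -n 1 | awk '{print $3}')"
#   if ! version_ge "$cmake_version" 3.23.3; then
#     echo "cmake $cmake_version may compute the wrong CUDA root; see this issue" >&2
#   fi
```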