ggerganov / llama.cpp

LLM inference in C/C++
MIT License

Compilation issue for CUDA #6350

Closed: freeone3000 closed this issue 7 months ago

freeone3000 commented 7 months ago

System Information

uname -a: Linux mumei 6.1.0-17-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.1.69-1 (2023-12-30) x86_64 GNU/Linux
gcc --version: gcc (Debian 12.2.0-14) 12.2.0
g++ --version: g++ (Debian 12.2.0-14) 12.2.0
nvcc --version:

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2022 NVIDIA Corporation
Built on Wed_Sep_21_10:33:58_PDT_2022
Cuda compilation tools, release 11.8, V11.8.89
Build cuda_11.8.r11.8/compiler.31833905_0

Cards: 2x NVIDIA GeForce RTX 3060

nvidia-smi:

jasmine@mumei ~ $ nvidia-smi
Wed Mar 27 12:59:53 2024
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.147.05   Driver Version: 525.147.05   CUDA Version: 12.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  On   | 00000000:01:00.0 Off |                  N/A |
|  0%   54C    P8    16W / 170W |   7343MiB / 12288MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   1  NVIDIA GeForce ...  On   | 00000000:09:00.0 Off |                  N/A |
|  0%   54C    P8    11W / 170W |      3MiB / 12288MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A     19139      C   python3                          7340MiB |
+-----------------------------------------------------------------------------+

llama.cpp: tags/b2548

Compile Steps

I followed the steps at https://github.com/ggerganov/llama.cpp/blob/master/README.md#cuda .

jasmine@mumei ~/llm/llama.cpp (tags/b2548) $ mkdir build && cd build
jasmine@mumei ~/llm/llama.cpp/build (tags/b2548) $ cmake .. -D LLAMA_CUDA=ON
-- The C compiler identification is GNU 12.2.0
-- The CXX compiler identification is GNU 12.2.0
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /usr/bin/cc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /usr/bin/c++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Found Git: /usr/bin/git (found version "2.39.2")
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Success
-- Found Threads: TRUE
-- Found CUDAToolkit: /usr/include (found version "11.8.89")
-- CUDA found
CMake Error at /usr/share/cmake-3.25/Modules/CMakeDetermineCompilerId.cmake:739 (message):
  Compiling the CUDA compiler identification source file
  "CMakeCUDACompilerId.cu" failed.

  Compiler: /usr/bin/nvcc

  Build flags:

  Id flags: --keep;--keep-dir;tmp -v

  The output was:

  1

  #$ _NVVM_BRANCH_=nvvm

  #$ _SPACE_=

  #$ _CUDART_=cudart

  #$ _HERE_=/usr/lib/nvidia-cuda-toolkit/bin

  #$ _THERE_=/usr/lib/nvidia-cuda-toolkit/bin

  #$ _TARGET_SIZE_=

  #$ _TARGET_DIR_=

  #$ _TARGET_SIZE_=64

  #$ NVVMIR_LIBRARY_DIR=/usr/lib/nvidia-cuda-toolkit/libdevice

  #$
  PATH=/usr/lib/nvidia-cuda-toolkit/bin:/home/jasmine/.cargo/bin:/usr/local/bin:/usr/bin:/bin:/usr/games:/usr/local/go/bin
  #$ LIBRARIES= -L/usr/lib/x86_64-linux-gnu/stubs -L/usr/lib/x86_64-linux-gnu

  #$ rm tmp/a_dlink.reg.c

  #$ gcc -D__CUDA_ARCH__=520 -D__CUDA_ARCH_LIST__=520 -E -x c++
  -DCUDA_DOUBLE_MATH_FUNCTIONS -D__CUDACC__ -D__NVCC__
  -D__CUDACC_VER_MAJOR__=11 -D__CUDACC_VER_MINOR__=8
  -D__CUDACC_VER_BUILD__=89 -D__CUDA_API_VER_MAJOR__=11
  -D__CUDA_API_VER_MINOR__=8 -D__NVCC_DIAG_PRAGMA_SUPPORT__=1 -include
  "cuda_runtime.h" -m64 "CMakeCUDACompilerId.cu" -o
  "tmp/CMakeCUDACompilerId.cpp1.ii"

  #$ cicc --c++17 --gnu_version=120200 --display_error_number
  --orig_src_file_name "CMakeCUDACompilerId.cu" --orig_src_path_name
  "/home/jasmine/llm/llama.cpp/build/CMakeFiles/3.25.1/CompilerIdCUDA/CMakeCUDACompilerId.cu"
  --allow_managed -arch compute_52 -m64 --no-version-ident -ftz=0 -prec_div=1
  -prec_sqrt=1 -fmad=1 --include_file_name "CMakeCUDACompilerId.fatbin.c"
  -tused --gen_module_id_file --module_id_file_name
  "tmp/CMakeCUDACompilerId.module_id" --gen_c_file_name
  "tmp/CMakeCUDACompilerId.cudafe1.c" --stub_file_name
  "tmp/CMakeCUDACompilerId.cudafe1.stub.c" --gen_device_file_name
  "tmp/CMakeCUDACompilerId.cudafe1.gpu" "tmp/CMakeCUDACompilerId.cpp1.ii" -o
  "tmp/CMakeCUDACompilerId.ptx"

  #$ ptxas -arch=sm_52 -m64 "tmp/CMakeCUDACompilerId.ptx" -o
  "tmp/CMakeCUDACompilerId.sm_52.cubin"

  #$ fatbinary --create="tmp/CMakeCUDACompilerId.fatbin" -64
  --cicc-cmdline="-ftz=0 -prec_div=1 -prec_sqrt=1 -fmad=1 "
  "--image3=kind=elf,sm=52,file=tmp/CMakeCUDACompilerId.sm_52.cubin"
  "--image3=kind=ptx,sm=52,file=tmp/CMakeCUDACompilerId.ptx"
  --embedded-fatbin="tmp/CMakeCUDACompilerId.fatbin.c"

  #$ gcc -D__CUDA_ARCH_LIST__=520 -E -x c++ -D__CUDACC__ -D__NVCC__
  -D__CUDACC_VER_MAJOR__=11 -D__CUDACC_VER_MINOR__=8
  -D__CUDACC_VER_BUILD__=89 -D__CUDA_API_VER_MAJOR__=11
  -D__CUDA_API_VER_MINOR__=8 -D__NVCC_DIAG_PRAGMA_SUPPORT__=1 -include
  "cuda_runtime.h" -m64 "CMakeCUDACompilerId.cu" -o
  "tmp/CMakeCUDACompilerId.cpp4.ii"

  #$ cudafe++ --c++17 --gnu_version=120200 --display_error_number
  --orig_src_file_name "CMakeCUDACompilerId.cu" --orig_src_path_name
  "/home/jasmine/llm/llama.cpp/build/CMakeFiles/3.25.1/CompilerIdCUDA/CMakeCUDACompilerId.cu"
  --allow_managed --m64 --parse_templates --gen_c_file_name
  "tmp/CMakeCUDACompilerId.cudafe1.cpp" --stub_file_name
  "CMakeCUDACompilerId.cudafe1.stub.c" --module_id_file_name
  "tmp/CMakeCUDACompilerId.module_id" "tmp/CMakeCUDACompilerId.cpp4.ii"

  #$ gcc -D__CUDA_ARCH__=520 -D__CUDA_ARCH_LIST__=520 -c -x c++
  -DCUDA_DOUBLE_MATH_FUNCTIONS -m64 "tmp/CMakeCUDACompilerId.cudafe1.cpp" -o
  "tmp/CMakeCUDACompilerId.o"

  /usr/include/c++/10/type_traits:71:52: error: redefinition of ‘constexpr
  const _Tp std::integral_constant<_Tp, __v>::value’

     71 |   template<typename _Tp, _Tp __v>
        |                                                    ^

  /usr/include/c++/10/type_traits:59:29: note: ‘constexpr const _Tp
  value’ previously declared here

     59 |       static constexpr _Tp                  value = __v;
        |                             ^~~~~

  # --error 0x1 --

Call Stack (most recent call first):
  /usr/share/cmake-3.25/Modules/CMakeDetermineCompilerId.cmake:6 (CMAKE_DETERMINE_COMPILER_ID_BUILD)
  /usr/share/cmake-3.25/Modules/CMakeDetermineCompilerId.cmake:48 (__determine_compiler_id_test)
  /usr/share/cmake-3.25/Modules/CMakeDetermineCUDACompiler.cmake:307 (CMAKE_DETERMINE_COMPILER_ID)
  CMakeLists.txt:374 (enable_language)

-- Configuring incomplete, errors occurred!
See also "/home/jasmine/llm/llama.cpp/build/CMakeFiles/CMakeOutput.log".
See also "/home/jasmine/llm/llama.cpp/build/CMakeFiles/CMakeError.log".

CMakeOutput.log is attached.

CMakeError.log is empty.

Expected Results

I expected compilation to succeed. What is the issue with this system, environment, or build steps?

slaren commented 7 months ago

It looks like cmake is failing to find the installation location of your CUDA toolkit. You can find more information about how this is done, and how to specify a location manually, at https://cmake.org/cmake/help/latest/module/FindCUDAToolkit.html. Alternatively, following NVIDIA's CUDA toolkit installation instructions should work as expected.
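For example, the toolkit root and the CUDA compiler can be passed explicitly on the configure line. The path below is an assumption (a typical location when using NVIDIA's own packages), so adjust it to wherever the toolkit is actually installed:

cmake .. -DLLAMA_CUDA=ON \
  -DCUDAToolkit_ROOT=/usr/local/cuda-11.8 \
  -DCMAKE_CUDA_COMPILER=/usr/local/cuda-11.8/bin/nvcc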

freeone3000 commented 7 months ago

I do not think this is correct. The CMake output contains "Found CUDAToolkit: /usr/include (found version "11.8.89")" from CMake's FindCUDAToolkit module. nvidia-smi works, and the NVIDIA samples from the CUDA toolkit compile correctly; the CUDA toolkit is installed to /usr/include, as listed in the CMake output above.
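To take CMake out of the picture, a trivial standalone compile can show whether the default nvcc and host gcc pairing works on this system. This is only an illustrative sketch; the file name hello.cu and its contents are not from the original report:

// hello.cu: minimal check of the nvcc / host compiler pairing (illustrative only)
// build and run with:  nvcc -o hello hello.cu && ./hello
#include <cstdio>

__global__ void hello() { printf("hello from the GPU\n"); }

int main() {
    hello<<<1, 1>>>();        // launch a single thread
    cudaDeviceSynchronize();  // wait for the kernel and flush device-side printf
    return 0;
}

If this fails with the same type_traits redefinition error, the problem would lie in the nvcc and host compiler combination (the log above shows nvcc 11.8 driving gcc 12.2 as the host compiler) rather than in llama.cpp's CMake configuration.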