celsowm opened this issue 10 months ago
My suggestion:

- uninstall `nvidia-cuda-toolkit` on Ubuntu
- install the CUDA 12.3 toolkit from here: https://developer.nvidia.com/cuda-downloads
- build using:

CUDACXX=/usr/local/cuda-12/bin/nvcc CMAKE_ARGS="-DLLAMA_CUBLAS=on -DCMAKE_CUDA_ARCHITECTURES=native" FORCE_CMAKE=1 pip install llama-cpp-python --force-reinstall --upgrade
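Before rebuilding, it can be worth checking that the nvcc you point CUDACXX at really is the standalone 12.x toolkit and not the apt-packaged one. A minimal sketch in Python, with the path taken from the command above:

```python
# Confirm the nvcc used by CUDACXX is the standalone CUDA 12 toolkit.
import subprocess

nvcc = "/usr/local/cuda-12/bin/nvcc"  # path assumed from the build command above
result = subprocess.run([nvcc, "--version"], capture_output=True, text=True)
print(result.stdout)  # expect "release 12.3", not the apt nvidia-cuda-toolkit build
```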
Hi, thanks for your suggestion! Meanwhile I downgraded my gcc 13 to gcc 12 and updated my NVIDIA CUDA driver from 11.8 to 12.0; the result now is:
Building wheel for llama-cpp-python (pyproject.toml) did not run successfully.
│ exit code: 1
╰─> [133 lines of output]
*** scikit-build-core 0.7.1 using CMake 3.28.1 (wheel)
*** Configuring CMake...
loading initial cache file /tmp/tmp624g1wvw/build/CMakeInit.txt
-- The C compiler identification is GNU 12.3.0
-- The CXX compiler identification is GNU 12.3.0
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /usr/bin/gcc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /usr/bin/g++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Found Git: /usr/bin/git (found version "2.40.1")
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Success
-- Found Threads: TRUE
-- Found CUDAToolkit: /usr/include (found version "12.0.140")
-- cuBLAS found
CMake Error at /tmp/pip-build-env-xj0kr9y6/normal/local/lib/python3.11/dist-packages/cmake/data/share/cmake-3.28/Modules/CMakeDetermineCompilerId.cmake:780 (message):
Compiling the CUDA compiler identification source file
"CMakeCUDACompilerId.cu" failed.
Compiler: /usr/bin/nvcc
Build flags:
Id flags: --keep;--keep-dir;tmp -v
The output was:
2
#$ _NVVM_BRANCH_=nvvm
#$ _SPACE_=
#$ _CUDART_=cudart
#$ _HERE_=/usr/lib/nvidia-cuda-toolkit/bin
#$ _THERE_=/usr/lib/nvidia-cuda-toolkit/bin
#$ _TARGET_SIZE_=
#$ _TARGET_DIR_=
#$ _TARGET_SIZE_=64
#$ NVVMIR_LIBRARY_DIR=/usr/lib/nvidia-cuda-toolkit/libdevice
#$
PATH=/usr/lib/nvidia-cuda-toolkit/bin:/tmp/pip-build-env-xj0kr9y6/overlay/local/bin:/tmp/pip-build-env-xj0kr9y6/normal/local/bin:/home/celso/.local/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin
#$ LIBRARIES= -L/usr/lib/x86_64-linux-gnu/stubs -L/usr/lib/x86_64-linux-gnu
#$ rm tmp/a_dlink.reg.c
#$ gcc -D__CUDA_ARCH_LIST__=520 -E -x c++ -D__CUDACC__ -D__NVCC__
-D__CUDACC_VER_MAJOR__=12 -D__CUDACC_VER_MINOR__=0
-D__CUDACC_VER_BUILD__=140 -D__CUDA_API_VER_MAJOR__=12
-D__CUDA_API_VER_MINOR__=0 -D__NVCC_DIAG_PRAGMA_SUPPORT__=1 -include
"cuda_runtime.h" -m64 "CMakeCUDACompilerId.cu" -o
"tmp/CMakeCUDACompilerId.cpp4.ii"
#$ cudafe++ --c++17 --gnu_version=120300 --display_error_number
--orig_src_file_name "CMakeCUDACompilerId.cu" --orig_src_path_name
"/tmp/tmp624g1wvw/build/CMakeFiles/3.28.1/CompilerIdCUDA/CMakeCUDACompilerId.cu"
--allow_managed --m64 --parse_templates --gen_c_file_name
"tmp/CMakeCUDACompilerId.cudafe1.cpp" --stub_file_name
"CMakeCUDACompilerId.cudafe1.stub.c" --gen_module_id_file
--module_id_file_name "tmp/CMakeCUDACompilerId.module_id"
"tmp/CMakeCUDACompilerId.cpp4.ii"
/usr/local/include/cuda_runtime.h(654): error: the global scope has no
"cudaMemAdvise_v2"
/usr/local/include/cuda_runtime.h(666): error: the global scope has no
"cudaMemPrefetchAsync_v2"
/usr/local/include/cuda_runtime.h(2301): error: identifier "cudaKernel_t"
is undefined
/usr/local/include/cuda_runtime.h(2301): error: identifier "kernelPtr" is
undefined
/usr/local/include/cuda_runtime.h(2302): error: expected an expression
/usr/local/include/cuda_runtime.h(2303): error: too many initializer values
/usr/local/include/cuda_runtime.h(2304): error: expected a ";"
CMakeCUDACompilerId.cu(453): error: identifier "info_compiler" is undefined
8 errors detected in the compilation of "CMakeCUDACompilerId.cu".
# --error 0x2 --
Call Stack (most recent call first):
/tmp/pip-build-env-xj0kr9y6/normal/local/lib/python3.11/dist-packages/cmake/data/share/cmake-3.28/Modules/CMakeDetermineCompilerId.cmake:8 (CMAKE_DETERMINE_COMPILER_ID_BUILD)
/tmp/pip-build-env-xj0kr9y6/normal/local/lib/python3.11/dist-packages/cmake/data/share/cmake-3.28/Modules/CMakeDetermineCompilerId.cmake:53 (__determine_compiler_id_test)
/tmp/pip-build-env-xj0kr9y6/normal/local/lib/python3.11/dist-packages/cmake/data/share/cmake-3.28/Modules/CMakeDetermineCUDACompiler.cmake:135 (CMAKE_DETERMINE_COMPILER_ID)
vendor/llama.cpp/CMakeLists.txt:306 (enable_language)
-- Configuring incomplete, errors occurred!
*** CMake configuration failed
[end of output]
@m-from-space your idea worked! (combined with setting gcc-12 as default) Now, how can I test if the lib is using the GPU?
Read the output, use the library, and monitor CUDA utilization. `-DLLAMA_CUBLAS=on` is what enables the GPU on NVIDIA hardware. When the model runs, the output shows where the GPU is being used, e.g. for layer offloading, and you can also confirm it by watching CUDA utilization on the card (for example with nvidia-smi) while a response is generated.
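For a programmatic check, here is a minimal sketch (the model path is a placeholder): load a model with every layer offloaded and read the verbose startup log.

```python
# Sketch: verify the CUDA build is active by offloading layers to the GPU.
from llama_cpp import Llama

llm = Llama(
    model_path="./models/model.gguf",  # placeholder: any local GGUF model
    n_gpu_layers=-1,                   # offload all layers to the GPU
    verbose=True,                      # startup log reports backend/offload info
)

# A working CUDA build logs lines like "offloaded N/N layers to GPU";
# a CPU-only build offloads nothing.
out = llm("Q: What is 2 + 2? A:", max_tokens=8)
print(out["choices"][0]["text"])
```

Running nvidia-smi in another terminal while the call executes will also show the process's GPU memory and utilization.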
Facing the same issue here in Anaconda; trying CUDA > 12 doesn't work. Do you have any ideas on how to fix it?
Building wheels for collected packages: llama-cpp-python
Building wheel for llama-cpp-python (pyproject.toml) ... error
error: subprocess-exited-with-error
× Building wheel for llama-cpp-python (pyproject.toml) did not run successfully.
│ exit code: 1
╰─> [46 lines of output]
*** scikit-build-core 0.7.1 using CMake 3.28.1 (wheel)
*** Configuring CMake...
2024-01-22 17:47:25,115 - scikit_build_core - WARNING - libdir/ldlibrary: /home/cannguyen/miniconda3/envs/env1/lib/libpython3.11.a is not a real file!
2024-01-22 17:47:25,115 - scikit_build_core - WARNING - Can't find a Python library, got libdir=/home/cannguyen/miniconda3/envs/env1/lib, ldlibrary=libpython3.11.a, multiarch=x86_64-linux-gnu, masd=None
loading initial cache file /tmp/tmpgi2n8624/build/CMakeInit.txt
-- The C compiler identification is GNU 13.2.0
-- The CXX compiler identification is GNU 13.2.0
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /usr/bin/cc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /usr/bin/c++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Found Git: /usr/bin/git (found version "2.40.1")
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Success
-- Found Threads: TRUE
-- Unable to find cudart library.
-- Could NOT find CUDAToolkit (missing: CUDA_CUDART) (found version "12.1.105")
CMake Warning at vendor/llama.cpp/CMakeLists.txt:360 (message):
cuBLAS not found
-- CUDA host compiler is GNU
CMake Error at vendor/llama.cpp/CMakeLists.txt:536 (get_flags):
get_flags Function invoked with incorrect arguments for function named:
get_flags
-- CMAKE_SYSTEM_PROCESSOR: x86_64
-- x86 detected
CMake Warning (dev) at CMakeLists.txt:21 (install):
Target llama has PUBLIC_HEADER files but no PUBLIC_HEADER DESTINATION.
This warning is for project developers. Use -Wno-dev to suppress it.
CMake Warning (dev) at CMakeLists.txt:30 (install):
Target llama has PUBLIC_HEADER files but no PUBLIC_HEADER DESTINATION.
This warning is for project developers. Use -Wno-dev to suppress it.
-- Configuring incomplete, errors occurred!
*** CMake configuration failed
[end of output]
note: This error originates from a subprocess, and is likely not a problem with pip.
ERROR: Failed building wheel for llama-cpp-python
Failed to build llama-cpp-python
ERROR: Could not build wheels for llama-cpp-python, which is required to install pyproject.toml-based projects
I installed this library by running this command:
CUDACXX=/home/cannguyen/miniconda3/envs/env1/bin/nvcc CMAKE_ARGS="-DLLAMA_CUBLAS=on -DCMAKE_CUDA_ARCHITECTURES=native" FORCE_CMAKE=1 pip install llama-cpp-python --force-reinstall --upgrade
This is what I have related to CUDA in my conda environment:
> conda list | grep cuda
cuda-cudart 12.1.105 0 nvidia
cuda-cupti 12.1.105 0 nvidia
cuda-libraries 12.1.0 0 nvidia
cuda-nvcc 12.1.105 0 nvidia/label/cuda-12.1.1
cuda-nvrtc 12.1.105 0 nvidia
cuda-nvtx 12.1.105 0 nvidia
cuda-opencl 12.3.101 0 nvidia
cuda-runtime 12.1.0 0 nvidia
pytorch 2.1.1 py3.11_cuda12.1_cudnn8.9.2_0 pytorch
pytorch-cuda 12.1 ha16c6d3_5 pytorch
pytorch-mutex 1.0 cuda pytorch
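Since PyTorch with CUDA 12.1 is already in that environment, a quick sanity check (a sketch, not from the thread) is to confirm the CUDA runtime is usable from Python at all before debugging the build:

```python
# Sanity-check the conda env's CUDA runtime via the existing PyTorch install.
import torch

print(torch.version.cuda)          # CUDA version PyTorch was built against
print(torch.cuda.is_available())   # True only if the device and driver are usable
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))
```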
Hello, regarding the issue above, I got past it with:
conda install nvidia::cuda-nvcc
conda install nvidia::cuda-toolkit
conda install gcc=12 -c conda-forge
conda install -c conda-forge gxx_linux-64
After that, the installation gets stuck at:
[18/23] : && /usr/bin/g++ -fPIC -Wl,-O2 -Wl,--sort-common -Wl,--as-needed -Wl,-z,relro -Wl,-z,now -Wl,--disable-new-dtags -Wl,--gc-sections -Wl,--allow-shlib-undefined -Wl,-rpath,/home/cannguyen/miniconda3/envs/env1/lib -Wl,-rpath-link,/home/cannguyen/miniconda3/envs/env1/lib -L/home/cannguyen/miniconda3/envs/env1/lib -shared -Wl,-soname,libggml_shared.so -o vendor/llama.cpp/libggml_shared.so vendor/llama.cpp/CMakeFiles/ggml.dir/ggml.c.o vendor/llama.cpp/CMakeFiles/ggml.dir/ggml-alloc.c.o vendor/llama.cpp/CMakeFiles/ggml.dir/ggml-backend.c.o vendor/llama.cpp/CMakeFiles/ggml.dir/ggml-quants.c.o vendor/llama.cpp/CMakeFiles/ggml.dir/ggml-cuda.cu.o -L/home/cannguyen/miniconda3/envs/env1/lib/gcc/x86_64-conda-linux-gnu/12.3.0 -L/home/cannguyen/miniconda3/envs/env1/lib/gcc -L/home/cannguyen/miniconda3/envs/env1/x86_64-conda-linux-gnu/lib -L/home/cannguyen/miniconda3/envs/env1/x86_64-conda-linux-gnu/sysroot/lib -L/home/cannguyen/miniconda3/envs/env1/x86_64-conda-linux-gnu/sysroot/usr/lib /home/cannguyen/miniconda3/envs/env1/lib/libcudart.so /home/cannguyen/miniconda3/envs/env1/lib/libcublas.so /home/cannguyen/miniconda3/envs/env1/lib/libcublasLt.so /home/cannguyen/miniconda3/envs/env1/lib/stubs/libcuda.so -pthread /home/cannguyen/miniconda3/envs/env1/lib/libculibos.a -lcudadevrt -lcudart_static -lrt -lpthread -ldl -L"/home/cannguyen/miniconda3/envs/env1/lib/stubs" -L"/home/cannguyen/miniconda3/envs/env1/lib" && :
FAILED: vendor/llama.cpp/libggml_shared.so
: && /usr/bin/g++ -fPIC -Wl,-O2 -Wl,--sort-common -Wl,--as-needed -Wl,-z,relro -Wl,-z,now -Wl,--disable-new-dtags -Wl,--gc-sections -Wl,--allow-shlib-undefined -Wl,-rpath,/home/cannguyen/miniconda3/envs/env1/lib -Wl,-rpath-link,/home/cannguyen/miniconda3/envs/env1/lib -L/home/cannguyen/miniconda3/envs/env1/lib -shared -Wl,-soname,libggml_shared.so -o vendor/llama.cpp/libggml_shared.so vendor/llama.cpp/CMakeFiles/ggml.dir/ggml.c.o vendor/llama.cpp/CMakeFiles/ggml.dir/ggml-alloc.c.o vendor/llama.cpp/CMakeFiles/ggml.dir/ggml-backend.c.o vendor/llama.cpp/CMakeFiles/ggml.dir/ggml-quants.c.o vendor/llama.cpp/CMakeFiles/ggml.dir/ggml-cuda.cu.o -L/home/cannguyen/miniconda3/envs/env1/lib/gcc/x86_64-conda-linux-gnu/12.3.0 -L/home/cannguyen/miniconda3/envs/env1/lib/gcc -L/home/cannguyen/miniconda3/envs/env1/x86_64-conda-linux-gnu/lib -L/home/cannguyen/miniconda3/envs/env1/x86_64-conda-linux-gnu/sysroot/lib -L/home/cannguyen/miniconda3/envs/env1/x86_64-conda-linux-gnu/sysroot/usr/lib /home/cannguyen/miniconda3/envs/env1/lib/libcudart.so /home/cannguyen/miniconda3/envs/env1/lib/libcublas.so /home/cannguyen/miniconda3/envs/env1/lib/libcublasLt.so /home/cannguyen/miniconda3/envs/env1/lib/stubs/libcuda.so -pthread /home/cannguyen/miniconda3/envs/env1/lib/libculibos.a -lcudadevrt -lcudart_static -lrt -lpthread -ldl -L"/home/cannguyen/miniconda3/envs/env1/lib/stubs" -L"/home/cannguyen/miniconda3/envs/env1/lib" && :
/usr/bin/ld: cannot find /lib64/libpthread.so.0: No such file or directory
/usr/bin/ld: cannot find /usr/lib64/libpthread_nonshared.a: No such file or directory
collect2: error: ld returned 1 exit status
I've searched around but cannot find any solutions. Installing this library in a conda env is pretty hard. Since I want to try Llama models in LangChain, may I have your advice on an easier solution?
I'm getting the same issue using LLaVA with CUDA; was anyone able to fix this?
Prerequisites
Please answer the following questions for yourself before submitting an issue.
Expected Behavior
The package should install correctly.
Current Behavior
The build fails with an error.
Environment and Context
I am trying to install llama-cpp-python on Ubuntu 23.10 using:
CMAKE_ARGS="-DLLAMA_CUBLAS=on" pip install llama-cpp-python
But I got this error:
It's crazy because it says found version "11.8.89" for CUDA!
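A version mismatch like this usually means CMake is picking up a different CUDA install than the one you expect. A diagnostic sketch (not from the thread) that enumerates the toolkits on the machine:

```python
# Diagnostic sketch: list every CUDA toolkit the build might pick up, to explain
# why CMake reports "11.8.89" when a newer toolkit was expected.
import glob, os, shutil, subprocess

print("nvcc on PATH:", shutil.which("nvcc"))
for cuda in sorted(glob.glob("/usr/local/cuda*")):
    nvcc = os.path.join(cuda, "bin", "nvcc")
    if os.path.exists(nvcc):
        out = subprocess.run([nvcc, "--version"], capture_output=True, text=True).stdout
        release = next((l for l in out.splitlines() if "release" in l), out.strip())
        print(cuda, "->", release.strip())
```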