abetlen / llama-cpp-python

Python bindings for llama.cpp
https://llama-cpp-python.readthedocs.io
MIT License

ERROR: Can't install llama-cpp-python[server] with CMAKE_ARGS="-DLLAMA_CUBLAS=on" in version 0.2.24 or higher #1126

Open victordonat0 opened 9 months ago

victordonat0 commented 9 months ago

Prerequisites

Please answer the following questions for yourself before submitting an issue.

Expected Behavior

Running !CMAKE_ARGS="-DLLAMA_CUBLAS=on" FORCE_CMAKE=1 pip install llama-cpp-python[server] should install as expected, but it only installs on versions <= 0.2.23; on 0.2.24 or higher the install fails.
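For reference, the same install can be written in a notebook cell with IPython %env magics instead of the inline VAR=value prefix (a minimal sketch; %env is standard IPython):

%env CMAKE_ARGS=-DLLAMA_CUBLAS=on
%env FORCE_CMAKE=1
!pip install llama-cpp-python[server]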

Current Behavior

Building wheel for llama-cpp-python (pyproject.toml) ... error
  error: subprocess-exited-with-error

  × Building wheel for llama-cpp-python (pyproject.toml) did not run successfully.
  │ exit code: 1
  ╰─> [86 lines of output]
      *** scikit-build-core 0.8.0 using CMake 3.22.1 (wheel)
      *** Configuring CMake...
      2024-01-24 17:27:52,450 - scikit_build_core - WARNING - libdir/ldlibrary: /opt/conda/lib/libpython3.10.a is not a real file!
      2024-01-24 17:27:52,450 - scikit_build_core - WARNING - Can't find a Python library, got libdir=/opt/conda/lib, ldlibrary=libpython3.10.a, multiarch=x86_64-linux-gnu, masd=None
      /usr/bin/cmake: /opt/conda/lib/libcurl.so.4: no version information available (required by /usr/bin/cmake)
      loading initial cache file /tmp/tmpw9fhwjga/build/CMakeInit.txt
      -- The C compiler identification is GNU 11.4.0
      -- The CXX compiler identification is GNU 11.4.0
      -- Detecting C compiler ABI info
      -- Detecting C compiler ABI info - done
      -- Check for working C compiler: /usr/bin/cc - skipped
      -- Detecting C compile features
      -- Detecting C compile features - done
      -- Detecting CXX compiler ABI info
      -- Detecting CXX compiler ABI info - done
      -- Check for working CXX compiler: /usr/bin/c++ - skipped
      -- Detecting CXX compile features
      -- Detecting CXX compile features - done
      -- Found Git: /usr/bin/git (found version "2.34.1")
      -- Looking for pthread.h
      -- Looking for pthread.h - found
      -- Performing Test CMAKE_HAVE_LIBC_PTHREAD
      -- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Success
      -- Found Threads: TRUE
      -- Found CUDAToolkit: /usr/local/cuda/include (found version "11.8.89")
      -- cuBLAS found
      -- The CUDA compiler identification is NVIDIA 11.8.89
      -- Detecting CUDA compiler ABI info
      -- Detecting CUDA compiler ABI info - done
      -- Check for working CUDA compiler: /usr/local/cuda/bin/nvcc - skipped
      -- Detecting CUDA compile features
      -- Detecting CUDA compile features - done
      -- Using CUDA architectures: 52;61;70
      -- CUDA host compiler is GNU 11.4.0

      -- CMAKE_SYSTEM_PROCESSOR: x86_64
      -- x86 detected
      INSTALL TARGETS - target llama has PUBLIC_HEADER files but no PUBLIC_HEADER DESTINATION.
      INSTALL TARGETS - target llama has PUBLIC_HEADER files but no PUBLIC_HEADER DESTINATION.
      -- Configuring done
      CMake Error at vendor/llama.cpp/CMakeLists.txt:782 (add_library):
        Target "ggml_shared" links to target "CUDA::cuda_driver" but the target was
        not found.  Perhaps a find_package() call is missing for an IMPORTED
        target, or an ALIAS target is missing?

      CMake Error at vendor/llama.cpp/CMakeLists.txt:789 (add_library):
        Target "llama" links to target "CUDA::cuda_driver" but the target was not
        found.  Perhaps a find_package() call is missing for an IMPORTED target, or
        an ALIAS target is missing?

      CMake Error at vendor/llama.cpp/CMakeLists.txt:789 (add_library):
        Target "llama" links to target "CUDA::cuda_driver" but the target was not
        found.  Perhaps a find_package() call is missing for an IMPORTED target, or
        an ALIAS target is missing?

      CMake Error at vendor/llama.cpp/examples/llava/CMakeLists.txt:20 (add_library):
        Target "llava_shared" links to target "CUDA::cuda_driver" but the target
        was not found.  Perhaps a find_package() call is missing for an IMPORTED
        target, or an ALIAS target is missing?

      CMake Error at vendor/llama.cpp/examples/llava/CMakeLists.txt:34 (add_executable):
        Target "llava-cli" links to target "CUDA::cuda_driver" but the target was
        not found.  Perhaps a find_package() call is missing for an IMPORTED
        target, or an ALIAS target is missing?

      CMake Error at vendor/llama.cpp/CMakeLists.txt:756 (add_library):
        Target "ggml" links to target "CUDA::cuda_driver" but the target was not
        found.  Perhaps a find_package() call is missing for an IMPORTED target, or
        an ALIAS target is missing?

      CMake Error at vendor/llama.cpp/examples/llava/CMakeLists.txt:1 (add_library):
        Target "llava" links to target "CUDA::cuda_driver" but the target was not
        found.  Perhaps a find_package() call is missing for an IMPORTED target, or
        an ALIAS target is missing?

      -- Generating done
      CMake Generate step failed.  Build files cannot be regenerated correctly.

      *** CMake configuration failed
      [end of output]

  note: This error originates from a subprocess, and is likely not a problem with pip.
  ERROR: Failed building wheel for llama-cpp-python
Failed to build llama-cpp-python
ERROR: Could not build wheels for llama-cpp-python, which is required to install pyproject.toml-based projects
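Every one of these failures is the same missing imported target, CUDA::cuda_driver, which CMake's FindCUDAToolkit module only defines when it can locate libcuda.so (the real driver library or its stub). A minimal diagnostic sketch, assuming the toolkit lives under /usr/local/cuda as in the log above:

# Is the driver library (or its stub) anywhere CMake could find it?
find /usr/local/cuda -name 'libcuda.so*' 2>/dev/null
# Does the dynamic linker know about libcuda at all?
ldconfig -p | grep libcuda

If only the stubs copy shows up, pointing the linker search path at it (e.g. export LIBRARY_PATH=/usr/local/cuda/lib64/stubs) before running pip is a workaround that has been suggested in similar threads, though results vary.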

Environment and Context

Kaggle machine, Jupyter notebook

Wed Jan 24 17:58:50 2024       
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.129.03             Driver Version: 535.129.03   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  Tesla P100-PCIE-16GB           Off | 00000000:00:04.0 Off |                    0 |
| N/A   39C    P0              27W / 250W |      0MiB / 16384MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|  No running processes found                                                           |
+---------------------------------------------------------------------------------------+

$ lscpu

Architecture:            x86_64
  CPU op-mode(s):        32-bit, 64-bit
  Address sizes:         46 bits physical, 48 bits virtual
  Byte Order:            Little Endian
CPU(s):                  4
  On-line CPU(s) list:   0-3
Vendor ID:               GenuineIntel
  Model name:            Intel(R) Xeon(R) CPU @ 2.00GHz
    CPU family:          6
    Model:               85
    Thread(s) per core:  2
    Core(s) per socket:  2
    Socket(s):           1
    Stepping:            3
    BogoMIPS:            4000.28
    Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology nonstop_tsc cpuid tsc_known_freq pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single pti ssbd ibrs ibpb stibp fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm mpx avx512f avx512dq rdseed adx smap clflushopt clwb avx512cd avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves arat md_clear arch_capabilities
Virtualization features: 
  Hypervisor vendor:     KVM
  Virtualization type:   full
Caches (sum of all):     
  L1d:                   64 KiB (2 instances)
  L1i:                   64 KiB (2 instances)
  L2:                    2 MiB (2 instances)
  L3:                    38.5 MiB (1 instance)
NUMA:                    
  NUMA node(s):          1
  NUMA node0 CPU(s):     0-3
Vulnerabilities:         
  Gather data sampling:  Not affected
  Itlb multihit:         Not affected
  L1tf:                  Mitigation; PTE Inversion
  Mds:                   Mitigation; Clear CPU buffers; SMT Host state unknown
  Meltdown:              Mitigation; PTI
  Mmio stale data:       Vulnerable: Clear CPU buffers attempted, no microcode; 
                         SMT Host state unknown
  Retbleed:              Mitigation; IBRS
  Spec rstack overflow:  Not affected
  Spec store bypass:     Mitigation; Speculative Store Bypass disabled via prctl
                          and seccomp
  Spectre v1:            Mitigation; usercopy/swapgs barriers and __user pointer
                          sanitization
  Spectre v2:            Mitigation; IBRS, IBPB conditional, STIBP conditional, 
                         RSB filling, PBRSB-eIBRS Not affected
  Srbds:                 Not affected
  Tsx async abort:       Mitigation; Clear CPU buffers; SMT Host state unknown

$ uname -a

Linux 4d4231b8c112 5.15.133+ #1 SMP Tue Dec 19 13:14:11 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
$ python3 --version
Python 3.10.12

$ make --version
GNU Make 4.3 Built for x86_64-pc-linux-gnu

$ g++ --version
g++ (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0

cesarandreslopez commented 8 months ago

Same issue noticed here on Ubuntu 22.04 with the following specs:

[screenshot of system specs]

Python 3.10.8
GNU Make 4.3 Built for x86_64-pc-linux-gnu
g++ (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0
hamza233 commented 8 months ago

Facing the same error during installation.

rigvedrs commented 8 months ago

I am facing the same error too. I am trying to run it on Kaggle.

It used to work earlier if we used:

!CMAKE_ARGS="-DLLAMA_CUBLAS=on" pip install llama-cpp-python
!git clone https://github.com/ggerganov/llama.cpp.git

Here is an example notebook where it worked: https://www.kaggle.com/code/gpreda/fast-test-of-llama-v2-pre-quantized-with-llama-cpp

samyasma commented 8 months ago

Same over here! Interested in a way to solve this issue

dwalter commented 8 months ago

In the meantime, this worked for me for v0.2.23:

CMAKE_ARGS="-DLLAMA_CUBLAS=on" FORCE_CMAKE=1 \
pip install llama-cpp-python[server]==0.2.23
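
Note that in some shells (zsh in particular) the [server] extra is treated as a glob pattern, so quoting the requirement is safer; the same pin, quoted:

CMAKE_ARGS="-DLLAMA_CUBLAS=on" FORCE_CMAKE=1 \
pip install "llama-cpp-python[server]==0.2.23"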
horiacristescu commented 7 months ago

I tried

CMAKE_ARGS=-DLLAMA_CUBLAS=on FORCE_CMAKE=1 pip install --force-reinstall --no-cache-dir --upgrade llama-cpp-python==0.2.23

and I get

      CMake Error at vendor/llama.cpp/CMakeLists.txt:725 (target_link_libraries):
        Target "ggml" links to:

          CUDA::cublasLt

        but the target was not found.  Possible reasons include:

          * There is a typo in the target name.
          * A find_package call is missing for an IMPORTED target.
          * An ALIAS target is missing.
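
This variant fails one step later: the toolkit was found (so nvcc and the CUDA language work), but the cuBLAS libraries themselves are missing from the toolkit CMake picked up, so the CUDA::cublasLt target never gets defined. Conda environments often ship only a partial toolkit, which a quick check can confirm (a sketch assuming a conda env is active):

# List whatever cuBLAS libraries the active environment actually ships
find "$CONDA_PREFIX" -name 'libcublas*' 2>/dev/null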
pinchedsquare commented 7 months ago

Same here; can't install a cuBLAS version. It only installs the CPU version, so I can't use the two T4s.

Any help is appreciated.

p1rate5s commented 7 months ago

Same here on Ubuntu 22.04 and CUDA 12.4.99. In my case, trying version==0.2.23 or even 0.2.22 gave the same errors.


Downloading llama_cpp_python-0.2.56.tar.gz (36.9 MB) ....

Building wheels for collected packages: llama-cpp-python
  Running command Building wheel for llama-cpp-python (pyproject.toml)
  *** scikit-build-core 0.8.2 using CMake 3.28.3 (wheel)
  *** Configuring CMake...
  2024-03-15 03:55:55,466 - scikit_build_core - WARNING - libdir/ldlibrary: /home/jupyter-romp/.conda/envs/llama/lib/libpython3.10.a is not a real file!
  2024-03-15 03:55:55,466 - scikit_build_core - WARNING - Can't find a Python library, got libdir=/home/jupyter-romp/.conda/envs/llama/lib, ldlibrary=libpython3.10.a, multiarch=x86_64-linux-gnu, masd=None
  loading initial cache file /tmp/tmp1zkis1hd/build/CMakeInit.txt
  -- The C compiler identification is GNU 11.4.0
  -- The CXX compiler identification is GNU 11.4.0
  -- Detecting C compiler ABI info
  -- Detecting C compiler ABI info - done
  -- Check for working C compiler: /usr/bin/cc - skipped
  -- Detecting C compile features
  -- Detecting C compile features - done
  -- Detecting CXX compiler ABI info
  -- Detecting CXX compiler ABI info - done
  -- Check for working CXX compiler: /usr/bin/c++ - skipped
  -- Detecting CXX compile features
  -- Detecting CXX compile features - done
  -- Found Git: /usr/bin/git (found version "2.34.1")
  -- Performing Test CMAKE_HAVE_LIBC_PTHREAD
  -- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Success
  -- Found Threads: TRUE
  -- Found CUDAToolkit: /home/jupyter-rmurphy/.conda/envs/llama/targets/x86_64-linux/include (found version "12.4.99")
  -- cuBLAS found
  -- The CUDA compiler identification is NVIDIA 12.4.99
  -- Detecting CUDA compiler ABI info
  -- Detecting CUDA compiler ABI info - done
  -- Check for working CUDA compiler: /home/jupyter-rmurphy/.conda/envs/llama/bin/nvcc - skipped
  -- Detecting CUDA compile features
  -- Detecting CUDA compile features - done
  -- Using CUDA architectures: 52;61;70
  -- CUDA host compiler is GNU 11.4.0

  -- Warning: ccache not found - consider installing it for faster compilation or disable this warning with LLAMA_CCACHE=OFF
  -- CMAKE_SYSTEM_PROCESSOR: x86_64
  -- x86 detected
  CMake Warning (dev) at CMakeLists.txt:21 (install):
    Target llama has PUBLIC_HEADER files but no PUBLIC_HEADER DESTINATION.
  This warning is for project developers.  Use -Wno-dev to suppress it.

  CMake Warning (dev) at CMakeLists.txt:30 (install):
    Target llama has PUBLIC_HEADER files but no PUBLIC_HEADER DESTINATION.
  This warning is for project developers.  Use -Wno-dev to suppress it.

  -- Configuring done (3.3s)
  CMake Error at vendor/llama.cpp/CMakeLists.txt:1115 (target_link_libraries):
    Target "ggml" links to:

  CUDA::cublas

but the target was not found.  Possible reasons include:

  * There is a typo in the target name.
  * A find_package call is missing for an IMPORTED target.
  * An ALIAS target is missing.

CMake Error at vendor/llama.cpp/CMakeLists.txt:1122 (target_link_libraries): Target "ggml_shared" links to:

  CUDA::cublas

but the target was not found.  Possible reasons include:

  * There is a typo in the target name.
  * A find_package call is missing for an IMPORTED target.
  * An ALIAS target is missing.

CMake Error at vendor/llama.cpp/CMakeLists.txt:1136 (target_link_libraries): Target "llama" links to:

  CUDA::cublas

but the target was not found.  Possible reasons include:

  * There is a typo in the target name.
  * A find_package call is missing for an IMPORTED target.
  * An ALIAS target is missing.
markuscircly commented 7 months ago

Testing on a Docker container with the following specifications:

Ubuntu 22.04
Python 3.10.12
Driver Version: 550.54.14
NVIDIA-SMI 550.54.14
CUDA Version: 12.4

First, check whether a CUDA architecture is detected using the following commands:

nvidia-smi
nvcc --version

Both commands should produce output. In the end, updating the GPU driver fixed my issues.

Command used for the GPU-enabled install:

CUDACXX=/usr/local/cuda-12/bin/nvcc CMAKE_ARGS="-DLLAMA_CUBLAS=on -DCMAKE_CUDA_ARCHITECTURES=all-major" FORCE_CMAKE=1 pip install llama-cpp-python --no-cache-dir --force-reinstall --upgrade
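
After a successful build, a quick smoke test confirms the wheel really has GPU support (a sketch; llama_supports_gpu_offload is exposed by recent llama-cpp-python versions, so treat the import as an assumption for older releases):

# Should print True for a cuBLAS build, False for a CPU-only one
python3 -c "from llama_cpp import llama_supports_gpu_offload; print(llama_supports_gpu_offload())"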

p1rate5s commented 7 months ago

After a few hours of tinkering, I just ended up removing the whole Conda environment, along with a couple of others that also had the CUDA toolkit in them, and then built a new Conda env from scratch. The install/compile went smoothly after that. I did notice a couple of packages with updated versions, so I'm not sure what exactly the fix was: version issues or weird cross-environment things.

BahaSlama77 commented 7 months ago

@p1rate5s can you help me set up a new conda env in Kaggle?

p1rate5s commented 7 months ago

@p1rate5s can you help me set up a new conda env in Kaggle?

Sorry, I don't use Kaggle. My tinkering is on a bare-metal server running Ubuntu. Below are the steps I took to create an env with most of the tools we use in our lab, but I certainly can't recommend them since I am no expert. I have used Python and pipenv before but am new to Conda, and I am not sure mixing Conda and pip installs is a really good idea. So, YMMV.


From a JupyterLab/Hub user env:

    - sudo -E conda create -n llama -c rapidsai -c conda-forge -c nvidia rapids=24.02 python=3.10 cuda-version=12.4 dash streamlit pytorch cupy
    - python -m ipykernel install --user --name llama --display-name "llama"
    - conda activate llama
    - export CMAKE_ARGS="-DLLAMA_CUBLAS=on"
    - export FORCE_CMAKE=1
    - pip install llama-cpp-python --force-reinstall --upgrade --no-cache-dir --verbose
    - pip install llama-index
    - pip install llama-index-readers-file
    - pip install llama-index-vector-stores-postgres
    - pip install llama-index-embeddings-huggingface
    - pip install llama-index-llms-llama-cpp