abetlen / llama-cpp-python

Python bindings for llama.cpp
https://llama-cpp-python.readthedocs.io
MIT License

Prebuilt CUDA wheels not working #1822

Open mjwweb opened 2 weeks ago

mjwweb commented 2 weeks ago

There are multiple issues with the CUDA wheels:

  1. The cu125 repository returns 404:

    $ curl -I https://abetlen.github.io/llama-cpp-python/whl/cu125/
    HTTP/2 404
  2. While cu124 exists, pip fails to find wheels using --extra-index-url:

    pip install llama-cpp-python --extra-index-url https://abetlen.github.io/llama-cpp-python/whl/cu124
  3. Even after installing a wheel directly, the import fails because the CUDA runtime libraries are missing (see the diagnostic sketch after this list):

    pip install https://github.com/abetlen/llama-cpp-python/releases/download/v0.2.90-cu124/llama_cpp_python-0.2.90-cp312-cp312-linux_x86_64.whl

    Results in:

    RuntimeError: Failed to load shared library '.../libllama.so': libcudart.so.12: cannot open shared object file: No such file or directory
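
To see which shared libraries the bundled libllama.so fails to resolve, ldd helps; the site-packages lookup below is just one way to locate the file (the exact path varies per install):

$ # locate the bundled library inside the active environment
$ SO_PATH="$(find "$(python -c 'import site; print(site.getsitepackages()[0])')" -name 'libllama.so')"
$ # any line marked "not found" is a missing runtime dependency
$ ldd "$SO_PATH" | grep 'not found'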

This setup was working a few weeks ago without requiring manual CUDA installation.

Environment:

mjwweb commented 2 weeks ago

The cu125 repo URL still returns 404.

I figured out the issue is related to my conda environment: the prebuilt CUDA wheels work at the system level (Ubuntu) but not inside my conda environment.

Here's a workaround I found that works for me:

Install build-essential and libgomp1 from the apt repository:

$ sudo apt install build-essential libgomp1

Set path to system libraries:

$ export LD_LIBRARY_PATH="/usr/lib/x86_64-linux-gnu:$LD_LIBRARY_PATH"
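
To make this persist only while the env is active (optional; the script name below is arbitrary), conda sources any script placed under $CONDA_PREFIX/etc/conda/activate.d/ on activation:

$ mkdir -p "$CONDA_PREFIX/etc/conda/activate.d"
$ # script name is arbitrary; single quotes keep the variables unexpanded in the file
$ echo 'export LD_LIBRARY_PATH="/usr/lib/x86_64-linux-gnu:$LD_LIBRARY_PATH"' > "$CONDA_PREFIX/etc/conda/activate.d/llama_cpp_libs.sh"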

Then proceed with the prebuilt wheel installation.
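
A quick sanity check afterwards: the import only succeeds if libllama.so loads, so this one-liner confirms the wheel works:

$ python -c "import llama_cpp; print(llama_cpp.__version__)"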

daniter-fast commented 2 weeks ago

I found this to help:

$ sudo apt install gcc-11
$ sudo apt install g++-11
$ CXX=g++-11 CC=gcc-11 pip install llama-cpp-python
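
Note that setting CC/CXX this way forces a build from source rather than using a prebuilt wheel. For CUDA support in a source build you likely also need the CUDA CMake flag (the flag name depends on the release; recent versions use GGML_CUDA, older ones used LLAMA_CUBLAS):

$ CMAKE_ARGS="-DGGML_CUDA=on" CXX=g++-11 CC=gcc-11 pip install llama-cpp-python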