abetlen / llama-cpp-python

Python bindings for llama.cpp
https://llama-cpp-python.readthedocs.io
MIT License

Prebuilt CUDA wheels not working #1822

Open mjwweb opened 2 weeks ago

mjwweb commented 2 weeks ago

There are multiple issues with the CUDA wheels:

  1. The cu125 repository returns 404:

    $ curl -I https://abetlen.github.io/llama-cpp-python/whl/cu125/
    HTTP/2 404
  2. While cu124 exists, pip fails to find wheels using --extra-index-url:

    pip install llama-cpp-python --extra-index-url https://abetlen.github.io/llama-cpp-python/whl/cu124
  3. Even after installing a wheel directly, the import fails because the CUDA runtime libraries are missing (see the diagnostic sketch after this list):

    pip install https://github.com/abetlen/llama-cpp-python/releases/download/v0.2.90-cu124/llama_cpp_python-0.2.90-cp312-cp312-linux_x86_64.whl

    Results in:

    RuntimeError: Failed to load shared library '.../libllama.so': libcudart.so.12: cannot open shared object file: No such file or directory
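
To see which shared libraries the bundled libllama.so fails to resolve, ldd helps; the site-packages lookup below is just one way to locate the file (the exact path varies per install):

$ # locate the bundled library inside the active environment
$ SO_PATH="$(find "$(python -c 'import site; print(site.getsitepackages()[0])')" -name 'libllama.so')"
$ # any line marked "not found" is a missing runtime dependency
$ ldd "$SO_PATH" | grep 'not found'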

This setup was working a few weeks ago without requiring manual CUDA installation.

Environment:

mjwweb commented 2 weeks ago

The cu125 repo URL still returns 404.

I figured out the issue is related to my conda environment: the prebuilt CUDA wheels work at the system level (Ubuntu) but not inside my conda environment.

Here's a workaround I found that works for me:

Install build-essential and libgomp1 from the apt repository:

$ sudo apt install build-essential libgomp1

Set path to system libraries:

$ export LD_LIBRARY_PATH="/usr/lib/x86_64-linux-gnu:$LD_LIBRARY_PATH"
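
To make this persist only while the env is active (optional; the script name below is arbitrary), conda sources any script placed under $CONDA_PREFIX/etc/conda/activate.d/ on activation:

$ mkdir -p "$CONDA_PREFIX/etc/conda/activate.d"
$ # script name is arbitrary; single quotes keep the variables unexpanded in the file
$ echo 'export LD_LIBRARY_PATH="/usr/lib/x86_64-linux-gnu:$LD_LIBRARY_PATH"' > "$CONDA_PREFIX/etc/conda/activate.d/llama_cpp_libs.sh"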

Then proceed with the prebuilt wheel installation.
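
A quick sanity check afterwards: the import only succeeds if libllama.so loads, so this one-liner confirms the wheel works:

$ python -c "import llama_cpp; print(llama_cpp.__version__)"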

daniter-fast commented 2 weeks ago

I found this to help:

$ sudo apt install gcc-11
$ sudo apt install g++-11
$ CXX=g++-11 CC=gcc-11 pip install llama-cpp-python
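
Note that setting CC/CXX this way forces a build from source rather than using a prebuilt wheel. For CUDA support in a source build you likely also need the CUDA CMake flag (the flag name depends on the release; recent versions use GGML_CUDA, older ones used LLAMA_CUBLAS):

$ CMAKE_ARGS="-DGGML_CUDA=on" CXX=g++-11 CC=gcc-11 pip install llama-cpp-python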