abetlen / llama-cpp-python

Python bindings for llama.cpp
https://llama-cpp-python.readthedocs.io
MIT License

Docker llama-cpp libcuda.so.1: cannot open shared object file: No such file or directory #1169

Open Apotrox opened 9 months ago

Apotrox commented 9 months ago


Expected Behavior

Expected: Probably loading all necessary files as requested

Current Behavior

File "/home/worker/app/.venv/lib/python3.11/site-packages/llama_cpp/llama_cpp.py", line 76, in _load_shared_library 2024-02-09 17:32:34 raise RuntimeError(f"Failed to load shared library '{_lib_path}': {e}") 2024-02-09 17:32:34 RuntimeError: Failed to load shared library '/home/worker/app/.venv/lib/python3.11/site-packages/llama_cpp/libllama.so': libcuda.so.1: cannot open shared object file: No such file or directory

Environment and Context

I'm trying to set up privategpt in a Docker environment. In the Dockerfile, I specifically reinstalled the "newest" llama-cpp-python version, along with the necessary CUDA libraries, to enable GPU support. As this appears to be specifically a llama-cpp-python issue, I'm posting it here (too).

$ lscpu

Architecture:            x86_64
  CPU op-mode(s):        32-bit, 64-bit
  Address sizes:         48 bits physical, 48 bits virtual
  Byte Order:            Little Endian
CPU(s):                  24
  On-line CPU(s) list:   0-23
Vendor ID:               AuthenticAMD
  Model name:            AMD Ryzen 9 5900X 12-Core Processor
    CPU family:          25
    Model:               33
    Thread(s) per core:  2
    Core(s) per socket:  12
    Socket(s):           1
    Stepping:            2
    BogoMIPS:            7386.18
    Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_g
                         ood nopl tsc_reliable nonstop_tsc cpuid extd_apicid pni pclmulqdq ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm cmp_legacy 
                         svm cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw topoext perfctr_core ssbd ibrs ibpb stibp vmmcall fsgsbase bmi1 avx2 smep bmi2 invpcid rdseed adx smap clflu
                         shopt clwb sha_ni xsaveopt xsavec xgetbv1 xsaves clzero xsaveerptr arat npt nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold v_vmsav
                         e_vmload umip vaes vpclmulqdq rdpid
Virtualization features: 
  Virtualization:        AMD-V
  Hypervisor vendor:     Microsoft
  Virtualization type:   full
Caches (sum of all):     
  L1d:                   384 KiB (12 instances)
  L1i:                   384 KiB (12 instances)
  L2:                    6 MiB (12 instances)
  L3:                    32 MiB (1 instance)
Vulnerabilities:         
  Gather data sampling:  Not affected
  Itlb multihit:         Not affected
  L1tf:                  Not affected
  Mds:                   Not affected
  Meltdown:              Not affected
  Mmio stale data:       Not affected
  Retbleed:              Not affected
  Spec rstack overflow:  Mitigation; safe RET
  Spec store bypass:     Mitigation; Speculative Store Bypass disabled via prctl and seccomp
  Spectre v1:            Mitigation; usercopy/swapgs barriers and __user pointer sanitization
  Spectre v2:            Mitigation; Retpolines, IBPB conditional, IBRS_FW, STIBP conditional, RSB filling, PBRSB-eIBRS Not affected
  Srbds:                 Not affected
  Tsx async abort:       Not affected

$ uname -a => Linux 1de939a0a313 5.15.133.1-microsoft-standard-WSL2

$ python3 --version => 3.11.6
$ make --version => 4.3
$ g++ --version => 12.2.0
Apotrox commented 9 months ago

Well, I don't entirely know why, but after running docker compose up, everything worked. Not sure if this issue will still be relevant, so I'll leave it open for now.
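
If it helps anyone reproduce this: GPU access in a compose file has to be declared explicitly. A minimal sketch of that declaration (untested on my side; the service name is hypothetical):

services:
  private-gpt:
    deploy:
      resources:
        reservations:
          devices:
            # expose all host NVIDIA GPUs to this service
            - driver: nvidia
              count: all
              capabilities: [gpu]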

timtensor commented 9 months ago

Tried in a Colab notebook; it seems to be the same issue.

tolgakurtuluss commented 9 months ago

Tried the AutoGGUF.ipynb notebook in Colab and got a similar error, shown below.

./llama.cpp/quantize: error while loading shared libraries: libcuda.so.1: cannot open shared object file: No such file or directory

tolgakurtuluss commented 9 months ago

After installing CUDA 10 in Colab with the code below, it finally worked!

# install cuda 10.0
!apt-get update
!wget https://developer.nvidia.com/compute/cuda/10.0/Prod/local_installers/cuda-repo-ubuntu1604-10-0-local-10.0.130-410.48_1.0-1_amd64 -O cuda-repo-ubuntu1604-10-0-local-10.0.130-410.48_1.0-1_amd64.deb
!dpkg -i cuda-repo-ubuntu1604-10-0-local-10.0.130-410.48_1.0-1_amd64.deb
!apt-key add /var/cuda-repo-10-0-local/7fa2af80.pub
!apt-get update
!apt-get -y install gcc-7 g++-7
!apt-get -y install cuda

!export PATH=/usr/local/cuda/bin${PATH:+:${PATH}}
!export LD_LIBRARY_PATH=/usr/local/cuda/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}

# check if installed successfully

!/usr/local/cuda/bin/nvcc --version

Apotrox commented 9 months ago

@tolgakurtuluss thank you for that info. Sadly I can't go lower than CUDA 12, as there appears to be no slim Debian Docker image that supports GCC below version 12 (at least that's the error message I'm getting when trying to go lower), and CUDA 11 doesn't support gcc-12... I don't want to stray too far from the original Dockerfile, as I don't want to be bothered with compatibility hell.

I have been trying a lot of things, and it appears to be a deeper issue. I figured maybe it just doesn't detect the GPU when building the Docker image, so I put the installation of CUDA and llama-cpp-python into an entrypoint script that runs when starting a container. Still no success. Even manually creating the symlinks and adding the respective paths for libcuda.so and libcuda.so.1 to PATH and LD_LIBRARY_PATH (roughly as sketched below) doesn't seem to fix it. I'm kind of at a loss here.
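
A sketch of that symlink attempt; the compat directory path is an assumption based on the nvidia/cuda images, so your layout may differ:

# link the driver compat stub into the standard linker path, then expose it
ln -sf /usr/local/cuda/compat/libcuda.so.1 /usr/lib/x86_64-linux-gnu/libcuda.so.1
ln -sf /usr/lib/x86_64-linux-gnu/libcuda.so.1 /usr/lib/x86_64-linux-gnu/libcuda.so
export LD_LIBRARY_PATH="/usr/lib/x86_64-linux-gnu:${LD_LIBRARY_PATH}"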

kinoute commented 9 months ago

Same problem here. We kind of made it work for a while (in a Kubernetes cluster) by re-installing the package at runtime, but sometimes it would still fail with this error.

# Running when the container is initialized
 CUDACXX=/usr/local/cuda-12/bin/nvcc CMAKE_ARGS="-DLLAMA_CUBLAS=on -DCMAKE_CUDA_ARCHITECTURES=all-major" FORCE_CMAKE=1 pip install llama-cpp-python==0.2.28 --no-cache-dir --force-reinstall --upgrade
celery worker...

Getting

   File "/usr/local/lib/python3.10/dist-packages/llama_cpp/llama_cpp.py", line 74, in _load_shared_library
     return ctypes.CDLL(str(_lib_path), **cdll_args)
   File "/usr/lib/python3.10/ctypes/__init__.py", line 374, in __init__
     self._handle = _dlopen(self._name, mode)
 OSError: libcuda.so.1: cannot open shared object file: No such file or directory

This makes no sense, because sometimes just killing the container and restarting makes it work. Here is the Dockerfile:

# Use Docker Hub by default
ARG CONTAINER_REGISTRY=docker.io
FROM $CONTAINER_REGISTRY/nvidia/cuda:12.1.1-devel-ubuntu22.04 as base

# Redirect python output straight to the terminal
ENV PYTHONDONTWRITEBYTECODE=1
ENV PYTHONUNBUFFERED 1
ENV PYTHONPATH="$PYTHONPATH:/modules/:/modules/osint"

# setting build related env vars
ENV CUDA_DOCKER_ARCH=all
ENV LLAMA_CUBLAS=1

RUN apt-get update && apt-get upgrade -y && \
    apt-get install -y git build-essential gcc wget python3 python3-pip \
    ocl-icd-opencl-dev opencl-headers clinfo \
    libclblast-dev libopenblas-dev \
    && mkdir -p /etc/OpenCL/vendors && echo "libnvidia-opencl.so.1" > /etc/OpenCL/vendors/nvidia.icd

ENV LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/usr/local/cuda-12/targets/x86_64-linux/lib/:/usr/lib/x86_64-linux-gnu/"

# Install main python dependencies
COPY ./requirements.txt .
RUN /bin/bash -c "pip3 install --no-cache-dir -r requirements.txt"

WORKDIR /modules/

COPY . .

We are using an instance with an RTX 3070.

lukestanley commented 8 months ago

"Expected: Probably loading all necessary files as requested" This doesn't really seem like a bug with this project, but the underlying CUDA compute layer. This is a Docker configuration issue, maybe even your Docker cache, who knows. I suggest closing this. No point having this ticket open. I guess you can still chat here but there is no project bug here. It help maintainers to have clear signals. Thanks. @Apotrox @kinoute

limoncc commented 8 months ago

libcuda.so.1: cannot open shared object file: No such file or directory ==> most likely you didn't enable your GPU with --gpus all. Of course, you need to install nvidia-container-toolkit on the host before doing so:

docker run  -itd --name llm_serve -h limoncc \
--gpus all \
...
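
For completeness, a host-side setup sketch (Ubuntu; this assumes NVIDIA's apt repository for the container toolkit is already configured):

sudo apt-get update && sudo apt-get install -y nvidia-container-toolkit
sudo nvidia-ctk runtime configure --runtime=docker   # register the nvidia runtime with Docker
sudo systemctl restart docker
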
KestindotC commented 8 months ago

Thanks for the helpful tip, @superuser222. We were able to resolve the issue by installing the NVIDIA device plugin, a Kubernetes plugin that is deployed as a DaemonSet. Note that pods then still have to request the GPU resource explicitly, roughly as in the sketch below.
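
A minimal pod-side sketch (names are hypothetical); without the nvidia.com/gpu request, the driver libraries, including libcuda.so.1, are not mounted into the container:

apiVersion: v1
kind: Pod
metadata:
  name: llama-worker
spec:
  containers:
    - name: worker
      image: registry.example.com/llama-serve:latest   # hypothetical image
      resources:
        limits:
          nvidia.com/gpu: 1   # ask the device plugin for one GPU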

everton137 commented 7 months ago

I was able to compile llama.cpp using Docker on a CPU by following this example, and it worked almost out of the box:

https://github.com/turiPO/llamacpp-docker-server/tree/main

For a GPU, I had to install the NVIDIA toolkit, 12.3, on the GPU server I have access to, essentially by adding this to the Dockerfile:

https://developer.nvidia.com/cuda-12-3-0-download-archive?target_os=Linux&target_arch=x86_64&Distribution=Ubuntu&target_version=22.04&target_type=deb_local

I was getting the same library errors as in this issue while trying to execute llama.cpp, but I could find which libraries were missing and where they lived thanks to a running container that was already using NVIDIA on this server. Just sharing some draft code with a solution:

# Set CUDA_HOME environment variable
ENV CUDA_HOME=/usr/local/cuda
# Update PATH to include CUDA binaries
ENV PATH=${CUDA_HOME}/bin:${PATH}
# Update LD_LIBRARY_PATH to include CUDA libraries
ENV LD_LIBRARY_PATH=${CUDA_HOME}/lib64:${LD_LIBRARY_PATH}

# Copy CUDA libraries from the build stage to the runtime stage
COPY --from=build ${CUDA_HOME}/lib64 /usr/local/cuda/lib64
COPY --from=build /usr/lib/x86_64-linux-gnu/libcuda.so.1 /usr/local/cuda/lib64
COPY --from=build /usr/local/cuda-12.3/targets/x86_64-linux/lib/libcublas.so.12 /usr/local/cuda/lib64
COPY --from=build /usr/local/cuda-12.3/targets/x86_64-linux/lib/libcudart.so.12 /usr/local/cuda/lib64
COPY --from=build /usr/local/cuda-12.3/targets/x86_64-linux/lib/libcublasLt.so.12 /usr/local/cuda/lib64

I think it will be a matter of checking where these libraries are installed and copying them to a directory on LD_LIBRARY_PATH; the locations can depend on the NVIDIA toolkit version and your OS. A quick way to run that check is sketched below.
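
Plain shell, nothing specific to this project; run it inside the build stage or a running container:

# locate the driver library and the CUDA user-space libraries
find / -name 'libcuda.so*' 2>/dev/null
find / \( -name 'libcublas*.so*' -o -name 'libcudart.so*' \) 2>/dev/null
# list what the dynamic linker can currently resolve
ldconfig -p | grep -E 'libcuda|libcublas|libcudart'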

LeonHammerla commented 6 months ago

Same problem here

k-praveen-trellis commented 4 months ago

Same here as well. Building the container on a T4 GPU-based machine, it keeps throwing the same error.

Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/llama_cpp/llama_cpp.py", line 70, in _load_shared_library
    return ctypes.CDLL(str(_lib_path), **cdll_args)  # type: ignore
  File "/usr/lib/python3.10/ctypes/__init__.py", line 374, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: libcuda.so.1: cannot open shared object file: No such file or directory

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
    from llama_cpp import Llama
  File "/usr/local/lib/python3.10/dist-packages/llama_cpp/__init__.py", line 1, in <module>
    from .llama_cpp import *
  File "/usr/local/lib/python3.10/dist-packages/llama_cpp/llama_cpp.py", line 83, in <module>
    _lib = _load_shared_library(_lib_base_name)
  File "/usr/local/lib/python3.10/dist-packages/llama_cpp/llama_cpp.py", line 72, in _load_shared_library
    raise RuntimeError(f"Failed to load shared library '{_lib_path}': {e}")
RuntimeError: Failed to load shared library '/usr/local/lib/python3.10/dist-packages/llama_cpp/libllama.so': libcuda.so.1: cannot open shared object file: No such file or directory

Dockerfile

FROM nvidia/cuda:12.3.2-cudnn9-devel-ubuntu22.04

RUN apt-get update && \
    apt-get install -y \
    python3.10 \
    python3-pip \
    python3.10-venv \
    python3.10-dev \
    ffmpeg && \
    apt-get clean && \
    rm -rf /var/lib/apt/lists/*

RUN apt-get update && apt-get install sox -y

RUN update-alternatives --install /usr/bin/python3 python3 /usr/bin/python3.10 1

# Uncomment the following for GPU Environment Only
ENV PYTHONIOENCODING=UTF-8
ENV PYTHONUNBUFFERED=1
ENV NVIDIA_DRIVER_CAPABILITIES all
ENV NVIDIA_DRIVER_CAPABILITIES compute,utility

## For GPU
# RUN pip install llama-cpp-python \
# --extra-index-url https://abetlen.github.io/llama-cpp-python/whl/cu123
# RUN CMAKE_ARGS="-DGGML_CUDA=on" pip install llama-cpp-python --upgrade --force-reinstall --no-cache-dir
# RUN CMAKE_ARGS="-DLLAMA_CUBLAS=on -DCMAKE_CUDA_ARCHITECTURES=all-major" FORCE_CMAKE=1 pip install llama-cpp-python==0.2.28 --no-cache-dir --force-reinstall --upgrade
RUN CMAKE_ARGS="-DLLAMA_CUBLAS=on" pip install llama-cpp-python==0.2.78 --no-cache-dir --force-reinstall

I even tried multiple downgraded versions. So either it runs but does not support the latest model architectures, or the above error keeps popping up.
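
One sanity check that separates a build problem from a runtime one (a sketch; run it inside the started container): if ldconfig cannot see libcuda.so.1, the container was started without the GPU exposed, and no pip reinstall will help.

ldconfig -p | grep libcuda   # empty output => the driver library is not mounted into the container
python3 -c "import ctypes; ctypes.CDLL('libcuda.so.1')"   # reproduces the failing dlopen from Python
nvidia-smi                   # fails for the same reason when the GPU is not exposed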

qiujie8092916 commented 2 months ago

I encountered a similar issue as well; after repeatedly tweaking my setup, I no longer see these error messages.

Here is my Dockerfile:

ARG CUDA_IMAGE="12.5.0-devel-ubuntu22.04"

FROM nvidia/cuda:${CUDA_IMAGE}

ENV DEBIAN_FRONTEND="noninteractive" TZ="Etc/UTC"

ENV CUDA_HOME=/usr/local/cuda-12.5
ENV PATH=${CUDA_HOME}/bin:${PATH}
ENV LD_LIBRARY_PATH=${CUDA_HOME}/lib64:${CUDA_HOME}/compat:${CUDA_HOME}/targets/x86_64-linux/lib:/usr/lib/x86_64-linux-gnu:${LD_LIBRARY_PATH}

RUN apt-get update && apt-get upgrade -y && \
    apt-get install -y git build-essential \
    python3 python3-pip gcc wget \
    ocl-icd-opencl-dev opencl-headers clinfo \
    libclblast-dev libopenblas-dev && \
    mkdir -p /etc/OpenCL/vendors && echo "libnvidia-opencl.so.1" > /etc/OpenCL/vendors/nvidia.icd && \
    ln -s /usr/bin/python3 /usr/bin/python && \
    apt-get clean && \
    rm -rf /var/lib/apt/lists/* /tmp/*

WORKDIR /app

COPY requirements.txt ./

ENV GGML_CUDA=1
ENV FORCE_CMAKE=1
ENV CUDA_DOCKER_ARCH=all
#ENV TOKENIZERS_PARALLELISM=true
ENV CMAKE_ARGS="-DGGML_CUDA=on"

RUN python -m pip install --upgrade pip pytest cmake scikit-build setuptools fastapi uvicorn sse-starlette pydantic-settings starlette-context && \
    python -m pip install --no-cache-dir -r requirements.txt --verbose

COPY entrypoint.sh .
COPY app/ app/

ENV HOST=0.0.0.0
ENV PORT=8080

RUN chmod +x entrypoint.sh

ENTRYPOINT ["/app/entrypoint.sh"]

EXPOSE 8080

Don't know if it will help.
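
In case it does, the matching build-and-run sketch (image and container names hypothetical; note --gpus all, per the comments above):

docker build -t llama-serve .
docker run --gpus all -p 8080:8080 --name llm_serve llama-serve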

shovon2464 commented 1 month ago

libcuda.so.1: cannot open shared object file: No such file or directory ==> most likely you didn't enable your GPU with --gpus all. Of course, you need to install nvidia-container-toolkit on the host before doing so:

docker run  -itd --name llm_serve -h limoncc \
--gpus all \
...

Thanks a lot, it helped me solve the issue.