itinance opened this issue 3 weeks ago
For the sake of completeness: if someone is looking for a llama.cpp binding that works with CUDA support, regardless of the underlying programming language, I can recommend the Node.js bindings: https://github.com/withcatai/node-llama-cpp
Installation was blazing fast and easy. It just works.
Prerequisites
Please answer the following questions for yourself before submitting an issue.
Expected Behavior
Although llama.cpp is installed perfectly and uses my GPU (RTX 4000), the Python bindings don't use the GPU. They stick with the CPU. I also had no impression that any CUDA-related libraries were used during the installation of this package.
1. GPU not used at all: llama-cpp-python should utilize the GPU, but it doesn't.
2. The installation does not appear to install any CUDA-related libraries.
Please provide a detailed written description of what you were trying to do, and what you expected `llama-cpp-python` to do.

Installation:
CMAKE_ARGS="-DGGML_CUDA=on" pip install llama-cpp-python
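One common pitfall with this command (noted in the llama-cpp-python README) is that pip may reuse a previously built CPU-only wheel from its cache, in which case `CMAKE_ARGS` has no effect. Forcing a clean rebuild rules that out:

```shell
# Force a fresh source build so CMAKE_ARGS is actually applied,
# instead of pip silently reusing a cached CPU-only wheel:
CMAKE_ARGS="-DGGML_CUDA=on" pip install llama-cpp-python --upgrade --force-reinstall --no-cache-dir
```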
Code:
According to the logfiles, only the CPU is being used. According to `gpustat`, the GPU is not used at all and stays at 0%.

Running llama.cpp on the same machine makes heavy use of CUDA and the GPU with the appropriate settings, both executed directly on the host and via a Docker container. Running ollama also uses my GPU without any issues. My GPU is an RTX 4000.
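For triage, a quick way to tell from the logs whether any layers were offloaded: when CUDA is active, llama.cpp's model loader prints a line like `llm_load_tensors: offloaded 33/33 layers to GPU`. A small stdlib helper can scan a captured log for it (a sketch; the exact log line comes from llama.cpp's loader and may vary across versions):

```python
import re

def gpu_layers_offloaded(log_text: str) -> int:
    """Return the number of layers the loader reports offloaded to GPU,
    or 0 if no such line appears (i.e. a CPU-only run)."""
    m = re.search(r"offloaded (\d+)/\d+ layers to GPU", log_text)
    return int(m.group(1)) if m else 0

# Example log excerpt (format assumed from llama.cpp's model loader):
sample = "llm_load_tensors: offloaded 33/33 layers to GPU"
print(gpu_layers_offloaded(sample))  # → 33
```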
Current Behavior
Please provide a detailed written description of what `llama-cpp-python` did instead.

It uses only the CPU.
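One more data point that may help: the installed build can be asked directly whether GPU offload was compiled in, assuming the low-level binding `llama_supports_gpu_offload` is exposed (it mirrors the llama.cpp C API function of the same name in recent llama-cpp-python releases):

```python
def gpu_offload_supported():
    """Return True/False as reported by the installed llama-cpp-python build,
    or None if the package (or this particular binding) is unavailable."""
    try:
        import llama_cpp
        return bool(llama_cpp.llama_supports_gpu_offload())
    except (ImportError, AttributeError):
        return None

print("GPU offload supported:", gpu_offload_supported())
```

If this prints `False` for a build installed with `-DGGML_CUDA=on`, the CUDA backend was never compiled in, which points at the build step rather than the runtime.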
Environment and Context
Please provide detailed information about your computer setup. This is important in case the issue is not reproducible except for under certain specific conditions.
$ lscpu
$ uname -a
Linux Ubuntu-2204-jammy-amd64-base 5.15.0-118-generic #128-Ubuntu SMP Fri Jul 5 09:28:59 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
$ python3 --version
Python 3.10.12
$ make --version
GNU Make 4.3
$ g++ --version
g++ (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0
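Since the build output gave no hint of CUDA, it may also be worth checking whether the CUDA libraries are visible to the dynamic linker at all (a stdlib sketch; the library names assume a Linux setup like the one above):

```python
from ctypes.util import find_library

def probe_libs(names=("cuda", "cudart", "cublas")):
    """Map each library name to the path find_library resolves, or None
    when the dynamic linker cannot locate it."""
    return {name: find_library(name) for name in names}

for name, path in probe_libs().items():
    print(f"lib{name}: {path or 'not found'}")
```

If `libcuda` resolves but `libcudart`/`libcublas` do not, the driver is present but the CUDA toolkit runtime the build needs may be missing from the linker path.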
Failure Information (for bugs)
Please help provide information about the failure if this is a bug. If it is not a bug, please remove the rest of this template.
Steps to Reproduce
Please provide detailed steps for reproducing the issue. We are not sitting in front of your screen, so the more detail the better.

Already written above.
Note: Many issues seem to be regarding functional or performance issues / differences with `llama.cpp`. In these cases we need to confirm that you're comparing against the version of `llama.cpp` that was built with your Python package, and which parameters you're passing to the context.

Try the following:

1. `git clone https://github.com/abetlen/llama-cpp-python`
2. `cd llama-cpp-python`
3. `rm -rf _skbuild/` # delete any old builds
4. `python -m pip install .`
5. `cd ./vendor/llama.cpp`
6. Follow llama.cpp's instructions to `cmake` llama.cpp
7. Run llama.cpp's `./main` with the same arguments you previously passed to llama-cpp-python and see if you can reproduce the issue. If you can, log an issue with llama.cpp.

Failure Logs
The test installation from the point above ("try the following") gives this:
Example environment info: