abetlen / llama-cpp-python

Python bindings for llama.cpp
https://llama-cpp-python.readthedocs.io
MIT License

CUDA/Windows instructions do not work; there is no setup.py #965

Open andymartin opened 10 months ago

andymartin commented 10 months ago

Prerequisites

Please answer the following questions for yourself before submitting an issue.

Expected Behavior

I expect CUDA support to work, or not to be claimed to work.

Current Behavior

CUDA does not work.

Environment and Context

Win11 x64, followed all instructions. Still runs 100% CPU.


Failure Information (for bugs)

Your instructions are wrong.

Steps to Reproduce

There is no setup.py.

Try the following:

  1. git clone https://github.com/abetlen/llama-cpp-python
  2. cd llama-cpp-python
  3. rm -rf _skbuild/ # delete any old builds
  4. python setup.py develop
  5. cd ./vendor/llama.cpp
  6. Follow llama.cpp's instructions to cmake llama.cpp
  7. Run llama.cpp's ./main with the same arguments you previously passed to llama-cpp-python and see if you can reproduce the issue. If you can, log an issue with llama.cpp

Failure Logs

There is no setup.py. Nothing else matters because your instructions don't even match the reality of which files exist.

tk-master commented 10 months ago

Ok, first: have you actually tried building the project with the commands suggested in the llama-cpp-python README? It looks like you're trying to build the vendored llama.cpp directly.

Try the following commands on Windows (cmd):

git clone --recurse-submodules https://github.com/abetlen/llama-cpp-python.git llama-cpp-python
cd llama-cpp-python

set FORCE_CMAKE=1 && set CMAKE_ARGS=-DLLAMA_CUBLAS=on
python -m pip install -e . --force-reinstall --no-cache-dir

Note: this requires the CUDA Toolkit installed and Visual Studio with CMake.
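For reference, the same build flags on a POSIX shell (Linux/macOS or Git Bash) would look roughly like this; the variable names come from the README, and the echo line is just there to confirm they were set:

```shell
# Same CUDA build flags as the cmd commands above, POSIX-shell style.
export FORCE_CMAKE=1
export CMAKE_ARGS="-DLLAMA_CUBLAS=on"
echo "FORCE_CMAKE=$FORCE_CMAKE CMAKE_ARGS=$CMAKE_ARGS"
# then rebuild from the repo root:
# python -m pip install -e . --force-reinstall --no-cache-dir
```

One cmd-specific pitfall worth knowing: `set X=1 && set Y=...` can include a trailing space in the value of X, so setting each variable on its own line is safer.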

Secondly, what does your script look like? You need to set the n_gpu_layers parameter to use the GPU.
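A minimal sketch of what that looks like in a script (assumptions: llama-cpp-python's Llama class and its n_gpu_layers parameter; the model path below is a placeholder, and the import/file guards are just so the sketch degrades gracefully):

```python
import os

# Placeholder path: point this at a real GGUF model file on your machine.
MODEL_PATH = "./models/your-model.gguf"

try:
    from llama_cpp import Llama
except ImportError:
    Llama = None  # llama-cpp-python is not installed in this environment

if Llama is not None and os.path.exists(MODEL_PATH):
    llm = Llama(
        model_path=MODEL_PATH,
        n_gpu_layers=-1,  # -1 offloads every layer; the default of 0 runs CPU-only
        verbose=True,     # the startup log should mention CUDA/cuBLAS if the build worked
    )
    out = llm("Q: What is 2+2? A:", max_tokens=8)
    print(out["choices"][0]["text"])
else:
    print("skipped: llama_cpp or model file unavailable")
```

If n_gpu_layers stays at its default of 0, inference runs entirely on the CPU even when the wheel was built with CUDA, which matches the symptom described in the issue.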