ggerganov / llama.cpp

LLM inference in C/C++
MIT License

GPU is not working well. #6736

Closed. lucky0604 closed this issue 4 months ago.

lucky0604 commented 4 months ago

Please include information about your system, the steps to reproduce the bug, and the version of llama.cpp that you are using. If possible, please provide a minimal code example that reproduces the bug.

Expected Behavior

I am following the README as closely as possible, since the build process seems very specific. The expected behavior is that compilation completes on Windows with CUDA GPU support enabled.

Current Behavior

Build the project with the following commands:

git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
mkdir build
cd build
cmake .. -DLLAMA_CUDA=ON
# everything goes well, except for a warning:
# ...
# CMake Warning:
#  Manually-specified variables were not used by the project:

#    LLAMA_CUDA
# ...
cmake --build . --config Release
# no errors, but some Unicode warnings such as [warning C4819]
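
Note that the CMake warning is the key symptom here: "Manually-specified variables were not used by the project" means the LLAMA_CUDA option never reached the build, so CMake silently produced a CPU-only configuration. One way to sanity-check this (a sketch; cache entry names vary by version, and checkouts from before early 2024 used -DLLAMA_CUBLAS=ON instead of -DLLAMA_CUDA=ON):

# start from a clean tree so a stale CPU-only cache doesn't mask the option
rmdir /s /q build
mkdir build
cd build
cmake .. -DLLAMA_CUDA=ON
# search the generated cache for CUDA entries; a successful configure should
# produce entries such as CMAKE_CUDA_COMPILER
findstr /i cuda CMakeCache.txt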

Then I go into the build/bin/Release directory and run:

main.exe -m D:\models\gemma-2b-it\gemma-2b-it.gguf --prompt "hello" -ngl 40

The GPU is not being used. (screenshot omitted)
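
For comparison, a CUDA-enabled build normally announces the GPU during model load. The exact wording differs between llama.cpp versions, so the lines below are illustrative rather than verbatim:

ggml_cuda_init: found 1 CUDA devices:
  Device 0: NVIDIA GeForce RTX 4060, compute capability 8.9
llm_load_tensors: offloaded 19/19 layers to GPU

If no CUDA device line appears and zero layers are offloaded, the binary was built CPU-only, which matches the CMake warning above.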

When I run test-backend-ops.exe, the result is:

Testing 1 backends

Backend 1/1 (CPU)
  Skipping CPU backend
1/1 backends passed
OK
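
That output confirms the diagnosis: test-backend-ops enumerates the backends compiled into the binary, and only the CPU backend is present, so CUDA support never made it into the build. As an independent check (using NVIDIA's standard driver utility, assuming a recent driver is installed), you can watch GPU activity while main.exe is generating:

# in a second terminal while main.exe is running
nvidia-smi
# with a working CUDA build, main.exe should appear in the process
# list with nonzero GPU memory usage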

Environment and Context

GPU: 4060
OS: Windows 11
CUDA: 12.1
cuDNN: 9.0.0

Jeximo commented 4 months ago

CMake Warning: Manually-specified variables were not used by the project: LLAMA_CUDA

For whatever reason, CMake didn't detect CUDA during configuration. Maybe update the CUDA Toolkit and try again.
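
If the toolkit is installed but CMake still cannot locate it, explicitly pointing CMake at nvcc sometimes helps. A minimal sketch; CMAKE_CUDA_COMPILER is a standard CMake variable (not llama.cpp-specific), and the path below is the default CUDA 12.1 install location, which is an assumption about your setup:

# tell CMake exactly which nvcc to use (path is the CUDA 12.1 default, adjust as needed)
cmake .. -DLLAMA_CUDA=ON ^
  -DCMAKE_CUDA_COMPILER="C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v12.1/bin/nvcc.exe"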

lucky0604 commented 4 months ago

Fixed. When rebuilding with CMake, an error was thrown indicating that a CUDA package doesn't support the Visual Studio 2022 Community version. Switching to Visual Studio 2019 Professional and CUDA Toolkit 12.2 resolved the issue.
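
For anyone hitting the same mismatch: each CUDA Toolkit release only supports specific MSVC/Visual Studio versions, so pinning CMake to the working generator avoids it picking up the unsupported one. A minimal sketch, assuming VS 2019 and CUDA Toolkit 12.2 are installed as described above:

# force the VS 2019 generator instead of letting CMake pick the newest VS
cmake .. -G "Visual Studio 16 2019" -A x64 -DLLAMA_CUDA=ON
cmake --build . --config Release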