ggerganov / llama.cpp

LLM inference in C/C++
MIT License

GPU is not working well. #6736

Closed. lucky0604 closed this issue 4 months ago.

lucky0604 commented 4 months ago

Please include information about your system, the steps to reproduce the bug, and the version of llama.cpp that you are using. If possible, please provide a minimal code example that reproduces the bug.

Expected Behavior

I am following the README as closely as possible, since the build process seems very specific. The expected behavior is that compilation completes on Windows with CUDA GPU support enabled.

Current Behavior

Build the project with the following commands:

git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
mkdir build
cd build
cmake .. -DLLAMA_CUDA=ON
# everything goes well, except for a warning:
# ...
# CMake Warning:
#  Manually-specified variables were not used by the project:

#    LLAMA_CUDA
# ...
cmake --build . --config Release
# no errors, but some Unicode warnings such as [warning C4819]
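
Note that the CMake warning is the key symptom here: "Manually-specified variables were not used by the project" means the LLAMA_CUDA option never reached the build, so CMake silently produced a CPU-only configuration. One way to sanity-check this (a sketch; cache entry names vary by version, and checkouts from before early 2024 used -DLLAMA_CUBLAS=ON instead of -DLLAMA_CUDA=ON):

# start from a clean tree so a stale CPU-only cache doesn't mask the option
rmdir /s /q build
mkdir build
cd build
cmake .. -DLLAMA_CUDA=ON
# search the generated cache for CUDA entries; a successful configure should
# produce entries such as CMAKE_CUDA_COMPILER
findstr /i cuda CMakeCache.txt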

Then I go into the build/bin/Release directory and run:

main.exe -m D:\models\gemma-2b-it\gemma-2b-it.gguf --prompt "hello" -ngl 40

The GPU is not being used. (screenshot omitted)
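
For comparison, a CUDA-enabled build normally announces the GPU during model load. The exact wording differs between llama.cpp versions, so the lines below are illustrative rather than verbatim:

ggml_cuda_init: found 1 CUDA devices:
  Device 0: NVIDIA GeForce RTX 4060, compute capability 8.9
llm_load_tensors: offloaded 19/19 layers to GPU

If no CUDA device line appears and zero layers are offloaded, the binary was built CPU-only, which matches the CMake warning above.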

When I run test-backend-ops.exe, the result is:

Testing 1 backends

Backend 1/1 (CPU)
  Skipping CPU backend
1/1 backends passed
OK
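
That output confirms the diagnosis: test-backend-ops enumerates the backends compiled into the binary, and only the CPU backend is present, so CUDA support never made it into the build. As an independent check (using NVIDIA's standard driver utility, assuming a recent driver is installed), you can watch GPU activity while main.exe is generating:

# in a second terminal while main.exe is running
nvidia-smi
# with a working CUDA build, main.exe should appear in the process
# list with nonzero GPU memory usage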

Environment and Context

GPU: 4060
OS: Windows 11
CUDA: 12.1
cuDNN: 9.0.0

Jeximo commented 4 months ago

CMake Warning: Manually-specified variables were not used by the project: LLAMA_CUDA

For whatever reason, CMake didn't detect CUDA during configuration. Maybe update the CUDA Toolkit and try again.
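
If the toolkit is installed but CMake still cannot locate it, explicitly pointing CMake at nvcc sometimes helps. A minimal sketch; CMAKE_CUDA_COMPILER is a standard CMake variable (not llama.cpp-specific), and the path below is the default CUDA 12.1 install location, which is an assumption about your setup:

# tell CMake exactly which nvcc to use (path is the CUDA 12.1 default, adjust as needed)
cmake .. -DLLAMA_CUDA=ON ^
  -DCMAKE_CUDA_COMPILER="C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v12.1/bin/nvcc.exe"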

lucky0604 commented 4 months ago

Fixed. When rebuilding with CMake, an error was thrown indicating that a CUDA package doesn't support the Visual Studio 2022 Community version. Switching to Visual Studio 2019 Professional and CUDA Toolkit 12.2 resolved the issue.
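
For anyone hitting the same mismatch: each CUDA Toolkit release only supports specific MSVC/Visual Studio versions, so pinning CMake to the working generator avoids it picking up the unsupported one. A minimal sketch, assuming VS 2019 and CUDA Toolkit 12.2 are installed as described above:

# force the VS 2019 generator instead of letting CMake pick the newest VS
cmake .. -G "Visual Studio 16 2019" -A x64 -DLLAMA_CUDA=ON
cmake --build . --config Release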