Closed warren-lei closed 2 months ago
Currently, it seems that the wrong Vulkan output may be caused by data type conversion issues. For the Q4 model (4-bit, ggml-model-q4_k.gguf), wrong output starts to appear when ngl is set to 11, and the more layers ngl offloads, the more errors occur. The F16 model gives correct answers with ngl set to 18, but errors begin to occur when ngl is set to 19.
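For anyone trying to reproduce the thresholds above, here is a minimal sketch of a bisection over ngl. It rests on assumptions that are not confirmed in this issue: that ngl 0 (CPU only) gives correct output, and that once output goes wrong at some ngl it stays wrong for every higher value. `first_bad_ngl` and `is_correct` are hypothetical names; `is_correct` stands in for whatever check you use, for example running llama-cli at a given ngl and comparing the answer against a CPU-only reference.

```python
from typing import Callable, Optional


def first_bad_ngl(is_correct: Callable[[int], bool], max_layers: int) -> Optional[int]:
    """Return the smallest ngl in 1..max_layers that gives wrong output.

    Assumptions (not confirmed by the issue):
      * ngl = 0 (CPU only) is correct.
      * Failures are monotonic: if ngl = k is wrong, every ngl > k is wrong.
    Returns None if even max_layers gives correct output.
    """
    if is_correct(max_layers):
        return None

    good, bad = 0, max_layers  # good: known correct, bad: known wrong
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_correct(mid):
            good = mid
        else:
            bad = mid
    return bad


if __name__ == "__main__":
    # Stand-in checks that mimic the thresholds reported in this issue.
    def q4_check(ngl: int) -> bool:
        return ngl < 11

    def f16_check(ngl: int) -> bool:
        return ngl < 19

    print(first_bad_ngl(q4_check, 33))
    print(first_bad_ngl(f16_check, 33))
```

With 33 layers this needs about six runs per model instead of one run per layer. If the failures turn out not to be monotonic, a plain linear sweep over ngl is the safer choice.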
Resolved. Close the issue.
What happened?
Compilation command:
cmake .. -DGGML_VULKAN=ON -DCMAKE_BUILD_TYPE=Release
Run command:
./bin/llama-cli -m ggml-model-q4_k.gguf -c 512 -b 1024 -n 256 --keep 48 --repeat_penalty 1.0 --color -i -r "User:" -f ../prompts/chat-with-bob.txt -ngl 33
Result:
Name and Version
System version:
Vulkan version:
libvulkan.so.1.3.289. Only the Vulkan loader was compiled.
What operating system are you seeing the problem on?
No response
Relevant log output