Closed by Kas1o 1 month ago
I get the same error, but only when using Llama-3-DARE-8B.IQ3_M.gguf:
GGML_ASSERT: d:\a\koboldcpp\koboldcpp\ggml-cuda\dmmv.cu:804: false
It's my first time attempting to use one of these IQ GGUFs so I guess it's related to that?
I think I found the reason.
There are two algorithms here, dmmv and mmvq, and one of them is selected based on the CUDA device's compute capability (MMVQ for 6.1/Pascal/GTX 1000 or higher).
The key point is that dmmv does not support importance-matrix quants.
By the way, the selection is based on the oldest GPU in your system, even if you select a newer GPU on the startup screen.
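To make the selection rule concrete, here is a minimal sketch of how I understand it, not the actual koboldcpp code. The names `MatVecKernel`, `kMinCcForMmvq`, and `pick_kernel` are made up for illustration, and encoding compute capability as major*100 + minor*10 is an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical names; this only illustrates the rule described in this thread.
enum class MatVecKernel { DMMV, MMVQ };

// Assumed encoding: compute capability 6.1 -> 610, 8.9 -> 890, and so on.
constexpr int kMinCcForMmvq = 610;

// Chooses one kernel family for the whole system from the LOWEST compute
// capability among all visible devices, regardless of which GPU was selected.
inline MatVecKernel pick_kernel(const std::vector<int>& device_ccs) {
    assert(!device_ccs.empty() && "at least one CUDA device is required");
    const int min_cc = *std::min_element(device_ccs.begin(), device_ccs.end());
    return min_cc >= kMinCcForMmvq ? MatVecKernel::MMVQ : MatVecKernel::DMMV;
}
```

With an 8.9 card and a 5.2 card installed together, `pick_kernel({890, 520})` gives `MatVecKernel::DMMV`, which would explain hitting the dmmv path even with the newer GPU selected.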
The model I use: https://huggingface.co/dranger003/c4ai-command-r-plus-iMat.GGUF/blob/main/ggml-c4ai-command-r-plus-104b-iq1_s.gguf
Checking dmmv.cu around line 804, I noticed it doesn't contain a case for IQ1_S. Is it not supported?
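For anyone wondering why a missing case ends in an assert, this is roughly the pattern I assume is at work: a switch over the quant type whose default branch fails. The sketch below is not the real dmmv.cu. `QuantType`, `dmmv_supports`, and `dmmv_dispatch` are hypothetical names, and the list of supported types is only illustrative.

```cpp
#include <cassert>

// Hypothetical, shortened enum; the real project has many more types.
enum class QuantType { Q4_0, Q8_0, Q4_K, IQ1_S, IQ3_M };

// Returns true only for types that have an explicit case in the switch.
// The set of supported types here is illustrative, not taken from dmmv.cu.
inline bool dmmv_supports(QuantType t) {
    switch (t) {
        case QuantType::Q4_0:
        case QuantType::Q8_0:
        case QuantType::Q4_K:
            return true;
        default:
            // Types without a case (the IQ ones in this sketch) land here.
            return false;
    }
}

// Mirrors the failure mode in the log: an unsupported type trips an assert
// instead of being handled gracefully.
inline void dmmv_dispatch(QuantType t) {
    assert(dmmv_supports(t) && "dmmv: unsupported quant type");
    // ... the matching kernel would be launched here ...
}
```

If that is what happens, any quant type without its own case would produce the same `GGML_ASSERT ... false` message, which fits both the IQ1_S and the IQ3_M reports in this thread.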