JeremyBickel opened 8 months ago
Same issue here. Windows 11, CUDA 11.8/12.3, Python 3.12/3.11, model llama-2-13b-chat.Q8_0.gguf, same output.
Update: Got it fixed. It turns out that my CPU does not support AVX2, so I cloned the repo, edited the CMake config to use only AVX, and installed it that way. After that the model runs. Install CMake and take a look at the guidance branch; the installation guide shows you how to do it.
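For reference, a rebuild with AVX2 disabled can also be done without editing the cloned repo, by passing CMake flags through pip. A sketch in PowerShell; the exact flag names (`LLAMA_AVX2` vs. `GGML_AVX2`) vary between llama.cpp versions, so check the `CMakeLists.txt` of the release you are building:

```powershell
# Flag names are an assumption -- older releases use LLAMA_AVX2, newer ones GGML_AVX2.
$env:CMAKE_ARGS = "-DLLAMA_AVX2=OFF -DLLAMA_AVX=ON"
$env:FORCE_CMAKE = "1"
pip install llama-cpp-python --no-cache-dir --force-reinstall
```

`--no-cache-dir --force-reinstall` matters here: without it, pip may reuse a previously built wheel that was still compiled with AVX2 enabled.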
P.S. I also ran into Cublas Error: 13. It turned out to be related to having multiple GPUs: you have to specify which GPU to use, even though the program prints that one has been selected. To do so, run this in PowerShell:

$env:CUDA_VISIBLE_DEVICES=1

This selects the GPU with index 1 (the second GPU, since indices are zero-based).
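The same selection can be made from inside Python instead of PowerShell, as long as the variable is set before any CUDA-using library is imported. A minimal sketch (the llama_cpp usage is commented out and assumed, not tested here):

```python
import os

# Must be set BEFORE importing any CUDA-using library (e.g. llama_cpp),
# because CUDA device enumeration happens at initialization time.
os.environ["CUDA_VISIBLE_DEVICES"] = "1"  # expose only the second GPU (zero-based)

# Assumed usage: the library now sees a single GPU and reports it as device 0.
# from llama_cpp import Llama
# llm = Llama(model_path="llama-2-13b-chat.Q8_0.gguf", n_gpu_layers=-1)
print(os.environ["CUDA_VISIBLE_DEVICES"])
```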
CUDA is working:

```
(ct) C:\Users\Jeremy\Documents>python
Python 3.11.6 (tags/v3.11.6:8b6ee5b, Oct 2 2023, 14:57:12) [MSC v.1935 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
```
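The session above only shows the interpreter starting; a common way to actually confirm CUDA is visible from Python is via PyTorch. A sketch, assuming PyTorch is available (the `cuda_check` helper is just for illustration):

```python
import importlib.util

def cuda_check() -> str:
    """Report CUDA availability via PyTorch, if it is installed."""
    # Guard the import so the check degrades gracefully without torch.
    if importlib.util.find_spec("torch") is None:
        return "torch not installed; skipping CUDA check"
    import torch
    return f"CUDA available: {torch.cuda.is_available()}"

print(cuda_check())
```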