Closed: shneeba closed this issue 6 months ago
@shneeba - thrilled that it is up and running (and FAST!) ... I had to use the exact same line to install torch - `pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121` - will add to the README ... (BTW, if you know a better way to validate that the local Python interpreter is connected to the CUDA drivers, we can switch from the pytorch check - but that was the best way that I could find ...)
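For reference, a minimal sketch of the kind of pytorch-based check being discussed - the function name is illustrative and this is not necessarily the repo's actual implementation:

```python
import torch

def cuda_check() -> bool:
    """Return True if this interpreter's torch build can see a CUDA device."""
    if not torch.cuda.is_available():
        # False either when the NVIDIA driver is missing or when torch is a
        # CPU-only build - the latter was the culprit in this issue.
        return False
    # Reached only with a CUDA-enabled torch build and a working driver.
    print(f"torch {torch.__version__}, CUDA {torch.version.cuda}")
    print(f"device: {torch.cuda.get_device_name(0)}")
    return True
```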
Thanks for getting that added in! I think using the pytorch check for now is a good enough validation but I'll have a think. I'll close this issue now.
As discussed in this issue, it appears the GGUF models are not utilising the GPU.
Environment setup:

GPU:
- RTX 3090 Ti
- Driver Version (Nvidia Control Panel): 551.76
- Driver Version (Device Manager): 31.0.15.5176
- CUDA version:

OS:
- Windows 10 Pro
- Version: 22H2
- Build version: 19045.4046
- Windows Feature Experience Pack: 1000.19053.1000.0

Python: 3.11.0
I was actually looking through the "improving gguf cuda exception handling" pull request and noticed the mention of using `nvidia-smi` to get the version info. I nearly went down a rabbit hole thinking I had some old 12.4 versions lying around, but that number seems to just be the highest CUDA version the driver supports.

I then noticed that `torch.cuda.is_available()` returned `False` - this was the smoking gun I needed. I had to run the following command (got from here):

`pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121`
As soon as I did this I was full steam ahead:
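A minimal check along these lines confirms the CUDA wheel is the one being picked up (standard torch calls; the exact output will depend on the setup):

```python
import torch

# After the reinstall, torch should report a CUDA build rather than CPU-only:
print(torch.__version__)          # typically something like "2.x.x+cu121"
print(torch.cuda.is_available())  # now True
```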
You weren't lying about that speed difference, it's seriously quick!
It may be worth having something in the README about installing `torch` via this method for other Windows users. Thanks again for your help @doberst. Very much appreciated!
I'm happy for you to close this issue straight away (user error after all) but thought you may like to know the root cause.