Error running with CUDA

vprelovac commented 1 year ago

I am having problems running this with an Nvidia 4090. I have been running other models/setups (outside of this repo) on the GPU without problems.

sudo ./run.sh --model code-7b --with-cuda

[+] Running 1/0
 ✔ Container llama-gpt-llama-gpt-ui-1  Recreated  0.0s
Attaching to llama-gpt-llama-gpt-api-cuda-gguf-1, llama-gpt-llama-gpt-ui-1
llama-gpt-llama-gpt-ui-1 | [INFO wait] --------------------------------------------------------
llama-gpt-llama-gpt-ui-1 | [INFO wait] docker-compose-wait 2.12.0
llama-gpt-llama-gpt-ui-1 | [INFO wait] ---------------------------
llama-gpt-llama-gpt-ui-1 | [DEBUG wait] Starting with configuration:
llama-gpt-llama-gpt-ui-1 | [DEBUG wait] - Hosts to be waiting for: [llama-gpt-api-cuda-gguf:8000]
llama-gpt-llama-gpt-ui-1 | [DEBUG wait] - Paths to be waiting for: []
llama-gpt-llama-gpt-ui-1 | [DEBUG wait] - Timeout before failure: 3600 seconds
llama-gpt-llama-gpt-ui-1 | [DEBUG wait] - TCP connection timeout before retry: 5 seconds
llama-gpt-llama-gpt-ui-1 | [DEBUG wait] - Sleeping time before checking for hosts/paths availability: 0 seconds
llama-gpt-llama-gpt-ui-1 | [DEBUG wait] - Sleeping time once all hosts/paths are available: 0 seconds
llama-gpt-llama-gpt-ui-1 | [DEBUG wait] - Sleeping time between retries: 1 seconds
llama-gpt-llama-gpt-ui-1 | [DEBUG wait] --------------------------------------------------------
llama-gpt-llama-gpt-ui-1 | [INFO wait] Checking availability of host [llama-gpt-api-cuda-gguf:8000]
Error response from daemon: could not select device driver "nvidia" with capabilities: [[gpu]]

debarko commented 1 year ago

Check a few things first. Do you have a CUDA-capable graphics card? If not, you will most likely not be able to run the Nvidia drivers.

Try running: sudo apt install nvidia-driver-535 nvidia-dkms-535 (535 is the current driver version number)

Follow this instruction set: Link

vprelovac commented 1 year ago

@debarko I think you missed the first two sentences of my bug report. Nevertheless, thanks for answering. It seems this is not a common issue, so I will try to dig deeper.

vprelovac commented 1 year ago

@debarko I wasn't able to do anything. Sharing a bit more debugging info in case it helps: it is a newly installed Ubuntu 22.04 system. I've been running all the other models with no problem (oobabooga, mlc, etc.).

enochlev commented 1 year ago

Just to confirm, I was able to run it on Windows 11 with CUDA, but it seems to use only some GPU power and to rely heavily on the CPU. I had to use Git Bash to run the run.sh command.

CUDA version 12.2, driver version 537.13

7B model, 4090 @ 20 tokens/s

GreatNewHope commented 1 year ago

I think I found the solution. Most probably you don't have nvidia-docker installed.

sudo apt install -y nvidia-docker2
sudo systemctl daemon-reload
sudo systemctl restart docker

For Arch users, the package is called nvidia-docker

Found here
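Before installing anything, it may help to see which pieces are actually missing on the host. The following is only a minimal sketch: the helper names `check_cmd` and `check_host` are made up for illustration, and the choice of binaries is an assumption (`nvidia-smi` comes with the host driver, `nvidia-container-cli` comes with the NVIDIA container packages).

```shell
#!/bin/sh
# Hypothetical helper: report whether a command is on PATH.
# Prints "ok <name>" or "missing <name>" and returns non-zero if missing.
check_cmd() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "ok $1"
    else
        echo "missing $1"
        return 1
    fi
}

# Assumption: these are the binaries that matter for the
# 'could not select device driver "nvidia"' error.
check_host() {
    check_cmd nvidia-smi            # host driver
    check_cmd nvidia-container-cli  # NVIDIA container packages
    check_cmd docker                # Docker CLI
}
```

Calling `check_host` prints one line per binary. If `nvidia-smi` is reported as ok but `nvidia-container-cli` is missing, that would be consistent with a working driver and a missing container package, which is the situation described above.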

msameeh commented 1 year ago

For yum/dnf:

curl -s -L https://nvidia.github.io/libnvidia-container/stable/rpm/nvidia-container-toolkit.repo | \
  sudo tee /etc/yum.repos.d/nvidia-container-toolkit.repo
sudo yum install -y nvidia-container-toolkit
sudo systemctl daemon-reload
sudo systemctl restart docker

Link
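After the install and the Docker restart, a quick smoke test can confirm that containers see the GPU. This is a sketch under assumptions: the function name `gpu_smoke_test` and the `DOCKER` override variable are made up for illustration, and the `ubuntu` image is just a small image to run `nvidia-smi` in.

```shell
#!/bin/sh
# DOCKER can be overridden (for example DOCKER="sudo docker");
# it defaults to plain docker. This variable is an illustrative assumption.
DOCKER="${DOCKER:-docker}"

# Hypothetical helper: request all GPUs for a throwaway container and
# run nvidia-smi inside it. --gpus all asks Docker for the same "gpu"
# capability that the error message in this issue complains about.
gpu_smoke_test() {
    if $DOCKER run --rm --gpus all ubuntu nvidia-smi; then
        echo "GPU visible inside containers"
    else
        echo "GPU still not visible; check the toolkit install and restart docker"
        return 1
    fi
}
```

Run `gpu_smoke_test` once after restarting Docker. If it succeeds, `./run.sh --model code-7b --with-cuda` should no longer stop at the device driver error, although that is an expectation rather than something verified here.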

utrucceh commented 9 months ago

Latest version: https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html

getumbrel / llama-gpt

Error running with CUDA #83