Open vprelovac opened 1 year ago
Check a few things. First, do you actually have a CUDA-capable graphics card? If not, you won't be able to run the NVIDIA drivers at all.
Try running: sudo apt install nvidia-driver-535 nvidia-dkms-535
(535 is the current driver version at the time of writing.)
Follow this instruction set: Link
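To check the first point, something like the following should confirm whether a CUDA-capable card is present and the driver is working (standard tools, nothing specific to this repo):

```shell
# List NVIDIA devices on the PCI bus; no output means no NVIDIA GPU is visible.
lspci | grep -i nvidia

# If the driver is installed correctly, nvidia-smi prints the GPU model,
# driver version, and the highest CUDA version the driver supports.
nvidia-smi
```

If `nvidia-smi` fails while `lspci` shows the card, the driver install is the problem rather than Docker.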
@debarko I think you missed the first two sentences of my bug report. Nevertheless, thanks for answering; it seems this is not a common issue, so I will try to dig deeper.
@debarko I wasn't able to get anywhere. Sharing a bit more debugging info in case it helps. It is a freshly installed Ubuntu 22.04 system. I've been running all the other models without problems (oobabooga, mlc, etc.).
Just to confirm: I was able to run it on Windows 11 with CUDA, but it seems to use only some of the GPU while leaning heavily on the CPU. I had to use Git Bash to run the run.sh command.
CUDA version 12.2, driver version 537.13
7B model, 4090 @ 20 tokens/s
I think I found the solution. Most probably you don't have nvidia-docker installed.
sudo apt install -y nvidia-docker2
sudo systemctl daemon-reload
sudo systemctl restart docker
For Arch users, the package is called nvidia-docker
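After installing and restarting Docker, a quick sanity check is to run `nvidia-smi` inside a container with GPU access (the image tag below is just an example; any CUDA base image compatible with your driver should work):

```shell
# This should print the same table as running nvidia-smi on the host.
# If it fails with "could not select device driver", the NVIDIA container
# runtime is still not registered with Docker.
docker run --rm --gpus all nvidia/cuda:12.2.0-base-ubuntu22.04 nvidia-smi
```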
For yum/dnf:
curl -s -L https://nvidia.github.io/libnvidia-container/stable/rpm/nvidia-container-toolkit.repo | \
sudo tee /etc/yum.repos.d/nvidia-container-toolkit.repo
sudo yum install -y nvidia-container-toolkit
sudo systemctl daemon-reload
sudo systemctl restart docker
I am having problems running this with an Nvidia 4090. I have been running other models/setups (outside of this repo) on the GPU without problems.
sudo ./run.sh --model code-7b --with-cuda
[+] Running 1/0
 ✔ Container llama-gpt-llama-gpt-ui-1  Recreated  0.0s
Attaching to llama-gpt-llama-gpt-api-cuda-gguf-1, llama-gpt-llama-gpt-ui-1
llama-gpt-llama-gpt-ui-1 | [INFO wait] --------------------------------------------------------
llama-gpt-llama-gpt-ui-1 | [INFO wait]  docker-compose-wait 2.12.0
llama-gpt-llama-gpt-ui-1 | [INFO wait] ---------------------------
llama-gpt-llama-gpt-ui-1 | [DEBUG wait] Starting with configuration:
llama-gpt-llama-gpt-ui-1 | [DEBUG wait]  - Hosts to be waiting for: [llama-gpt-api-cuda-gguf:8000]
llama-gpt-llama-gpt-ui-1 | [DEBUG wait]  - Paths to be waiting for: []
llama-gpt-llama-gpt-ui-1 | [DEBUG wait]  - Timeout before failure: 3600 seconds
llama-gpt-llama-gpt-ui-1 | [DEBUG wait]  - TCP connection timeout before retry: 5 seconds
llama-gpt-llama-gpt-ui-1 | [DEBUG wait]  - Sleeping time before checking for hosts/paths availability: 0 seconds
llama-gpt-llama-gpt-ui-1 | [DEBUG wait]  - Sleeping time once all hosts/paths are available: 0 seconds
llama-gpt-llama-gpt-ui-1 | [DEBUG wait]  - Sleeping time between retries: 1 seconds
llama-gpt-llama-gpt-ui-1 | [DEBUG wait] --------------------------------------------------------
llama-gpt-llama-gpt-ui-1 | [INFO wait] Checking availability of host [llama-gpt-api-cuda-gguf:8000]
Error response from daemon: could not select device driver "nvidia" with capabilities: [[gpu]]
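The final line ("could not select device driver \"nvidia\" with capabilities: [[gpu]]") means Docker itself cannot find an NVIDIA runtime, independent of this repo. One way to confirm and fix that (assuming the NVIDIA Container Toolkit is installed; `nvidia-ctk` ships with it):

```shell
# If "nvidia" does not appear among the runtimes, Docker is not GPU-aware yet.
docker info 2>/dev/null | grep -i runtime

# A working setup typically has an "nvidia" entry under "runtimes" here.
cat /etc/docker/daemon.json

# Register the NVIDIA runtime with Docker and restart the daemon.
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker
```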