Atinoda / text-generation-webui-docker

Docker variants of oobabooga's text-generation-webui, including pre-built images.
GNU Affero General Public License v3.0

Error response from daemon: could not select device driver "nvidia" with capabilities: #29

Closed dewijones92 closed 8 months ago

dewijones92 commented 1 year ago

No luck for me when trying to use this. Am I missing something? Thanks.

(base) dewi@DewiJones:~/code/text-generation-webui-docker/text-generation-webui-docker$ gs
++ pwd
+ current_dir=/home/dewi/code/text-generation-webui-docker/text-generation-webui-docker
+ [[ /home/dewi/code/text-generation-webui-docker/text-generation-webui-docker == \/\m\n\t\/\c* ]]
+ /usr/bin/git status -v -v
On branch master
Your branch is up to date with 'origin/master'.

Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git restore <file>..." to discard changes in working directory)
        modified:   docker-compose.yml

--------------------------------------------------
Changes not staged for commit:
diff --git i/docker-compose.yml w/docker-compose.yml
index d1caff0..33dda5e 100644
--- i/docker-compose.yml
+++ w/docker-compose.yml
@@ -1,7 +1,7 @@
 version: "3"
 services:
   text-generation-webui-docker:
-    image: atinoda/text-generation-webui:default # Specify variant as the :tag
+    image: atinoda/text-generation-webui:llama-cpu # Specify variant as the :tag
     container_name: text-generation-webui
     environment:
       - EXTRA_LAUNCH_ARGS="--listen --verbose" # Custom launch args (e.g., --model MODEL_NAME)
no changes added to commit (use "git add" and/or "git commit -a")
git status
commit d4b58daffec5096e2a7057388420e74987537766 (HEAD -> master, origin/master, origin/HEAD)
Author: Atinoda <61033436+Atinoda@users.noreply.github.com>
Date:   Wed Oct 18 15:49:48 2023 +0100

    Separate nightly builds
(base) dewi@DewiJones:~/code/text-generation-webui-docker/text-generation-webui-docker$ docker compose up
Attaching to text-generation-webui
Error response from daemon: could not select device driver "nvidia" with capabilities: [[gpu]]
(base) dewi@DewiJones:~/code/text-generation-webui-docker/text-generation-webui-docker$

Atinoda commented 1 year ago

Are you able to run other Docker images that require CUDA? The error message suggests that Docker cannot access the GPU hardware.

Atinoda commented 1 year ago

I just noticed that you are trying to run the llama-cpu variant. Please see #9 and #16 for relevant information. I will leave this open as a reminder for me to update the documentation with expanded instructions for CPU inference.

TL;DR: Comment out the deploy: block in docker-compose.yml.
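
For anyone who wants to script that change, here is a minimal sketch of commenting out a `deploy:` block by indentation. The sample Compose text is an assumption based on the usual Compose GPU reservation syntax, not a copy of this repository's file, and `comment_out_block` is a hypothetical helper name.

```python
def comment_out_block(yaml_text: str, key: str = "deploy:") -> str:
    """Comment out the line starting with `key` and every deeper-indented line below it."""
    out = []
    block_indent = None  # indentation of the block currently being commented out
    for line in yaml_text.splitlines():
        stripped = line.lstrip()
        indent = len(line) - len(stripped)
        # A non-blank line at the same or shallower indentation ends the block.
        if block_indent is not None and stripped and indent <= block_indent:
            block_indent = None
        if block_indent is None and stripped.startswith(key):
            block_indent = indent
        if block_indent is not None and stripped:
            # Insert "# " at the block's own indentation so nesting stays readable.
            out.append(line[:block_indent] + "# " + line[block_indent:])
        else:
            out.append(line)
    return "\n".join(out) + ("\n" if yaml_text.endswith("\n") else "")


# Illustrative Compose fragment (assumed layout, see the note above).
SAMPLE = """\
services:
  text-generation-webui-docker:
    image: atinoda/text-generation-webui:llama-cpu
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              capabilities: [gpu]
    container_name: text-generation-webui
"""

result = comment_out_block(SAMPLE)

if __name__ == "__main__":
    print(result)
```

With the GPU reservation commented out, Docker no longer asks for the "nvidia" device driver, which is what the CPU-only variant needs. Editing the file by hand works just as well.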

shaiksuhel1999 commented 11 months ago

Hi, I'm getting the same error when I try to run the Docker container with the command below:

docker run --gpus all image-id

docker: Error response from daemon: could not select device driver "" with capabilities: [[gpu]].

I created the VM using the default Amazon AMI, which is verified by Amazon.

These are the AMI details:

GPU (Kernel 4.14)
AMI name: amzn2-ami-ecs-gpu-hvm-2.0.20231103-x86_64-ebs
ECS Agent version: 1.79.0
Docker version: 20.10.25
Containerd version: 1.6.19
NVIDIA driver version: 535.54.03
CUDA version: 12.2.0
Source AMI name: amzn2-ami-minimal-hvm-2.0.20230926.0-x86_64-ebs

I used the commands below, taken from the AWS documentation, to remove the old NVIDIA driver (535.54.03) and install the new one (535.129.03):

sudo yum remove nvidia
sudo yum remove cuda
sudo yum erase nvidia cuda
sudo yum update -y
sudo amazon-linux-extras install kernel-5.15
sudo yum install gcc make && sudo yum update -y
sudo reboot
sudo yum install -y gcc kernel-devel-$(uname -r)
chmod +x NVIDIA-Linux-x86_64.run
sudo CC=/usr/bin/gcc10-cc ./NVIDIA-Linux-x86_64.run
sudo touch /etc/modprobe.d/nvidia.conf
echo "options nvidia NVreg_EnableGpuFirmware=0" | sudo tee --append /etc/modprobe.d/nvidia.conf
sudo reboot

After running these commands, I was able to upgrade the NVIDIA driver to 535.129.03 and the kernel to 5.15, but I still get the error above when I run the Docker container.

Any suggestions?

FKouhai commented 11 months ago

@shaiksuhel1999 You need to install nvidia-ctk, plus nvidia-container-runtime if the first package doesn't come with it. Then add the following to your Docker daemon.json:

{
    "runtimes": {
        "nvidia": {
            "path": "/usr/bin/nvidia-container-runtime",
            "runtimeArgs": []
        }
    }
}
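
If daemon.json already contains other settings, the runtime entry should be merged in rather than pasted over the file. The sketch below shows one way to do that merge on the JSON text. `add_nvidia_runtime` is a hypothetical helper name, and the `log-driver` setting in the usage lines is only an example of a pre-existing key. The file is commonly at /etc/docker/daemon.json on Linux, and the Docker daemon has to be restarted for the change to take effect.

```python
import json

# Runtime entry exactly as given in the comment above.
NVIDIA_RUNTIME = {"path": "/usr/bin/nvidia-container-runtime", "runtimeArgs": []}


def add_nvidia_runtime(daemon_json_text: str) -> str:
    """Return daemon.json text with the "nvidia" runtime added, keeping existing keys."""
    # An empty or whitespace-only file is treated as an empty configuration.
    config = json.loads(daemon_json_text) if daemon_json_text.strip() else {}
    # Create "runtimes" if it is missing, then add or overwrite the "nvidia" entry.
    config.setdefault("runtimes", {})["nvidia"] = dict(NVIDIA_RUNTIME)
    return json.dumps(config, indent=4)


if __name__ == "__main__":
    # Example: an existing config with an unrelated setting keeps that setting.
    print(add_nvidia_runtime('{"log-driver": "json-file"}'))
```

The NVIDIA Container Toolkit also ships `nvidia-ctk runtime configure --runtime=docker`, which makes a similar edit to daemon.json, so a manual merge is mainly useful when you want to review the change yourself.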

Atinoda commented 8 months ago

Closing this issue because the docker-compose.yml now has a comment indicating that the deploy: section should be commented out for non-Nvidia inferencing.