NVIDIA / nvidia-docker

Build and run Docker containers leveraging NVIDIA GPUs
Apache License 2.0

Cannot install nvidia-docker2 [unmet dependencies] #1683

Closed junwang-wish closed 11 months ago

junwang-wish commented 2 years ago

I can use Docker but cannot install nvidia-docker2:

(py36) junwang@dgxone01:~/utils$ sudo apt-get update
Hit:1 http://packages.treasuredata.com/3/ubuntu/xenial xenial InRelease
Hit:2 http://international.download.nvidia.com/dgx/repos xenial InRelease
Hit:3 https://nvidia.github.io/libnvidia-container/ubuntu16.04/amd64  InRelease
Get:4 https://nvidia.github.io/nvidia-container-runtime/ubuntu16.04/amd64  InRelease [1,481 B]     
Hit:5 https://nvidia.github.io/libnvidia-container/stable/ubuntu16.04/amd64  InRelease                                               
Hit:6 https://download.docker.com/linux/ubuntu bionic InRelease                                    
Hit:7 https://download.docker.com/linux/ubuntu xenial InRelease                                    
Get:8 http://security.ubuntu.com/ubuntu xenial-security InRelease [99.8 kB]                                                                            
Hit:9 http://archive.ubuntu.com/ubuntu xenial InRelease                                                                     
Get:10 http://archive.ubuntu.com/ubuntu xenial-updates InRelease [99.8 kB]                                                             
Hit:11 http://ppa.launchpad.net/george-edison55/cmake-3.x/ubuntu xenial InRelease            
Hit:12 http://ppa.launchpad.net/git-core/ppa/ubuntu xenial InRelease                                                                                  
Fetched 201 kB in 1s (150 kB/s)                                  
Reading package lists... Done
N: Ignoring file 'nvidia-container-toolkit.list.bkup' in directory '/etc/apt/sources.list.d/' as it has an invalid filename extension
N: Ignoring file 'nvidia-container-runtime.list.bkup' in directory '/etc/apt/sources.list.d/' as it has an invalid filename extension
N: Ignoring file 'libnvidia-container.list.bkup' in directory '/etc/apt/sources.list.d/' as it has an invalid filename extension
N: Ignoring file 'nvidia-docker.list.bkup' in directory '/etc/apt/sources.list.d/' as it has an invalid filename extension
N: Skipping acquire of configured file 'contrib/binary-i386/Packages' as repository 'http://packages.treasuredata.com/3/ubuntu/xenial xenial InRelease' doesn't support architecture 'i386'
(py36) junwang@dgxone01:~/utils$ sudo apt-get install -y nvidia-docker2
Reading package lists... Done
Building dependency tree       
Reading state information... Done
Some packages could not be installed. This may mean that you have
requested an impossible situation or if you are using the unstable
distribution that some required packages have not yet been created
or been moved out of Incoming.
The following information may help to resolve the situation:

The following packages have unmet dependencies:
 nvidia-docker2 : Depends: nvidia-container-runtime (= 2.0.0+docker18.09.2-1) but it is not going to be installed
                  Depends: docker-ce (= 5:18.09.2~3-0~ubuntu-xenial) but 5:19.03.11~3-0~ubuntu-xenial is to be installed or
                           docker-ee (= 5:18.09.2~3-0~ubuntu-xenial) but it is not installable
N: Ignoring file 'nvidia-container-toolkit.list.bkup' in directory '/etc/apt/sources.list.d/' as it has an invalid filename extension
N: Ignoring file 'nvidia-container-runtime.list.bkup' in directory '/etc/apt/sources.list.d/' as it has an invalid filename extension
N: Ignoring file 'libnvidia-container.list.bkup' in directory '/etc/apt/sources.list.d/' as it has an invalid filename extension
N: Ignoring file 'nvidia-docker.list.bkup' in directory '/etc/apt/sources.list.d/' as it has an invalid filename extension
E: Unable to correct problems, you have held broken packages.

elezar commented 2 years ago

@junwang-wish could you list which versions of the nvidia-docker2 package are available? (sudo apt list -a nvidia-docker2)

You could also try the following:

sudo apt-get install nvidia-docker2=2.11.0-1 nvidia-container-toolkit=1.11.0-1 libnvidia-container-tools=1.11.0-1 libnvidia-container1=1.11.0-1

to install the latest released versions explicitly.

The output seems to indicate that quite an old version is being picked up, so there may be an issue with your repository lists, or a different repository may be providing the packages.
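
As a rough sketch of the repository check suggested above, the following lists the active apt source entries that mention NVIDIA. It assumes the default /etc/apt/sources.list.d location. The function name list_nvidia_sources and the SOURCES_DIR variable are illustrative, not part of apt. Only files ending in .list are scanned, so the .list.bkup files that apt reports as ignored in the log are skipped here too.

```shell
#!/bin/sh
# Illustrative helper (not part of apt): print active "deb" entries that
# mention NVIDIA, prefixed with the file they come from.
# SOURCES_DIR is a hypothetical override; the default is apt's usual directory.
SOURCES_DIR="${SOURCES_DIR:-/etc/apt/sources.list.d}"

list_nvidia_sources() {
    dir="$1"
    for f in "$dir"/*.list; do
        # Skip the unexpanded glob when no .list files exist.
        [ -f "$f" ] || continue
        # Match uncommented deb/deb-src lines containing "nvidia".
        grep -H -i '^[[:space:]]*deb.*nvidia' "$f" || true
    done
}

list_nvidia_sources "$SOURCES_DIR"

# On the affected host, the candidate version and the repository it comes
# from can then be inspected with:
#   apt-cache policy nvidia-docker2
```

If more than one repository turns up for the same package, or only an old one, that would be consistent with the old pinned version seen in the dependency error.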

junwang-wish commented 2 years ago

Thanks @elezar. I cannot install CUDA 11 because my driver is version 384.183. I tried to install a newer driver via the NVIDIA Driver Container by following the official wiki, but it failed: https://github.com/NVIDIA/nvidia-container-toolkit/issues/184.

Also, my instance has an existing but older version of nvidia-docker, and modifying /etc/nvidia-container-runtime/config.toml as documented in the Driver Container wiki makes it unusable, with the error: nvidia-container-cli: initialization error: change root failed: no such file or directory. I had to undo the changes suggested in the wiki to be able to run nvidia-docker again.
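
One plausible reading of the "change root failed" error is that config.toml points nvidia-container-cli at a driver root directory that does not exist on the host. This is an assumption, not something confirmed in this thread. A minimal sketch to check it is below. It assumes the override is an uncommented line of the form root = "/some/path" in the config file. The function name check_driver_root is hypothetical.

```shell
#!/bin/sh
# Illustrative helper: report whether the driver root set in config.toml exists.
# Assumes the setting looks like:  root = "/some/path"
# A commented line (starting with '#') is treated as "not set".
CONFIG="/etc/nvidia-container-runtime/config.toml"

check_driver_root() {
    config="$1"
    if [ ! -r "$config" ]; then
        echo "cannot read: $config"
        return 1
    fi
    # Extract the quoted value of the first uncommented root = "..." line.
    root=$(sed -n 's/^[[:space:]]*root[[:space:]]*=[[:space:]]*"\(.*\)".*$/\1/p' "$config" | head -n 1)
    if [ -z "$root" ]; then
        echo "no root override set"
    elif [ -d "$root" ]; then
        echo "root exists: $root"
    else
        echo "root missing: $root"
    fi
}

# Only run the check when the config file is present on this machine.
if [ -r "$CONFIG" ]; then
    check_driver_root "$CONFIG"
fi
```

A "root missing" result would fit the observation that undoing the wiki's config change made the existing setup work again.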

elezar commented 11 months ago

With the v1.14.0 release we have reworked our packaging significantly. Please see the updated instructions here.

If there are still problems, please open an issue against https://github.com/NVIDIA/nvidia-container-toolkit