NVIDIA / nvidia-docker

Build and run Docker containers leveraging NVIDIA GPUs
Apache License 2.0

libvdpau_nvidia.so: no such file or directory: unknown. #1741

Closed gamdwk closed 1 year ago

gamdwk commented 1 year ago

When I run

sudo docker run --rm --gpus all --name cu nvidia/cuda:11.7.1-cudnn8-runtime-ubuntu20.04

it reports

docker: Error response from daemon: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error running hook #0: error running hook: exit status 1, stdout: , stderr: Auto-detected mode as 'legacy' nvidia-container-cli: detection error: open failed: /usr/lib/x86_64-linux-gnu/libvdpau_nvidia.so: no such file or directory: unknown.

My Docker version is 20.10.23, the system is Ubuntu 20.04, and the NVIDIA Container Toolkit CLI version is 1.12.1.

nvidia-smi:

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 515.86.01    Driver Version: 515.86.01    CUDA Version: 11.7     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA A100 80G...  On   | 00000000:31:00.0 Off |                    0 |
| N/A   48C    P0    48W / 300W |      0MiB / 81920MiB |      0%      Default |
|                               |                      |             Disabled |
+-------------------------------+----------------------+----------------------+
|   1  NVIDIA A800 80G...  On   | 00000000:4B:00.0 Off |                    0 |
| N/A   52C    P0    53W / 300W |      0MiB / 81920MiB |      0%      Default |
|                               |                      |             Disabled |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

nvidia-container-cli -k -d /dev/tty info

-- WARNING, the following logs are for debugging purposes only --

I0317 17:09:49.451605 874016 nvc.c:376] initializing library context (version=1.12.1, build=7440a1ead8e4fe35edf1c973c73a662108b21a1f)
I0317 17:09:49.451696 874016 nvc.c:350] using root /
I0317 17:09:49.451712 874016 nvc.c:351] using ldcache /etc/ld.so.cache
I0317 17:09:49.451730 874016 nvc.c:352] using unprivileged user 1004:1004
I0317 17:09:49.451777 874016 nvc.c:393] attempting to load dxcore to see if we are running under Windows Subsystem for Linux (WSL)
I0317 17:09:49.452132 874016 nvc.c:395] dxcore initialization failed, continuing assuming a non-WSL environment
W0317 17:09:49.462192 874017 nvc.c:273] failed to set inheritable capabilities
W0317 17:09:49.462292 874017 nvc.c:274] skipping kernel modules load due to failure
I0317 17:09:49.462699 874018 rpc.c:71] starting driver rpc service
I0317 17:09:49.474763 874019 rpc.c:71] starting nvcgo rpc service
I0317 17:09:49.476530 874016 nvc_info.c:796] requesting driver information with ''
E0317 17:09:49.477865 874016 nvc_info.c:359] error looking up libraries
nvidia-container-cli: detection error: open failed: /usr/lib/x86_64-linux-gnu/libvdpau_nvidia.so: no such file or directory
I0317 17:09:49.477909 874016 nvc.c:434] shutting down library context
I0317 17:09:49.477965 874019 rpc.c:95] terminating nvcgo rpc service
I0317 17:09:49.478703 874016 rpc.c:135] nvcgo rpc service terminated successfully
I0317 17:09:49.480788 874018 rpc.c:95] terminating driver rpc service
I0317 17:09:49.480953 874016 rpc.c:135] driver rpc service terminated successfully

dpkg -l '*nvidia*'

Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
|/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
||/ Name                               Version                     Architecture Description
+++-==================================-===========================-============-==========================>
un  libgldispatch0-nvidia              <none>                      <none>       (no description available)
un  libnvidia-compute                  <none>                      <none>       (no description available)
rc  libnvidia-compute-470:amd64        470.161.03-0ubuntu0.20.04.1 amd64        NVIDIA libcompute package
ii  libnvidia-compute-515:amd64        515.86.01-0ubuntu0.20.04.1  amd64        NVIDIA libcompute package
rc  libnvidia-compute-515-server:amd64 515.86.01-0ubuntu0.20.04.3  amd64        NVIDIA libcompute package
rc  libnvidia-compute-520:amd64        520.61.05-0ubuntu1          amd64        NVIDIA libcompute package
ii  libnvidia-container-tools          1.12.1-1                    amd64        NVIDIA container runtime l>
ii  libnvidia-container1:amd64         1.12.1-1                    amd64        NVIDIA container runtime l>
un  libnvidia-ml1                      <none>                      <none>       (no description available)
un  nvidia-384                         <none>                      <none>       (no description available)
un  nvidia-390                         <none>                      <none>       (no description available)
ii  nvidia-common                      1:0.9.0~0.20.04.7           amd64        transitional package for u>
un  nvidia-compute-utils               <none>                      <none>       (no description available)
ii  nvidia-compute-utils-515           515.86.01-0ubuntu0.20.04.1  amd64        NVIDIA compute utilities
un  nvidia-container-runtime           <none>                      <none>       (no description available)
un  nvidia-container-runtime-hook      <none>                      <none>       (no description available)
ii  nvidia-container-toolkit           1.12.1-1                    amd64        NVIDIA Container toolkit
ii  nvidia-container-toolkit-base      1.12.1-1                    amd64        NVIDIA Container Toolkit B>
ii  nvidia-dkms-515                    515.86.01-0ubuntu0.20.04.1  amd64        NVIDIA DKMS package
un  nvidia-dkms-kernel                 <none>                      <none>       (no description available)
un  nvidia-docker                      <none>                      <none>       (no description available)
ii  nvidia-docker2                     2.12.0-1                    all          nvidia-docker CLI wrapper
un  nvidia-driver-515                  <none>                      <none>       (no description available)
un  nvidia-kernel-common               <none>                      <none>       (no description available)
ii  nvidia-kernel-common-515           515.86.01-0ubuntu0.20.04.1  amd64        Shared files used with the>
un  nvidia-kernel-source               <none>                      <none>       (no description available)
ii  nvidia-kernel-source-515           515.86.01-0ubuntu0.20.04.1  amd64        NVIDIA kernel source packa>
un  nvidia-legacy-304xx-vdpau-driver   <none>                      <none>       (no description available)
un  nvidia-legacy-340xx-vdpau-driver   <none>                      <none>       (no description available)
un  nvidia-libopencl1-dev              <none>                      <none>       (no description available)
un  nvidia-opencl-icd                  <none>                      <none>       (no description available)
un  nvidia-persistenced                <none>                      <none>       (no description available)
rc  nvidia-prime                       0.8.16~0.20.04.2            all          Tools to enable NVIDIA's P>
rc  nvidia-settings                    470.57.01-0ubuntu0.20.04.3  amd64        Tool for configuring the N>
un  nvidia-settings-binary             <none>                      <none>       (no description available)
un  nvidia-smi                         <none>                      <none>       (no description available)
un  nvidia-utils                       <none>                      <none>       (no description available)
ii  nvidia-utils-515                   515.86.01-0ubuntu0.20.04.1  amd64        NVIDIA driver support bina>
un  nvidia-vdpau-driver                <none>                      <none>       (no description available)

How to fix it?

guaguablue commented 1 year ago

Hi, I ran into the same issue. Do you have any ideas on how to fix it? Thanks!

cpuodzius commented 1 year ago

I have the same issue.

elezar commented 1 year ago

@cpuodzius @guaguablue please create a new issue under the nvidia-container-toolkit repository for your issue. It would be important to know whether the latest version (1.13.5) of the NVIDIA Container Toolkit shows the same behaviour. Information as to whether libvdpau_nvidia.so.* exists on your system would also be useful.
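For anyone gathering that information, here is a rough sketch of one way to check whether libvdpau_nvidia.so.* exists. The helper name find_vdpau_libs is made up for illustration and is not part of any NVIDIA tool; the default directory is simply the path from the error message above and may differ on other distributions.

```shell
# Hypothetical helper: list libvdpau_nvidia.so* entries in a directory and
# flag symlinks whose target is gone. Defaults to the path from the error.
find_vdpau_libs() {
    dir="${1:-/usr/lib/x86_64-linux-gnu}"
    found=1
    for f in "$dir"/libvdpau_nvidia.so*; do
        # An unmatched glob stays literal, so test each candidate explicitly.
        if [ -e "$f" ]; then
            echo "present: $f"
            found=0
        elif [ -L "$f" ]; then
            echo "dangling symlink: $f"
            found=0
        fi
    done
    [ "$found" -eq 0 ] || echo "none found in $dir"
    # Exit status 0 if at least one entry was seen, 1 otherwise.
    return "$found"
}
```

Running `find_vdpau_libs` with no argument checks the default directory. Comparing its output with `ldconfig -p | grep libvdpau_nvidia` shows whether the linker cache still lists a file that is no longer on disk.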

Wraythh commented 1 year ago

I have the same issue

forrestjgq commented 1 year ago

This usually happens after updating the driver.

You may try this on the host:

  1. add a blank line to the end of /etc/ld.so.conf
  2. sudo ldconfig
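The debug log above shows nvidia-container-cli reading /etc/ld.so.cache, so a plausible explanation (an assumption, not confirmed in this thread) is that the cache still lists a library that the driver update removed. A small sketch to check this before and after the two steps; stale_cache_entries is a hypothetical helper that reads `ldconfig -p` output on stdin and prints cached paths that are missing on disk.

```shell
# Hypothetical helper: read `ldconfig -p` style lines on stdin and print
# every cached path that no longer exists on disk.
stale_cache_entries() {
    while IFS= read -r line; do
        case "$line" in
            *" => "*)
                # Keep only the path after the last " => " separator.
                path="${line##* => }"
                [ -e "$path" ] || echo "$path"
                ;;
        esac
    done
}

# Example session on the host (commented out because it modifies the system):
#   ldconfig -p | stale_cache_entries              # before: may list libvdpau_nvidia.so
#   echo | sudo tee -a /etc/ld.so.conf >/dev/null  # step 1: append a blank line
#   sudo ldconfig                                  # step 2: rebuild the cache
#   ldconfig -p | stale_cache_entries              # after: ideally prints nothing
```

If the second check still prints the libvdpau_nvidia.so path, the cache is probably not the cause and the missing file itself needs looking into.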