NVIDIA / nvidia-docker

Build and run Docker containers leveraging NVIDIA GPUs
Apache License 2.0

Docker issue: error response from daemon: OCI runtime create failed #1121

Closed convneato closed 4 years ago

convneato commented 4 years ago

1. Issue or feature description

I'm trying to run a deep learning docker image, deepo, here: https://github.com/ufoym/deepo#Installation

I followed the initial steps to install Docker and nvidia-docker, then used the pull command to download the deepo image. The next step says to run the image with "docker run --runtime=nvidia --rm ufoym/deepo nvidia-smi". When I do, I get the following error:

docker: Error response from daemon: OCI runtime create failed: unable to retrieve OCI runtime error (open /run/containerd/io.containerd.runtime.v1.linux/moby/1038f5ace29cf151f9d9dcbb991037c88f6bf5bd533d3c0fada5fe32134ba828/log.json: no such file or directory): fork/exec /usr/bin/nvidia-container-runtime: no such file or directory: unknown.

I think my Docker installation is working fine, though. If I run 'docker run hello-world', I get the message saying that my Docker installation appears to be working correctly.
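The error message quoted above names the file the daemon could not execute. As a minimal sketch (the helper name `missing_binary` and the regular expression are illustrative assumptions, not part of Docker or nvidia-docker), the path can be pulled out of the message like this:

```python
import re

# Matches the "fork/exec <path>: no such file or directory" fragment
# seen in the error above. The pattern is an assumption based on that
# one message, not a documented Docker error format.
_FORK_EXEC = re.compile(r"fork/exec (?P<path>\S+): no such file or directory")


def missing_binary(error_text):
    """Return the path the daemon failed to execute, or None if not found."""
    match = _FORK_EXEC.search(error_text)
    return match.group("path") if match else None


# The error text from this issue, copied verbatim.
error = (
    "docker: Error response from daemon: OCI runtime create failed: "
    "unable to retrieve OCI runtime error (open /run/containerd/"
    "io.containerd.runtime.v1.linux/moby/"
    "1038f5ace29cf151f9d9dcbb991037c88f6bf5bd533d3c0fada5fe32134ba828/log.json: "
    "no such file or directory): fork/exec /usr/bin/nvidia-container-runtime: "
    "no such file or directory: unknown."
)

print(missing_binary(error))
```

Here the extracted path is /usr/bin/nvidia-container-runtime, which is the runtime that --runtime=nvidia asks Docker to launch.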

2. Steps to reproduce the issue

Run docker run --runtime=nvidia --rm ufoym/deepo nvidia-smi in a terminal.

3. Information to attach

-- WARNING, the following logs are for debugging purposes only --

I1110 21:42:06.443020 6150 nvc.c:281] initializing library context (version=1.0.5, build=13b836390888f7b7c7dca115d16d7e28ab15a836)
I1110 21:42:06.443113 6150 nvc.c:255] using root /
I1110 21:42:06.443126 6150 nvc.c:256] using ldcache /etc/ld.so.cache
I1110 21:42:06.443141 6150 nvc.c:257] using unprivileged user 1000:1000
W1110 21:42:06.445319 6151 nvc.c:186] failed to set inheritable capabilities
W1110 21:42:06.445410 6151 nvc.c:187] skipping kernel modules load due to failure
I1110 21:42:06.445936 6152 driver.c:133] starting driver service
I1110 21:42:06.472082 6150 nvc_info.c:437] requesting driver information with ''
I1110 21:42:06.472372 6150 nvc_info.c:151] selecting /usr/lib/x86_64-linux-gnu/libnvoptix.so.418.56
I1110 21:42:06.472405 6150 nvc_info.c:151] selecting /usr/lib/x86_64-linux-gnu/libnvidia-tls.so.418.56
I1110 21:42:06.472442 6150 nvc_info.c:151] selecting /usr/lib/x86_64-linux-gnu/libnvidia-rtcore.so.418.56
I1110 21:42:06.472474 6150 nvc_info.c:151] selecting /usr/lib/x86_64-linux-gnu/libnvidia-ptxjitcompiler.so.418.56
I1110 21:42:06.472528 6150 nvc_info.c:151] selecting /usr/lib/x86_64-linux-gnu/libnvidia-opticalflow.so.418.56
I1110 21:42:06.472569 6150 nvc_info.c:151] selecting /usr/lib/x86_64-linux-gnu/libnvidia-opencl.so.418.56
I1110 21:42:06.472600 6150 nvc_info.c:151] selecting /usr/lib/x86_64-linux-gnu/libnvidia-ml.so.418.56
I1110 21:42:06.472638 6150 nvc_info.c:151] selecting /usr/lib/x86_64-linux-gnu/libnvidia-ifr.so.418.56
I1110 21:42:06.472694 6150 nvc_info.c:151] selecting /usr/lib/x86_64-linux-gnu/libnvidia-glvkspirv.so.418.56
I1110 21:42:06.472712 6150 nvc_info.c:151] selecting /usr/lib/x86_64-linux-gnu/libnvidia-glsi.so.418.56
I1110 21:42:06.472728 6150 nvc_info.c:151] selecting /usr/lib/x86_64-linux-gnu/libnvidia-glcore.so.418.56
I1110 21:42:06.472744 6150 nvc_info.c:151] selecting /usr/lib/x86_64-linux-gnu/libnvidia-fbc.so.418.56
I1110 21:42:06.472768 6150 nvc_info.c:151] selecting /usr/lib/x86_64-linux-gnu/libnvidia-fatbinaryloader.so.418.56
I1110 21:42:06.472785 6150 nvc_info.c:151] selecting /usr/lib/x86_64-linux-gnu/libnvidia-encode.so.418.56
I1110 21:42:06.472809 6150 nvc_info.c:151] selecting /usr/lib/x86_64-linux-gnu/libnvidia-eglcore.so.418.56
I1110 21:42:06.472827 6150 nvc_info.c:151] selecting /usr/lib/x86_64-linux-gnu/libnvidia-compiler.so.418.56
I1110 21:42:06.472858 6150 nvc_info.c:151] selecting /usr/lib/x86_64-linux-gnu/libnvidia-cfg.so.418.56
I1110 21:42:06.472896 6150 nvc_info.c:151] selecting /usr/lib/x86_64-linux-gnu/libnvcuvid.so.418.56
I1110 21:42:06.473012 6150 nvc_info.c:151] selecting /usr/lib/x86_64-linux-gnu/libcuda.so.418.56
I1110 21:42:06.473085 6150 nvc_info.c:151] selecting /usr/lib/x86_64-linux-gnu/libGLX_nvidia.so.418.56
I1110 21:42:06.473105 6150 nvc_info.c:151] selecting /usr/lib/x86_64-linux-gnu/libGLESv2_nvidia.so.418.56
I1110 21:42:06.473158 6150 nvc_info.c:151] selecting /usr/lib/x86_64-linux-gnu/libGLESv1_CM_nvidia.so.418.56
I1110 21:42:06.473180 6150 nvc_info.c:151] selecting /usr/lib/x86_64-linux-gnu/libEGL_nvidia.so.418.56
I1110 21:42:06.473219 6150 nvc_info.c:151] selecting /usr/lib/i386-linux-gnu/libnvidia-tls.so.418.56
I1110 21:42:06.473237 6150 nvc_info.c:151] selecting /usr/lib/i386-linux-gnu/libnvidia-ptxjitcompiler.so.418.56
I1110 21:42:06.473264 6150 nvc_info.c:151] selecting /usr/lib/i386-linux-gnu/libnvidia-opticalflow.so.418.56
I1110 21:42:06.473290 6150 nvc_info.c:151] selecting /usr/lib/i386-linux-gnu/libnvidia-opencl.so.418.56
I1110 21:42:06.473308 6150 nvc_info.c:151] selecting /usr/lib/i386-linux-gnu/libnvidia-ml.so.418.56
I1110 21:42:06.473333 6150 nvc_info.c:151] selecting /usr/lib/i386-linux-gnu/libnvidia-ifr.so.418.56
I1110 21:42:06.473359 6150 nvc_info.c:151] selecting /usr/lib/i386-linux-gnu/libnvidia-glvkspirv.so.418.56
I1110 21:42:06.473376 6150 nvc_info.c:151] selecting /usr/lib/i386-linux-gnu/libnvidia-glsi.so.418.56
I1110 21:42:06.473395 6150 nvc_info.c:151] selecting /usr/lib/i386-linux-gnu/libnvidia-glcore.so.418.56
I1110 21:42:06.473414 6150 nvc_info.c:151] selecting /usr/lib/i386-linux-gnu/libnvidia-fbc.so.418.56
I1110 21:42:06.473440 6150 nvc_info.c:151] selecting /usr/lib/i386-linux-gnu/libnvidia-fatbinaryloader.so.418.56
I1110 21:42:06.473459 6150 nvc_info.c:151] selecting /usr/lib/i386-linux-gnu/libnvidia-encode.so.418.56
I1110 21:42:06.473483 6150 nvc_info.c:151] selecting /usr/lib/i386-linux-gnu/libnvidia-eglcore.so.418.56
I1110 21:42:06.473502 6150 nvc_info.c:151] selecting /usr/lib/i386-linux-gnu/libnvidia-compiler.so.418.56
I1110 21:42:06.473522 6150 nvc_info.c:151] selecting /usr/lib/i386-linux-gnu/libnvcuvid.so.418.56
I1110 21:42:06.473554 6150 nvc_info.c:151] selecting /usr/lib/i386-linux-gnu/libcuda.so.418.56
I1110 21:42:06.473585 6150 nvc_info.c:151] selecting /usr/lib/i386-linux-gnu/libGLX_nvidia.so.418.56
I1110 21:42:06.473604 6150 nvc_info.c:151] selecting /usr/lib/i386-linux-gnu/libGLESv2_nvidia.so.418.56
I1110 21:42:06.473623 6150 nvc_info.c:151] selecting /usr/lib/i386-linux-gnu/libGLESv1_CM_nvidia.so.418.56
I1110 21:42:06.473641 6150 nvc_info.c:151] selecting /usr/lib/i386-linux-gnu/libEGL_nvidia.so.418.56
W1110 21:42:06.473653 6150 nvc_info.c:302] missing library libvdpau_nvidia.so
W1110 21:42:06.473656 6150 nvc_info.c:306] missing compat32 library libnvidia-cfg.so
W1110 21:42:06.473660 6150 nvc_info.c:306] missing compat32 library libvdpau_nvidia.so
W1110 21:42:06.473665 6150 nvc_info.c:306] missing compat32 library libnvidia-rtcore.so
W1110 21:42:06.473669 6150 nvc_info.c:306] missing compat32 library libnvoptix.so
I1110 21:42:06.473921 6150 nvc_info.c:232] selecting /usr/bin/nvidia-smi
I1110 21:42:06.473931 6150 nvc_info.c:232] selecting /usr/bin/nvidia-debugdump
I1110 21:42:06.473942 6150 nvc_info.c:232] selecting /usr/bin/nvidia-persistenced
I1110 21:42:06.473952 6150 nvc_info.c:232] selecting /usr/bin/nvidia-cuda-mps-control
I1110 21:42:06.473961 6150 nvc_info.c:232] selecting /usr/bin/nvidia-cuda-mps-server
I1110 21:42:06.473974 6150 nvc_info.c:369] listing device /dev/nvidiactl
I1110 21:42:06.473977 6150 nvc_info.c:369] listing device /dev/nvidia-uvm
I1110 21:42:06.473981 6150 nvc_info.c:369] listing device /dev/nvidia-uvm-tools
I1110 21:42:06.473985 6150 nvc_info.c:369] listing device /dev/nvidia-modeset
I1110 21:42:06.474001 6150 nvc_info.c:273] listing ipc /run/nvidia-persistenced/socket
W1110 21:42:06.474010 6150 nvc_info.c:277] missing ipc /tmp/nvidia-mps
I1110 21:42:06.474013 6150 nvc_info.c:493] requesting device information with ''
I1110 21:42:06.479627 6150 nvc_info.c:523] listing device /dev/nvidia0 (GPU-f07a200a-0b51-e058-fb70-c707941ad92e at 00000000:01:00.0)
NVRM version: 418.56
CUDA version: 10.1

Device Index: 0
Device Minor: 0
Model: GeForce RTX 2080 Ti
Brand: GeForce
GPU UUID: GPU-f07a200a-0b51-e058-fb70-c707941ad92e
Bus Location: 00000000:01:00.0
Architecture: 7.5
I1110 21:42:06.479641 6150 nvc.c:318] shutting down library context
I1110 21:42:06.479813 6152 driver.c:192] terminating driver service
I1110 21:42:06.532079 6150 driver.c:233] driver service terminated successfully

Timestamp : Sun Nov 10 15:44:29 2019
Driver Version : 418.56
CUDA Version : 10.1

Attached GPUs : 1
GPU 00000000:01:00.0
    Product Name : GeForce RTX 2080 Ti
    Product Brand : GeForce
    Display Mode : Enabled
    Display Active : Enabled
    Persistence Mode : Disabled
    Accounting Mode : Disabled
    Accounting Mode Buffer Size : 4000
    Driver Model
        Current : N/A
        Pending : N/A
    Serial Number : N/A
    GPU UUID : GPU-f07a200a-0b51-e058-fb70-c707941ad92e
    Minor Number : 0
    VBIOS Version : 90.02.0B.00.80
    MultiGPU Board : No
    Board ID : 0x100
    GPU Part Number : N/A
    Inforom Version
        Image Version : G001.0000.02.04
        OEM Object : 1.1
        ECC Object : N/A
        Power Management Object : N/A
    GPU Operation Mode
        Current : N/A
        Pending : N/A
    GPU Virtualization Mode
        Virtualization mode : None
    IBMNPU
        Relaxed Ordering Mode : N/A
    PCI
        Bus : 0x01
        Device : 0x00
        Domain : 0x0000
        Device Id : 0x1E0710DE
        Bus Id : 00000000:01:00.0
        Sub System Id : 0x86671043
        GPU Link Info
            PCIe Generation
                Max : 3
                Current : 1
            Link Width
                Max : 16x
                Current : 16x
        Bridge Chip
            Type : N/A
            Firmware : N/A
        Replays Since Reset : 0
        Replay Number Rollovers : 0
        Tx Throughput : 5000 KB/s
        Rx Throughput : 16000 KB/s
    Fan Speed : 0 %
    Performance State : P8
    Clocks Throttle Reasons
        Idle : Active
        Applications Clocks Setting : Not Active
        SW Power Cap : Not Active
        HW Slowdown : Not Active
            HW Thermal Slowdown : Not Active
            HW Power Brake Slowdown : Not Active
        Sync Boost : Not Active
        SW Thermal Slowdown : Not Active
        Display Clock Setting : Not Active
    FB Memory Usage
        Total : 10986 MiB
        Used : 462 MiB
        Free : 10524 MiB
    BAR1 Memory Usage
        Total : 256 MiB
        Used : 9 MiB
        Free : 247 MiB
    Compute Mode : Default
    Utilization
        Gpu : 2 %
        Memory : 2 %
        Encoder : 0 %
        Decoder : 0 %
    Encoder Stats
        Active Sessions : 0
        Average FPS : 0
        Average Latency : 0
    FBC Stats
        Active Sessions : 0
        Average FPS : 0
        Average Latency : 0
    Ecc Mode
        Current : N/A
        Pending : N/A
    ECC Errors
        Volatile
            SRAM Correctable : N/A
            SRAM Uncorrectable : N/A
            DRAM Correctable : N/A
            DRAM Uncorrectable : N/A
        Aggregate
            SRAM Correctable : N/A
            SRAM Uncorrectable : N/A
            DRAM Correctable : N/A
            DRAM Uncorrectable : N/A
    Retired Pages
        Single Bit ECC : N/A
        Double Bit ECC : N/A
        Pending : N/A
    Temperature
        GPU Current Temp : 53 C
        GPU Shutdown Temp : 94 C
        GPU Slowdown Temp : 91 C
        GPU Max Operating Temp : 89 C
        Memory Current Temp : N/A
        Memory Max Operating Temp : N/A
    Power Readings
        Power Management : Supported
        Power Draw : 40.30 W
        Power Limit : 260.00 W
        Default Power Limit : 260.00 W
        Enforced Power Limit : 260.00 W
        Min Power Limit : 100.00 W
        Max Power Limit : 312.00 W
    Clocks
        Graphics : 330 MHz
        SM : 330 MHz
        Memory : 405 MHz
        Video : 540 MHz
    Applications Clocks
        Graphics : N/A
        Memory : N/A
    Default Applications Clocks
        Graphics : N/A
        Memory : N/A
    Max Clocks
        Graphics : 2160 MHz
        SM : 2160 MHz
        Memory : 7000 MHz
        Video : 1950 MHz
    Max Customer Boost Clocks
        Graphics : N/A
    Clock Policy
        Auto Boost : N/A
        Auto Boost Default : N/A
    Processes
        Process ID : 1220
            Type : G
            Name : /usr/lib/xorg/Xorg
            Used GPU Memory : 26 MiB
        Process ID : 1291
            Type : G
            Name : /usr/bin/gnome-shell
            Used GPU Memory : 58 MiB
        Process ID : 2162
            Type : G
            Name : /usr/lib/xorg/Xorg
            Used GPU Memory : 250 MiB
        Process ID : 2293
            Type : G
            Name : /usr/bin/gnome-shell
            Used GPU Memory : 118 MiB
        Process ID : 5777
            Type : G
            Name : /usr/lib/firefox/firefox
            Used GPU Memory : 6 MiB

Server: Docker Engine - Community
 Engine:
  Version: 19.03.4
  API version: 1.40 (minimum version 1.12)
  Go version: go1.12.10
  Git commit: 9013bf583a
  Built: Fri Oct 18 15:52:40 2019
  OS/Arch: linux/amd64
  Experimental: false
 containerd:
  Version: 1.2.10
  GitCommit: b34a5c8af56e510852c35414db4c1f4fa6172339
 runc:
  Version: 1.0.0-rc8+dev
  GitCommit: 3e425f80a8c931f88e6d94a8c831b9d5aa481657
 docker-init:
  Version: 0.18.0
  GitCommit: fec3683

RenaudWasTaken commented 4 years ago

This error indicates that the nvidia-container-runtime binary is not on your machine: Docker tried to execute /usr/bin/nvidia-container-runtime and the file does not exist. I suggest removing the nvidia-docker2 and nvidia-container-runtime packages and installing them again.
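Before and after reinstalling, it may help to confirm whether the binary is actually present. The following is a rough sketch, not an official check: `locate_binary` is a hypothetical helper, and the fallback path is simply the one that appears in the error message.

```python
import os
import shutil


def locate_binary(name, extra_paths=()):
    """Return the first executable found for `name`, or None.

    Looks on PATH first, then in any explicitly listed fallback paths.
    """
    found = shutil.which(name)
    if found:
        return found
    for path in extra_paths:
        # A fallback counts only if it is a regular, executable file.
        if os.path.isfile(path) and os.access(path, os.X_OK):
            return path
    return None


if __name__ == "__main__":
    # The fallback path is taken from the error message in this issue.
    path = locate_binary(
        "nvidia-container-runtime",
        ["/usr/bin/nvidia-container-runtime"],
    )
    if path is None:
        print("nvidia-container-runtime not found; reinstalling the packages is the likely fix")
    else:
        print("found runtime at", path)
```

If the script reports the runtime as missing after a reinstall, the package probably did not install cleanly, and the package manager's output is the next place to look.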

Hope this helps!