Open waldekkot opened 3 years ago
@waldekkot did you manage to solve it somehow? Thanks!
@iegorval unfortunately, there have been no changes to how it works...
We are currently in the process of re-architecting the nvidia-docker stack, and I'd be curious to know whether this issue is resolved by the new stack.
Can you try replacing your current nvidia-container-runtime binary with the "experimental" one from here:
docker cp $(docker create --rm nvcr.io/nvidia/k8s/container-toolkit:v1.8.0-rc.2-ubuntu18.04):/work/nvidia-container-runtime.experimental .
And then invoke docker using an NVIDIA_VISIBLE_DEVICES envvar rather than the --gpus flag.
A quick update to this in case @waldekkot has moved on:
Container Toolkit version v1.8.0-rc.2-ubuntu18.04 as above is now the standard install via apt if you've configured the experimental packages repo. Using that version (or the file pulled from the container above), the problem still exists as detailed. There's (slightly) more info in the error message, in that it now states "failed to add device rules":
docker -D run --runtime=nvidia -e NVIDIA_VISIBLE_DEVICES=0 nvidia/cuda:11.0-base nvidia-smi
docker: Error response from daemon: failed to create shim: OCI runtime create failed: container_linux.go:380: starting container process caused: process_linux.go:545: container init caused: Running hook #1:: error running hook: exit status 1, stdout: , stderr: nvidia-container-cli: mount error: failed to add device rules: write /sys/fs/cgroup/devices/docker/ca6bba3e85ac368ca5310907cbcd9b2fd404c83077323cd84b49a3b541019785/devices.allow: operation not permitted: unknown.
The error is identical whether you use envvars or --gpus as arguments.
Hmm. Assuming the 1.9.0-1 release is ahead of that, it's still broken there. I confess I'm struggling to debug this; obviously happy to diagnose further if anyone can point me in the correct direction:
docker run --runtime=nvidia -e NVIDIA_VISIBLE_DEVICES=0 nvidia/cuda:11.0-base nvidia-smi
docker: Error response from daemon: failed to create shim: OCI runtime create failed: container_linux.go:380: starting container process caused: process_linux.go:545: container init caused: Running hook #1:: error running hook: exit status 1, stdout: , stderr: nvidia-container-cli: mount error: failed to add device rules: write /sys/fs/cgroup/devices/docker/f9352b6b081710baa40d6ba036102e79c228c063afc16ca88ef21212f02f0ad5/devices.allow: operation not permitted: unknown.
ERRO[0002] error waiting for container: context canceled
ii libnvidia-container-tools 1.9.0-1 amd64 NVIDIA container runtime library (command-line tools)
ii libnvidia-container1:amd64 1.9.0-1 amd64 NVIDIA container runtime library
ii nvidia-container-toolkit 1.9.0-1 amd64 NVIDIA container runtime hook
ii nvidia-docker2 2.10.0-1 all nvidia-docker CLI wrapper
Hmm. So this works for you if you downgrade to, say, libnvidia-container v1.7.0? But it's broken on the latest?
Looking more closely at the linked issue, it seems that this is failing "by design" at the moment (and would also fail on older versions of libnvidia-container, not just the newest one).
That error should really be non-fatal in the case of nested containers. It may be worth filing an issue against nvidia-container to have them relax error handling on this particular case.
Unprivileged containers aren’t allowed to modify devices.allow/devices.deny but that doesn’t mean the device in question isn’t already allowed (as it is in this case).
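Concretely, the failing operation is a one-line write of a cgroup-v1 device rule. Purely to illustrate the rule format, the sketch below writes it to a scratch file (the real target path and the EPERM are from the error messages above; major number 195 is the NVIDIA character-device major):

```shell
# Illustration only: the real target is
#   /sys/fs/cgroup/devices/docker/<container-id>/devices.allow
# and that write is exactly what returns "operation not permitted" inside an
# unprivileged nested container. A scratch file stands in for it here.
DEVICES_ALLOW=./devices.allow
echo 'c 195:* rwm' > "$DEVICES_ALLOW"   # char devices, major 195 (NVIDIA GPUs), read/write/mknod
cat "$DEVICES_ALLOW"
```

The point of klueska's comment is that even though this write is forbidden, the device may already be allowed by the parent cgroup, so the failure carries no information.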
I think what you want to do is probably just uncomment no-cgroups = true in your /etc/nvidia-container-runtime/config.toml file.
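For anyone following along, a sketch of that edit, shown against a local copy of the file; on a real system the path is /etc/nvidia-container-runtime/config.toml and the edit needs root. The commented-out default below is an assumption about what the shipped config contains, so adjust the sed pattern to match your file:

```shell
# Local stand-in for /etc/nvidia-container-runtime/config.toml:
CONFIG=./config.toml
printf '%s\n' '[nvidia-container-cli]' '#no-cgroups = false' > "$CONFIG"

# Uncomment the option and flip it to true:
sed -i 's|^#no-cgroups = false|no-cgroups = true|' "$CONFIG"
grep '^no-cgroups = true' "$CONFIG"
```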
Excellent! Thank you @klueska, that fixed the issue there.
For reference, nvidia-docker is still not working in unprivileged mode as above without some more work. It's necessary to set raw.apparmor values within LXC to allow access to /proc/driver/nvidia/gpus/0000:bu:s_id.0, as otherwise nvidia-container-cli fails to mount. That's very much an LXC thing rather than an nvidia-docker issue, though.
Thanks again.
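The exact raw.apparmor lines were never shared in this thread, so the following is only a hypothetical sketch of the mechanism: LXD's raw.apparmor key appends raw lines to the container's generated AppArmor profile. The rule itself is a guess at what would permit the mount, and the container name and GPU bus-id path must match your setup:

```
# Hypothetical sketch -- the rule below is a guess, not the lines actually used:
lxc config set demo3 raw.apparmor '/proc/driver/nvidia/gpus/** rw,'
lxc restart demo3
```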
Hey @vsltimkay, I appreciate this is a fairly old thread, but could you possibly share how you set the apparmor values to allow the container access to the /proc/driver/nvidia/gpus directory? I've followed the other steps in the thread, and currently all my container can see in /proc/driver/nvidia is params, registry, and version.
Thanks!
The workaround of setting no-cgroups = true does not work with NVIDIA Container Toolkit v1.14.0. It works with v1.13.x.
@the729 there is a known issue in the 1.14.0 release related to applying config options from the config file. This has been resolved in the 1.14.1 release. Is that available to you?
It works. Thank you.
1. Issue or feature description
I am trying to run an NVIDIA/CUDA Docker container from within an LXD container (so, a nested scenario). It seems the only way to get such an NVIDIA Docker container working is to make the LXD container a privileged one. So, inside the privileged LXD container, the following works perfectly fine:
docker run --rm --gpus all --ipc=host nvidia/cuda:11.4.1-base-ubuntu20.04 nvidia-smi
If I run the very same LXD container as unprivileged, the nested CUDA Docker container fails with the error below. Other (non-NVIDIA/CUDA) Docker containers work fine.
docker: Error response from daemon: OCI runtime create failed: container_linux.go:380: starting container process caused: process_linux.go:545: container init caused: Running hook #0:: error running hook: exit status 1, stdout: , stderr: nvidia-container-cli: mount error: write error: /sys/fs/cgroup/devices/docker/098ad8bf1fdcf4ab72091864933fbc8b67a8f0b30746681ba6ef4082c23245b9/devices.allow: operation not permitted: unknown.
On the LXD discussion group, it was suggested to treat the error as non-fatal in the case of nested containers: https://discuss.linuxcontainers.org/t/nvidia-and-docker-in-lxd/12136
2. Steps to reproduce the issue
apt install apt-transport-https ca-certificates curl gnupg lsb-release -y
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /usr/share/keyrings/docker-archive-keyring.gpg
echo "deb [arch=amd64 signed-by=/usr/share/keyrings/docker-archive-keyring.gpg] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable" | sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
apt-get update
apt-get install docker-ce docker-ce-cli containerd.io
docker run --rm hello-world
Hello from Docker! This message shows that your installation appears to be working correctly.
To generate this message, Docker took the following steps:
To try something more ambitious, you can run an Ubuntu container with: $ docker run -it ubuntu bash
Share images, automate workflows, and more with a free Docker ID: https://hub.docker.com/
For more examples and ideas, visit: https://docs.docker.com/get-started/
curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
distribution=ubuntu20.04 && curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | tee /etc/apt/sources.list.d/nvidia-docker.list
apt update
apt install nvidia-docker2 -y
systemctl restart docker
docker version
Client: Docker Engine - Community
 Version:           20.10.8
 API version:       1.41
 Go version:        go1.16.6
 Git commit:        3967b7d
 Built:             Fri Jul 30 19:53:57 2021
 OS/Arch:           linux/amd64
 Context:           default
 Experimental:      true

Server: Docker Engine - Community
 Engine:
  Version:          20.10.8
  API version:      1.41 (minimum version 1.12)
  Go version:       go1.16.6
  Git commit:       75249d8
  Built:            Fri Jul 30 19:52:06 2021
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.4.9
  GitCommit:        e25210fe30a0a703442421b0f60afac609f950a3
 runc:
  Version:          1.0.1
  GitCommit:        v1.0.1-0-g4144b63
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0
docker info
Client:
 Context:    default
 Debug Mode: false
 Plugins:
  app: Docker App (Docker Inc., v0.9.1-beta3)
  buildx: Build with BuildKit (Docker Inc., v0.6.1-docker)
  scan: Docker Scan (Docker Inc., v0.8.0)

Server:
 Containers: 0
  Running: 0
  Paused: 0
  Stopped: 0
 Images: 1
 Server Version: 20.10.8
 Storage Driver: btrfs
  Build Version: Btrfs v5.10.1
  Library Version: 102
 Logging Driver: json-file
 Cgroup Driver: cgroupfs
 Cgroup Version: 1
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runc.v2 io.containerd.runtime.v1.linux nvidia runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: e25210fe30a0a703442421b0f60afac609f950a3
 runc version: v1.0.1-0-g4144b63
 init version: de40ad0
 Security Options:
  apparmor
  seccomp
   Profile: default
 Kernel Version: 5.11.0-34-generic
 Operating System: Ubuntu 21.04
 OSType: linux
 Architecture: x86_64
 CPUs: 8
 Total Memory: 62.75GiB
 Name: demo4
 ID: PTI6:4Q7T:PMWD:XC2L:5W2X:PMLV:7QRG:S3ZW:KMII:GCAY:PC7L:5P3X
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Registry: https://index.docker.io/v1/
 Labels:
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false
docker run --rm --gpus all --ipc=host nvidia/cuda:11.4.1-base-ubuntu20.04 nvidia-smi
Unable to find image 'nvidia/cuda:11.4.1-base-ubuntu20.04' locally
11.4.1-base-ubuntu20.04: Pulling from nvidia/cuda
16ec32c2132b: Pull complete
d795373d028a: Pull complete
aa1a4de63ca7: Pull complete
99fe2b653f7a: Pull complete
151e201e5dbc: Pull complete
Digest: sha256:79b4fdc93e6e98fbb1770893b497d6528ab19cf056d15e366787135ca18b7565
Status: Downloaded newer image for nvidia/cuda:11.4.1-base-ubuntu20.04
docker: Error response from daemon: OCI runtime create failed: container_linux.go:380: starting container process caused: process_linux.go:545: container init caused: Running hook #0:: error running hook: exit status 1, stdout: , stderr: nvidia-container-cli: mount error: write error: /sys/fs/cgroup/devices/docker/333969e7089a6ca8b93c493b34741c8e17d8d6fb5acaa16031c4a8fb54814286/devices.allow: operation not permitted: unknown.
exit
lxc stop demo3
lxc config set demo3 security.privileged=true
lxc start demo3
lxc exec demo3 -- bash
docker run --rm --gpus all --ipc=host nvidia/cuda:11.4.1-base-ubuntu20.04 nvidia-smi
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.63.01    Driver Version: 470.63.01    CUDA Version: 11.4     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  Off  | 00000000:01:00.0 Off |                  N/A |
| 16%   28C    P8    16W / 250W |    178MiB / 11175MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
+-----------------------------------------------------------------------------+
nvidia-container-cli -k -d /dev/tty info
-- WARNING, the following logs are for debugging purposes only --
I0913 20:19:32.954928 591 nvc.c:372] initializing library context (version=1.5.0, build=4699c1b8b4991b6d869ea403e109291653bb040b) I0913 20:19:32.955339 591 nvc.c:346] using root / I0913 20:19:32.955386 591 nvc.c:347] using ldcache /etc/ld.so.cache I0913 20:19:32.955422 591 nvc.c:348] using unprivileged user 65534:65534 I0913 20:19:32.955509 591 nvc.c:389] attempting to load dxcore to see if we are running under Windows Subsystem for Linux (WSL) I0913 20:19:32.956299 591 nvc.c:391] dxcore initialization failed, continuing assuming a non-WSL environment W0913 20:19:32.956416 591 nvc.c:249] skipping kernel modules load due to user namespace I0913 20:19:32.956870 592 driver.c:101] starting driver service I0913 20:19:32.958866 591 nvc_info.c:750] requesting driver information with '' I0913 20:19:32.959672 591 nvc_info.c:171] selecting /usr/lib/x86_64-linux-gnu/libnvoptix.so.470.63.01 I0913 20:19:32.959733 591 nvc_info.c:171] selecting /usr/lib/x86_64-linux-gnu/libnvidia-tls.so.470.63.01 I0913 20:19:32.959775 591 nvc_info.c:171] selecting /usr/lib/x86_64-linux-gnu/libnvidia-rtcore.so.470.63.01 I0913 20:19:32.959809 591 nvc_info.c:171] selecting /usr/lib/x86_64-linux-gnu/libnvidia-ptxjitcompiler.so.470.63.01 I0913 20:19:32.959857 591 nvc_info.c:171] selecting /usr/lib/x86_64-linux-gnu/libnvidia-opticalflow.so.470.63.01 I0913 20:19:32.959900 591 nvc_info.c:171] selecting /usr/lib/x86_64-linux-gnu/libnvidia-opencl.so.470.63.01 I0913 20:19:32.959931 591 nvc_info.c:171] selecting /usr/lib/x86_64-linux-gnu/libnvidia-ngx.so.470.63.01 I0913 20:19:32.959965 591 nvc_info.c:171] selecting /usr/lib/x86_64-linux-gnu/libnvidia-ml.so.470.63.01 I0913 20:19:32.960007 591 nvc_info.c:171] selecting /usr/lib/x86_64-linux-gnu/libnvidia-ifr.so.470.63.01 I0913 20:19:32.960053 591 nvc_info.c:171] selecting /usr/lib/x86_64-linux-gnu/libnvidia-glvkspirv.so.470.63.01 I0913 20:19:32.960081 591 nvc_info.c:171] selecting /usr/lib/x86_64-linux-gnu/libnvidia-glsi.so.470.63.01 I0913 20:19:32.960115 591 
nvc_info.c:171] selecting /usr/lib/x86_64-linux-gnu/libnvidia-glcore.so.470.63.01 I0913 20:19:32.960144 591 nvc_info.c:171] selecting /usr/lib/x86_64-linux-gnu/libnvidia-fbc.so.470.63.01 I0913 20:19:32.960189 591 nvc_info.c:171] selecting /usr/lib/x86_64-linux-gnu/libnvidia-encode.so.470.63.01 I0913 20:19:32.960231 591 nvc_info.c:171] selecting /usr/lib/x86_64-linux-gnu/libnvidia-eglcore.so.470.63.01 I0913 20:19:32.960260 591 nvc_info.c:171] selecting /usr/lib/x86_64-linux-gnu/libnvidia-compiler.so.470.63.01 I0913 20:19:32.960293 591 nvc_info.c:171] selecting /usr/lib/x86_64-linux-gnu/libnvidia-cfg.so.470.63.01 I0913 20:19:32.960335 591 nvc_info.c:171] selecting /usr/lib/x86_64-linux-gnu/libnvidia-cbl.so.470.63.01 I0913 20:19:32.960368 591 nvc_info.c:171] selecting /usr/lib/x86_64-linux-gnu/libnvidia-allocator.so.470.63.01 I0913 20:19:32.960414 591 nvc_info.c:171] selecting /usr/lib/x86_64-linux-gnu/libnvcuvid.so.470.63.01 I0913 20:19:32.960497 591 nvc_info.c:171] selecting /usr/lib/x86_64-linux-gnu/libcuda.so.470.63.01 I0913 20:19:32.960557 591 nvc_info.c:171] selecting /usr/lib/x86_64-linux-gnu/libGLX_nvidia.so.470.63.01 I0913 20:19:32.960590 591 nvc_info.c:171] selecting /usr/lib/x86_64-linux-gnu/libGLESv2_nvidia.so.470.63.01 I0913 20:19:32.960617 591 nvc_info.c:171] selecting /usr/lib/x86_64-linux-gnu/libGLESv1_CM_nvidia.so.470.63.01 I0913 20:19:32.960645 591 nvc_info.c:171] selecting /usr/lib/x86_64-linux-gnu/libEGL_nvidia.so.470.63.01 W0913 20:19:32.960660 591 nvc_info.c:392] missing library libnvidia-nscq.so W0913 20:19:32.960665 591 nvc_info.c:392] missing library libnvidia-fatbinaryloader.so W0913 20:19:32.960669 591 nvc_info.c:392] missing library libvdpau_nvidia.so W0913 20:19:32.960674 591 nvc_info.c:396] missing compat32 library libnvidia-ml.so W0913 20:19:32.960678 591 nvc_info.c:396] missing compat32 library libnvidia-cfg.so W0913 20:19:32.960682 591 nvc_info.c:396] missing compat32 library libnvidia-nscq.so W0913 20:19:32.960687 591 nvc_info.c:396] 
missing compat32 library libcuda.so W0913 20:19:32.960691 591 nvc_info.c:396] missing compat32 library libnvidia-opencl.so W0913 20:19:32.960695 591 nvc_info.c:396] missing compat32 library libnvidia-ptxjitcompiler.so W0913 20:19:32.960700 591 nvc_info.c:396] missing compat32 library libnvidia-fatbinaryloader.so W0913 20:19:32.960704 591 nvc_info.c:396] missing compat32 library libnvidia-allocator.so W0913 20:19:32.960708 591 nvc_info.c:396] missing compat32 library libnvidia-compiler.so W0913 20:19:32.960714 591 nvc_info.c:396] missing compat32 library libnvidia-ngx.so W0913 20:19:32.960719 591 nvc_info.c:396] missing compat32 library libvdpau_nvidia.so W0913 20:19:32.960724 591 nvc_info.c:396] missing compat32 library libnvidia-encode.so W0913 20:19:32.960727 591 nvc_info.c:396] missing compat32 library libnvidia-opticalflow.so W0913 20:19:32.960731 591 nvc_info.c:396] missing compat32 library libnvcuvid.so W0913 20:19:32.960735 591 nvc_info.c:396] missing compat32 library libnvidia-eglcore.so W0913 20:19:32.960739 591 nvc_info.c:396] missing compat32 library libnvidia-glcore.so W0913 20:19:32.960744 591 nvc_info.c:396] missing compat32 library libnvidia-tls.so W0913 20:19:32.960749 591 nvc_info.c:396] missing compat32 library libnvidia-glsi.so W0913 20:19:32.960753 591 nvc_info.c:396] missing compat32 library libnvidia-fbc.so W0913 20:19:32.960757 591 nvc_info.c:396] missing compat32 library libnvidia-ifr.so W0913 20:19:32.960762 591 nvc_info.c:396] missing compat32 library libnvidia-rtcore.so W0913 20:19:32.960765 591 nvc_info.c:396] missing compat32 library libnvoptix.so W0913 20:19:32.960769 591 nvc_info.c:396] missing compat32 library libGLX_nvidia.so W0913 20:19:32.960773 591 nvc_info.c:396] missing compat32 library libEGL_nvidia.so W0913 20:19:32.960778 591 nvc_info.c:396] missing compat32 library libGLESv2_nvidia.so W0913 20:19:32.960783 591 nvc_info.c:396] missing compat32 library libGLESv1_CM_nvidia.so W0913 20:19:32.960788 591 nvc_info.c:396] missing 
compat32 library libnvidia-glvkspirv.so W0913 20:19:32.960792 591 nvc_info.c:396] missing compat32 library libnvidia-cbl.so I0913 20:19:32.961010 591 nvc_info.c:297] selecting /usr/bin/nvidia-smi I0913 20:19:32.961030 591 nvc_info.c:297] selecting /usr/bin/nvidia-debugdump I0913 20:19:32.961046 591 nvc_info.c:297] selecting /usr/bin/nvidia-persistenced I0913 20:19:32.961072 591 nvc_info.c:297] selecting /usr/bin/nvidia-cuda-mps-control I0913 20:19:32.961092 591 nvc_info.c:297] selecting /usr/bin/nvidia-cuda-mps-server W0913 20:19:32.961139 591 nvc_info.c:418] missing binary nv-fabricmanager I0913 20:19:32.961177 591 nvc_info.c:512] listing device /dev/nvidiactl I0913 20:19:32.961184 591 nvc_info.c:512] listing device /dev/nvidia-uvm I0913 20:19:32.961191 591 nvc_info.c:512] listing device /dev/nvidia-uvm-tools I0913 20:19:32.961196 591 nvc_info.c:512] listing device /dev/nvidia-modeset W0913 20:19:32.961223 591 nvc_info.c:342] missing ipc /var/run/nvidia-persistenced/socket W0913 20:19:32.961247 591 nvc_info.c:342] missing ipc /var/run/nvidia-fabricmanager/socket W0913 20:19:32.961264 591 nvc_info.c:342] missing ipc /tmp/nvidia-mps I0913 20:19:32.961270 591 nvc_info.c:805] requesting device information with '' I0913 20:19:32.966964 591 nvc_info.c:697] listing device /dev/nvidia0 (GPU-06986d8e-47c3-467c-c6bc-0a30ae3fbd30 at 00000000:01:00.0) NVRM version: 470.63.01 CUDA version: 11.4
Device Index: 0 Device Minor: 0 Model: NVIDIA GeForce GTX 1080 Ti Brand: GeForce GPU UUID: GPU-06986d8e-47c3-467c-c6bc-0a30ae3fbd30 Bus Location: 00000000:01:00.0 Architecture: 6.1 I0913 20:19:32.966997 591 nvc.c:423] shutting down library context I0913 20:19:32.967215 592 driver.c:163] terminating driver service I0913 20:19:32.967500 591 driver.c:203] driver service terminated successfully
Linux demo4 5.11.0-34-generic #36-Ubuntu SMP Thu Aug 26 19:22:09 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
[ 1517.263463] docker0: port 1(vethce5fe56) entered blocking state
[ 1517.263475] docker0: port 1(vethce5fe56) entered disabled state
[ 1517.263652] device vethce5fe56 entered promiscuous mode
[ 1517.622529] docker0: port 1(vethce5fe56) entered disabled state
[ 1517.628590] device vethce5fe56 left promiscuous mode
[ 1517.628603] docker0: port 1(vethce5fe56) entered disabled state
from the host
==============NVSMI LOG==============
Timestamp : Mon Sep 13 22:23:14 2021 Driver Version : 470.63.01 CUDA Version : 11.4
Attached GPUs : 1 GPU 00000000:01:00.0 Product Name : NVIDIA GeForce GTX 1080 Ti Product Brand : GeForce Display Mode : Enabled Display Active : Disabled Persistence Mode : Disabled MIG Mode Current : N/A Pending : N/A Accounting Mode : Disabled Accounting Mode Buffer Size : 4000 Driver Model Current : N/A Pending : N/A Serial Number : N/A GPU UUID : GPU-06986d8e-47c3-467c-c6bc-0a30ae3fbd30 Minor Number : 0 VBIOS Version : 86.02.39.00.FF MultiGPU Board : No Board ID : 0x100 GPU Part Number : N/A Module ID : 0 Inforom Version Image Version : G001.0000.01.04 OEM Object : 1.1 ECC Object : N/A Power Management Object : N/A GPU Operation Mode Current : N/A Pending : N/A GSP Firmware Version : N/A GPU Virtualization Mode Virtualization Mode : None Host VGPU Mode : N/A IBMNPU Relaxed Ordering Mode : N/A PCI Bus : 0x01 Device : 0x00 Domain : 0x0000 Device Id : 0x1B0610DE Bus Id : 00000000:01:00.0 Sub System Id : 0x376A1458 GPU Link Info PCIe Generation Max : 3 Current : 1 Link Width Max : 16x Current : 16x Bridge Chip Type : N/A Firmware : N/A Replays Since Reset : 0 Replay Number Rollovers : 0 Tx Throughput : 0 KB/s Rx Throughput : 0 KB/s Fan Speed : 16 % Performance State : P8 Clocks Throttle Reasons Idle : Active Applications Clocks Setting : Not Active SW Power Cap : Not Active HW Slowdown : Not Active HW Thermal Slowdown : Not Active HW Power Brake Slowdown : Not Active Sync Boost : Not Active SW Thermal Slowdown : Not Active Display Clock Setting : Not Active FB Memory Usage Total : 11175 MiB Used : 178 MiB Free : 10997 MiB BAR1 Memory Usage Total : 256 MiB Used : 5 MiB Free : 251 MiB Compute Mode : Default Utilization Gpu : 0 % Memory : 0 % Encoder : 0 % Decoder : 0 % Encoder Stats Active Sessions : 0 Average FPS : 0 Average Latency : 0 FBC Stats Active Sessions : 0 Average FPS : 0 Average Latency : 0 Ecc Mode Current : N/A Pending : N/A ECC Errors Volatile Single Bit Device Memory : N/A Register File : N/A L1 Cache : N/A L2 Cache : N/A Texture Memory : N/A Texture 
Shared : N/A CBU : N/A Total : N/A Double Bit Device Memory : N/A Register File : N/A L1 Cache : N/A L2 Cache : N/A Texture Memory : N/A Texture Shared : N/A CBU : N/A Total : N/A Aggregate Single Bit Device Memory : N/A Register File : N/A L1 Cache : N/A L2 Cache : N/A Texture Memory : N/A Texture Shared : N/A CBU : N/A Total : N/A Double Bit Device Memory : N/A Register File : N/A L1 Cache : N/A L2 Cache : N/A Texture Memory : N/A Texture Shared : N/A CBU : N/A Total : N/A Retired Pages Single Bit ECC : N/A Double Bit ECC : N/A Pending Page Blacklist : N/A Remapped Rows : N/A Temperature GPU Current Temp : 28 C GPU Shutdown Temp : 96 C GPU Slowdown Temp : 93 C GPU Max Operating Temp : N/A GPU Target Temperature : 84 C Memory Current Temp : N/A Memory Max Operating Temp : N/A Power Readings Power Management : Supported Power Draw : 16.48 W Power Limit : 250.00 W Default Power Limit : 250.00 W Enforced Power Limit : 250.00 W Min Power Limit : 125.00 W Max Power Limit : 375.00 W Clocks Graphics : 139 MHz SM : 139 MHz Memory : 405 MHz Video : 544 MHz Applications Clocks Graphics : N/A Memory : N/A Default Applications Clocks Graphics : N/A Memory : N/A Max Clocks Graphics : 2037 MHz SM : 2037 MHz Memory : 5616 MHz Video : 1620 MHz Max Customer Boost Clocks Graphics : N/A Clock Policy Auto Boost : N/A Auto Boost Default : N/A Voltage Graphics : N/A Processes GPU instance ID : N/A Compute instance ID : N/A Process ID : 5026 Type : G Name : /usr/lib/xorg/Xorg Used GPU Memory : 167 MiB GPU instance ID : N/A Compute instance ID : N/A Process ID : 5324 Type : G Name : /usr/bin/gnome-shell Used GPU Memory : 8 MiB
from the LXD container:
==============NVSMI LOG==============
Timestamp : Mon Sep 13 20:23:59 2021 Driver Version : 470.63.01 CUDA Version : 11.4
Attached GPUs : 1 GPU 00000000:01:00.0 Product Name : NVIDIA GeForce GTX 1080 Ti Product Brand : GeForce Display Mode : Enabled Display Active : Disabled Persistence Mode : Disabled MIG Mode Current : N/A Pending : N/A Accounting Mode : Disabled Accounting Mode Buffer Size : 4000 Driver Model Current : N/A Pending : N/A Serial Number : N/A GPU UUID : GPU-06986d8e-47c3-467c-c6bc-0a30ae3fbd30 Minor Number : 0 VBIOS Version : 86.02.39.00.FF MultiGPU Board : No Board ID : 0x100 GPU Part Number : N/A Module ID : 0 Inforom Version Image Version : G001.0000.01.04 OEM Object : 1.1 ECC Object : N/A Power Management Object : N/A GPU Operation Mode Current : N/A Pending : N/A GSP Firmware Version : N/A GPU Virtualization Mode Virtualization Mode : None Host VGPU Mode : N/A IBMNPU Relaxed Ordering Mode : N/A PCI Bus : 0x01 Device : 0x00 Domain : 0x0000 Device Id : 0x1B0610DE Bus Id : 00000000:01:00.0 Sub System Id : 0x376A1458 GPU Link Info PCIe Generation Max : 3 Current : 1 Link Width Max : 16x Current : 16x Bridge Chip Type : N/A Firmware : N/A Replays Since Reset : 0 Replay Number Rollovers : 0 Tx Throughput : 0 KB/s Rx Throughput : 0 KB/s Fan Speed : 16 % Performance State : P8 Clocks Throttle Reasons Idle : Active Applications Clocks Setting : Not Active SW Power Cap : Not Active HW Slowdown : Not Active HW Thermal Slowdown : Not Active HW Power Brake Slowdown : Not Active Sync Boost : Not Active SW Thermal Slowdown : Not Active Display Clock Setting : Not Active FB Memory Usage Total : 11175 MiB Used : 178 MiB Free : 10997 MiB BAR1 Memory Usage Total : 256 MiB Used : 5 MiB Free : 251 MiB Compute Mode : Default Utilization Gpu : 0 % Memory : 0 % Encoder : 0 % Decoder : 0 % Encoder Stats Active Sessions : 0 Average FPS : 0 Average Latency : 0 FBC Stats Active Sessions : 0 Average FPS : 0 Average Latency : 0 Ecc Mode Current : N/A Pending : N/A ECC Errors Volatile Single Bit Device Memory : N/A Register File : N/A L1 Cache : N/A L2 Cache : N/A Texture Memory : N/A Texture 
Shared : N/A CBU : N/A Total : N/A Double Bit Device Memory : N/A Register File : N/A L1 Cache : N/A L2 Cache : N/A Texture Memory : N/A Texture Shared : N/A CBU : N/A Total : N/A Aggregate Single Bit Device Memory : N/A Register File : N/A L1 Cache : N/A L2 Cache : N/A Texture Memory : N/A Texture Shared : N/A CBU : N/A Total : N/A Double Bit Device Memory : N/A Register File : N/A L1 Cache : N/A L2 Cache : N/A Texture Memory : N/A Texture Shared : N/A CBU : N/A Total : N/A Retired Pages Single Bit ECC : N/A Double Bit ECC : N/A Pending Page Blacklist : N/A Remapped Rows : N/A Temperature GPU Current Temp : 28 C GPU Shutdown Temp : 96 C GPU Slowdown Temp : 93 C GPU Max Operating Temp : N/A GPU Target Temperature : 84 C Memory Current Temp : N/A Memory Max Operating Temp : N/A Power Readings Power Management : Supported Power Draw : 17.30 W Power Limit : 250.00 W Default Power Limit : 250.00 W Enforced Power Limit : 250.00 W Min Power Limit : 125.00 W Max Power Limit : 375.00 W Clocks Graphics : 139 MHz SM : 139 MHz Memory : 405 MHz Video : 544 MHz Applications Clocks Graphics : N/A Memory : N/A Default Applications Clocks Graphics : N/A Memory : N/A Max Clocks Graphics : 2037 MHz SM : 2037 MHz Memory : 5616 MHz Video : 1620 MHz Max Customer Boost Clocks Graphics : N/A Clock Policy Auto Boost : N/A Auto Boost Default : N/A Voltage Graphics : N/A Processes : None
Client: Docker Engine - Community
 Version:           20.10.8
 API version:       1.41
 Go version:        go1.16.6
 Git commit:        3967b7d
 Built:             Fri Jul 30 19:53:57 2021
 OS/Arch:           linux/amd64
 Context:           default
 Experimental:      true

Server: Docker Engine - Community
 Engine:
  Version:          20.10.8
  API version:      1.41 (minimum version 1.12)
  Go version:       go1.16.6
  Git commit:       75249d8
  Built:            Fri Jul 30 19:52:06 2021
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.4.9
  GitCommit:        e25210fe30a0a703442421b0f60afac609f950a3
 runc:
  Version:          1.0.1
  GitCommit:        v1.0.1-0-g4144b63
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0
from the host:
Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
|/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
||/ Name Version Architecture Description
+++-=============================-==========================-============-=========================================================
un libgldispatch0-nvidia (no description available)
ii libnvidia-cfg1-470:amd64 470.63.01-0ubuntu0.21.04.2 amd64 NVIDIA binary OpenGL/GLX configuration library
un libnvidia-cfg1-any (no description available)
un libnvidia-common (no description available)
ii libnvidia-common-470 470.63.01-0ubuntu0.21.04.2 all Shared files used by the NVIDIA libraries
un libnvidia-compute (no description available)
rc libnvidia-compute-460:amd64 460.73.01-0ubuntu1 amd64 NVIDIA libcompute package
rc libnvidia-compute-465:amd64 465.19.01-0ubuntu1 amd64 NVIDIA libcompute package
ii libnvidia-compute-470:amd64 470.63.01-0ubuntu0.21.04.2 amd64 NVIDIA libcompute package
ii libnvidia-container-tools 1.3.3-1 amd64 NVIDIA container runtime library (command-line tools)
ii libnvidia-container1:amd64 1.3.3-1 amd64 NVIDIA container runtime library
un libnvidia-decode (no description available)
ii libnvidia-decode-470:amd64 470.63.01-0ubuntu0.21.04.2 amd64 NVIDIA Video Decoding runtime libraries
un libnvidia-encode (no description available)
ii libnvidia-encode-470:amd64 470.63.01-0ubuntu0.21.04.2 amd64 NVENC Video Encoding runtime library
un libnvidia-extra (no description available)
ii libnvidia-extra-470:amd64 470.63.01-0ubuntu0.21.04.2 amd64 Extra libraries for the NVIDIA driver
un libnvidia-fbc1 (no description available)
ii libnvidia-fbc1-470:amd64 470.63.01-0ubuntu0.21.04.2 amd64 NVIDIA OpenGL-based Framebuffer Capture runtime library
un libnvidia-gl (no description available)
ii libnvidia-gl-470:amd64 470.63.01-0ubuntu0.21.04.2 amd64 NVIDIA OpenGL/GLX/EGL/GLES GLVND libraries and Vulkan ICD
un libnvidia-ifr1 (no description available)
ii libnvidia-ifr1-470:amd64 470.63.01-0ubuntu0.21.04.2 amd64 NVIDIA OpenGL-based Inband Frame Readback runtime library
un libnvidia-ml1 (no description available)
un nvidia-384 (no description available)
un nvidia-390 (no description available)
un nvidia-common (no description available)
un nvidia-compute-utils (no description available)
rc nvidia-compute-utils-460 460.73.01-0ubuntu1 amd64 NVIDIA compute utilities
rc nvidia-compute-utils-465 465.19.01-0ubuntu1 amd64 NVIDIA compute utilities
ii nvidia-compute-utils-470 470.63.01-0ubuntu0.21.04.2 amd64 NVIDIA compute utilities
ii nvidia-container-runtime 3.4.2-1 amd64 NVIDIA container runtime
un nvidia-container-runtime-hook (no description available)
ii nvidia-container-toolkit 1.4.2-1 amd64 NVIDIA container runtime hook
rc nvidia-dkms-460 460.73.01-0ubuntu1 amd64 NVIDIA DKMS package
rc nvidia-dkms-465 465.19.01-0ubuntu1 amd64 NVIDIA DKMS package
ii nvidia-dkms-470 470.63.01-0ubuntu0.21.04.2 amd64 NVIDIA DKMS package
un nvidia-dkms-kernel (no description available)
un nvidia-docker (no description available)
ii nvidia-docker2 2.5.0-1 all nvidia-docker CLI wrapper
ii nvidia-driver-470 470.63.01-0ubuntu0.21.04.2 amd64 NVIDIA driver metapackage
un nvidia-driver-binary (no description available)
un nvidia-kernel-common (no description available)
rc nvidia-kernel-common-460 460.73.01-0ubuntu1 amd64 Shared files used with the kernel module
rc nvidia-kernel-common-465 465.19.01-0ubuntu1 amd64 Shared files used with the kernel module
ii nvidia-kernel-common-470 470.63.01-0ubuntu0.21.04.2 amd64 Shared files used with the kernel module
un nvidia-kernel-source (no description available)
un nvidia-kernel-source-460 (no description available)
un nvidia-kernel-source-465 (no description available)
ii nvidia-kernel-source-470 470.63.01-0ubuntu0.21.04.2 amd64 NVIDIA kernel source package
un nvidia-libopencl1-dev (no description available)
ii nvidia-modprobe 470.57.02-0ubuntu1 amd64 Load the NVIDIA kernel driver and create device files
un nvidia-opencl-icd (no description available)
un nvidia-persistenced (no description available)
ii nvidia-prime 0.8.16.1 all Tools to enable NVIDIA's Prime
ii nvidia-settings 470.57.02-0ubuntu1 amd64 Tool for configuring the NVIDIA graphics driver
un nvidia-settings-binary (no description available)
un nvidia-smi (no description available)
un nvidia-utils (no description available)
ii nvidia-utils-470 470.63.01-0ubuntu0.21.04.2 amd64 NVIDIA driver support binaries
ii xserver-xorg-video-nvidia-470 470.63.01-0ubuntu0.21.04.2 amd64 NVIDIA binary Xorg driver
from within the LXD container:
Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
|/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
||/ Name                          Version                    Architecture Description
+++-=============================-==========================-============-=========================================================
un libgldispatch0-nvidia (no description available)
ii libnvidia-cfg1-470:amd64 470.63.01-0ubuntu0.21.04.2 amd64 NVIDIA binary OpenGL/GLX configuration library
un libnvidia-cfg1-any (no description available)
un libnvidia-common (no description available)
ii libnvidia-common-470 470.63.01-0ubuntu0.21.04.2 all Shared files used by the NVIDIA libraries
un libnvidia-compute (no description available)
ii libnvidia-compute-470:amd64 470.63.01-0ubuntu0.21.04.2 amd64 NVIDIA libcompute package
ii libnvidia-container-tools 1.5.0-1 amd64 NVIDIA container runtime library (command-line tools)
ii libnvidia-container1:amd64 1.5.0-1 amd64 NVIDIA container runtime library
un libnvidia-decode (no description available)
ii libnvidia-decode-470:amd64 470.63.01-0ubuntu0.21.04.2 amd64 NVIDIA Video Decoding runtime libraries
un libnvidia-encode (no description available)
ii libnvidia-encode-470:amd64 470.63.01-0ubuntu0.21.04.2 amd64 NVENC Video Encoding runtime library
un libnvidia-extra (no description available)
ii libnvidia-extra-470:amd64 470.63.01-0ubuntu0.21.04.2 amd64 Extra libraries for the NVIDIA driver
un libnvidia-fbc1 (no description available)
ii libnvidia-fbc1-470:amd64 470.63.01-0ubuntu0.21.04.2 amd64 NVIDIA OpenGL-based Framebuffer Capture runtime library
un libnvidia-gl (no description available)
ii libnvidia-gl-470:amd64 470.63.01-0ubuntu0.21.04.2 amd64 NVIDIA OpenGL/GLX/EGL/GLES GLVND libraries and Vulkan ICD
un libnvidia-ifr1 (no description available)
ii libnvidia-ifr1-470:amd64 470.63.01-0ubuntu0.21.04.2 amd64 NVIDIA OpenGL-based Inband Frame Readback runtime library
un libnvidia-ml1 (no description available)
un nvidia-384 (no description available)
un nvidia-390 (no description available)
un nvidia-compute-utils (no description available)
ii nvidia-compute-utils-470 470.63.01-0ubuntu0.21.04.2 amd64 NVIDIA compute utilities
ii nvidia-container-runtime 3.5.0-1 amd64 NVIDIA container runtime
un nvidia-container-runtime-hook (no description available)
ii nvidia-container-toolkit 1.5.1-1 amd64 NVIDIA container runtime hook
ii nvidia-dkms-470 470.63.01-0ubuntu0.21.04.2 amd64 NVIDIA DKMS package
un nvidia-dkms-kernel (no description available)
un nvidia-docker (no description available)
ii nvidia-docker2 2.6.0-1 all nvidia-docker CLI wrapper
ii nvidia-driver-470 470.63.01-0ubuntu0.21.04.2 amd64 NVIDIA driver metapackage
un nvidia-driver-binary (no description available)
un nvidia-kernel-common (no description available)
ii nvidia-kernel-common-470 470.63.01-0ubuntu0.21.04.2 amd64 Shared files used with the kernel module
un nvidia-kernel-source (no description available)
ii nvidia-kernel-source-470 470.63.01-0ubuntu0.21.04.2 amd64 NVIDIA kernel source package
un nvidia-opencl-icd (no description available)
un nvidia-persistenced (no description available)
un nvidia-prime (no description available)
un nvidia-settings (no description available)
un nvidia-smi (no description available)
un nvidia-utils (no description available)
ii nvidia-utils-470 470.63.01-0ubuntu0.21.04.2 amd64 NVIDIA driver support binaries
ii xserver-xorg-video-nvidia-470 470.63.01-0ubuntu0.21.04.2 amd64 NVIDIA binary Xorg driver
version: 1.5.0
build date: 2021-09-02T08:39+00:00
build revision: 4699c1b8b4991b6d869ea403e109291653bb040b
build compiler: x86_64-linux-gnu-gcc-7 7.5.0
build platform: x86_64
build flags: -D_GNU_SOURCE -D_FORTIFY_SOURCE=2 -DNDEBUG -std=gnu11 -O2 -g -fdata-sections -ffunction-sections -fstack-protector -fno-strict-aliasing -fvisibility=hidden -Wall -Wextra -Wcast-align -Wpointer-arith -Wmissing-prototypes -Wnonnull -Wwrite-strings -Wlogical-op -Wformat=2 -Wmissing-format-attribute -Winit-self -Wshadow -Wstrict-prototypes -Wunreachable-code -Wconversion -Wsign-conversion -Wno-unknown-warning-option -Wno-format-extra-args -Wno-gnu-alignof-expression -Wl,-zrelro -Wl,-znow -Wl,-zdefs -Wl,--gc-sections
cat /var/log/nvidia-container-toolkit.log
-- WARNING, the following logs are for debugging purposes only --
I0913 20:33:29.853991 1004 nvc.c:372] initializing library context (version=1.5.0, build=4699c1b8b4991b6d869ea403e109291653bb040b)
I0913 20:33:29.854198 1004 nvc.c:346] using root /
I0913 20:33:29.854244 1004 nvc.c:347] using ldcache /etc/ld.so.cache
I0913 20:33:29.854281 1004 nvc.c:348] using unprivileged user 65534:65534
I0913 20:33:29.854338 1004 nvc.c:389] attempting to load dxcore to see if we are running under Windows Subsystem for Linux (WSL)
I0913 20:33:29.854676 1004 nvc.c:391] dxcore initialization failed, continuing assuming a non-WSL environment
W0913 20:33:29.854752 1004 nvc.c:249] skipping kernel modules load due to user namespace
I0913 20:33:29.854976 1010 driver.c:101] starting driver service
I0913 20:33:29.863469 1004 nvc_container.c:388] configuring container with 'compute utility supervised'
I0913 20:33:29.863950 1004 nvc_container.c:236] selecting /var/lib/docker/btrfs/subvolumes/be11006c908fb293162fe6b4ded3bdacc0858a9f4f82a98372c000d5e769f6e0/usr/local/cuda-11.4/compat/libcuda.so.470.57.02
I0913 20:33:29.864131 1004 nvc_container.c:236] selecting /var/lib/docker/btrfs/subvolumes/be11006c908fb293162fe6b4ded3bdacc0858a9f4f82a98372c000d5e769f6e0/usr/local/cuda-11.4/compat/libnvidia-ptxjitcompiler.so.470.57.02
I0913 20:33:29.864518 1004 nvc_container.c:408] setting pid to 998
I0913 20:33:29.864566 1004 nvc_container.c:409] setting rootfs to /var/lib/docker/btrfs/subvolumes/be11006c908fb293162fe6b4ded3bdacc0858a9f4f82a98372c000d5e769f6e0
I0913 20:33:29.864619 1004 nvc_container.c:410] setting owner to 0:0
I0913 20:33:29.864656 1004 nvc_container.c:411] setting bins directory to /usr/bin
I0913 20:33:29.864693 1004 nvc_container.c:412] setting libs directory to /usr/lib/x86_64-linux-gnu
I0913 20:33:29.864728 1004 nvc_container.c:413] setting libs32 directory to /usr/lib/i386-linux-gnu
I0913 20:33:29.864764 1004 nvc_container.c:414] setting cudart directory to /usr/local/cuda
I0913 20:33:29.864800 1004 nvc_container.c:415] setting ldconfig to @/sbin/ldconfig.real (host relative)
I0913 20:33:29.864847 1004 nvc_container.c:416] setting mount namespace to /proc/998/ns/mnt
I0913 20:33:29.864883 1004 nvc_container.c:418] setting devices cgroup to /sys/fs/cgroup/devices/docker/dd7f4ee43c878e6ce63ccaba0c9b9a10d2834add60afb23ae14db0d2f90fb694
I0913 20:33:29.864928 1004 nvc_info.c:750] requesting driver information with ''
I0913 20:33:29.866900 1004 nvc_info.c:171] selecting /usr/lib/x86_64-linux-gnu/libnvoptix.so.470.63.01
I0913 20:33:29.867044 1004 nvc_info.c:171] selecting /usr/lib/x86_64-linux-gnu/libnvidia-tls.so.470.63.01
I0913 20:33:29.867174 1004 nvc_info.c:171] selecting /usr/lib/x86_64-linux-gnu/libnvidia-rtcore.so.470.63.01
I0913 20:33:29.867286 1004 nvc_info.c:171] selecting /usr/lib/x86_64-linux-gnu/libnvidia-ptxjitcompiler.so.470.63.01
I0913 20:33:29.867431 1004 nvc_info.c:171] selecting /usr/lib/x86_64-linux-gnu/libnvidia-opticalflow.so.470.63.01
I0913 20:33:29.867572 1004 nvc_info.c:171] selecting /usr/lib/x86_64-linux-gnu/libnvidia-opencl.so.470.63.01
I0913 20:33:29.867698 1004 nvc_info.c:171] selecting /usr/lib/x86_64-linux-gnu/libnvidia-ngx.so.470.63.01
I0913 20:33:29.867809 1004 nvc_info.c:171] selecting /usr/lib/x86_64-linux-gnu/libnvidia-ml.so.470.63.01
I0913 20:33:29.867957 1004 nvc_info.c:171] selecting /usr/lib/x86_64-linux-gnu/libnvidia-ifr.so.470.63.01
I0913 20:33:29.868097 1004 nvc_info.c:171] selecting /usr/lib/x86_64-linux-gnu/libnvidia-glvkspirv.so.470.63.01
I0913 20:33:29.868202 1004 nvc_info.c:171] selecting /usr/lib/x86_64-linux-gnu/libnvidia-glsi.so.470.63.01
I0913 20:33:29.868324 1004 nvc_info.c:171] selecting /usr/lib/x86_64-linux-gnu/libnvidia-glcore.so.470.63.01
I0913 20:33:29.868435 1004 nvc_info.c:171] selecting /usr/lib/x86_64-linux-gnu/libnvidia-fbc.so.470.63.01
I0913 20:33:29.868578 1004 nvc_info.c:171] selecting /usr/lib/x86_64-linux-gnu/libnvidia-encode.so.470.63.01
I0913 20:33:29.868720 1004 nvc_info.c:171] selecting /usr/lib/x86_64-linux-gnu/libnvidia-eglcore.so.470.63.01
I0913 20:33:29.868826 1004 nvc_info.c:171] selecting /usr/lib/x86_64-linux-gnu/libnvidia-compiler.so.470.63.01
I0913 20:33:29.868953 1004 nvc_info.c:171] selecting /usr/lib/x86_64-linux-gnu/libnvidia-cfg.so.470.63.01
I0913 20:33:29.869096 1004 nvc_info.c:171] selecting /usr/lib/x86_64-linux-gnu/libnvidia-cbl.so.470.63.01
I0913 20:33:29.869204 1004 nvc_info.c:171] selecting /usr/lib/x86_64-linux-gnu/libnvidia-allocator.so.470.63.01
I0913 20:33:29.869348 1004 nvc_info.c:171] selecting /usr/lib/x86_64-linux-gnu/libnvcuvid.so.470.63.01
I0913 20:33:29.869595 1004 nvc_info.c:171] selecting /usr/lib/x86_64-linux-gnu/libcuda.so.470.63.01
I0913 20:33:29.869810 1004 nvc_info.c:171] selecting /usr/lib/x86_64-linux-gnu/libGLX_nvidia.so.470.63.01
I0913 20:33:29.869925 1004 nvc_info.c:171] selecting /usr/lib/x86_64-linux-gnu/libGLESv2_nvidia.so.470.63.01
I0913 20:33:29.870033 1004 nvc_info.c:171] selecting /usr/lib/x86_64-linux-gnu/libGLESv1_CM_nvidia.so.470.63.01
I0913 20:33:29.870168 1004 nvc_info.c:171] selecting /usr/lib/x86_64-linux-gnu/libEGL_nvidia.so.470.63.01
W0913 20:33:29.870258 1004 nvc_info.c:392] missing library libnvidia-nscq.so
W0913 20:33:29.870301 1004 nvc_info.c:392] missing library libnvidia-fatbinaryloader.so
W0913 20:33:29.870337 1004 nvc_info.c:392] missing library libvdpau_nvidia.so
W0913 20:33:29.870373 1004 nvc_info.c:396] missing compat32 library libnvidia-ml.so
W0913 20:33:29.870409 1004 nvc_info.c:396] missing compat32 library libnvidia-cfg.so
W0913 20:33:29.870444 1004 nvc_info.c:396] missing compat32 library libnvidia-nscq.so
W0913 20:33:29.870494 1004 nvc_info.c:396] missing compat32 library libcuda.so
W0913 20:33:29.870531 1004 nvc_info.c:396] missing compat32 library libnvidia-opencl.so
W0913 20:33:29.870567 1004 nvc_info.c:396] missing compat32 library libnvidia-ptxjitcompiler.so
W0913 20:33:29.870602 1004 nvc_info.c:396] missing compat32 library libnvidia-fatbinaryloader.so
W0913 20:33:29.870638 1004 nvc_info.c:396] missing compat32 library libnvidia-allocator.so
W0913 20:33:29.870673 1004 nvc_info.c:396] missing compat32 library libnvidia-compiler.so
W0913 20:33:29.870722 1004 nvc_info.c:396] missing compat32 library libnvidia-ngx.so
W0913 20:33:29.870758 1004 nvc_info.c:396] missing compat32 library libvdpau_nvidia.so
W0913 20:33:29.870795 1004 nvc_info.c:396] missing compat32 library libnvidia-encode.so
W0913 20:33:29.870830 1004 nvc_info.c:396] missing compat32 library libnvidia-opticalflow.so
W0913 20:33:29.870866 1004 nvc_info.c:396] missing compat32 library libnvcuvid.so
W0913 20:33:29.870902 1004 nvc_info.c:396] missing compat32 library libnvidia-eglcore.so
W0913 20:33:29.870949 1004 nvc_info.c:396] missing compat32 library libnvidia-glcore.so
W0913 20:33:29.870985 1004 nvc_info.c:396] missing compat32 library libnvidia-tls.so
W0913 20:33:29.871021 1004 nvc_info.c:396] missing compat32 library libnvidia-glsi.so
W0913 20:33:29.871057 1004 nvc_info.c:396] missing compat32 library libnvidia-fbc.so
W0913 20:33:29.871092 1004 nvc_info.c:396] missing compat32 library libnvidia-ifr.so
W0913 20:33:29.871128 1004 nvc_info.c:396] missing compat32 library libnvidia-rtcore.so
W0913 20:33:29.871176 1004 nvc_info.c:396] missing compat32 library libnvoptix.so
W0913 20:33:29.871213 1004 nvc_info.c:396] missing compat32 library libGLX_nvidia.so
W0913 20:33:29.871248 1004 nvc_info.c:396] missing compat32 library libEGL_nvidia.so
W0913 20:33:29.871283 1004 nvc_info.c:396] missing compat32 library libGLESv2_nvidia.so
W0913 20:33:29.871319 1004 nvc_info.c:396] missing compat32 library libGLESv1_CM_nvidia.so
W0913 20:33:29.871354 1004 nvc_info.c:396] missing compat32 library libnvidia-glvkspirv.so
W0913 20:33:29.871402 1004 nvc_info.c:396] missing compat32 library libnvidia-cbl.so
I0913 20:33:29.871985 1004 nvc_info.c:297] selecting /usr/bin/nvidia-smi
I0913 20:33:29.872089 1004 nvc_info.c:297] selecting /usr/bin/nvidia-debugdump
I0913 20:33:29.872170 1004 nvc_info.c:297] selecting /usr/bin/nvidia-persistenced
I0913 20:33:29.872269 1004 nvc_info.c:297] selecting /usr/bin/nvidia-cuda-mps-control
I0913 20:33:29.872342 1004 nvc_info.c:297] selecting /usr/bin/nvidia-cuda-mps-server
W0913 20:33:29.872652 1004 nvc_info.c:418] missing binary nv-fabricmanager
I0913 20:33:29.872750 1004 nvc_info.c:512] listing device /dev/nvidiactl
I0913 20:33:29.872793 1004 nvc_info.c:512] listing device /dev/nvidia-uvm
I0913 20:33:29.872829 1004 nvc_info.c:512] listing device /dev/nvidia-uvm-tools
I0913 20:33:29.872865 1004 nvc_info.c:512] listing device /dev/nvidia-modeset
W0913 20:33:29.872945 1004 nvc_info.c:342] missing ipc /var/run/nvidia-persistenced/socket
W0913 20:33:29.873024 1004 nvc_info.c:342] missing ipc /var/run/nvidia-fabricmanager/socket
W0913 20:33:29.873107 1004 nvc_info.c:342] missing ipc /tmp/nvidia-mps
I0913 20:33:29.873149 1004 nvc_info.c:805] requesting device information with ''
I0913 20:33:29.880072 1004 nvc_info.c:697] listing device /dev/nvidia0 (GPU-06986d8e-47c3-467c-c6bc-0a30ae3fbd30 at 00000000:01:00.0)
I0913 20:33:29.880343 1004 nvc_mount.c:344] mounting tmpfs at /var/lib/docker/btrfs/subvolumes/be11006c908fb293162fe6b4ded3bdacc0858a9f4f82a98372c000d5e769f6e0/proc/driver/nvidia
I0913 20:33:29.881942 1004 nvc_mount.c:112] mounting /usr/bin/nvidia-smi at /var/lib/docker/btrfs/subvolumes/be11006c908fb293162fe6b4ded3bdacc0858a9f4f82a98372c000d5e769f6e0/usr/bin/nvidia-smi
I0913 20:33:29.882284 1004 nvc_mount.c:112] mounting /usr/bin/nvidia-debugdump at /var/lib/docker/btrfs/subvolumes/be11006c908fb293162fe6b4ded3bdacc0858a9f4f82a98372c000d5e769f6e0/usr/bin/nvidia-debugdump
I0913 20:33:29.882570 1004 nvc_mount.c:112] mounting /usr/bin/nvidia-persistenced at /var/lib/docker/btrfs/subvolumes/be11006c908fb293162fe6b4ded3bdacc0858a9f4f82a98372c000d5e769f6e0/usr/bin/nvidia-persistenced
I0913 20:33:29.882896 1004 nvc_mount.c:112] mounting /usr/bin/nvidia-cuda-mps-control at /var/lib/docker/btrfs/subvolumes/be11006c908fb293162fe6b4ded3bdacc0858a9f4f82a98372c000d5e769f6e0/usr/bin/nvidia-cuda-mps-control
I0913 20:33:29.883229 1004 nvc_mount.c:112] mounting /usr/bin/nvidia-cuda-mps-server at /var/lib/docker/btrfs/subvolumes/be11006c908fb293162fe6b4ded3bdacc0858a9f4f82a98372c000d5e769f6e0/usr/bin/nvidia-cuda-mps-server
I0913 20:33:29.883827 1004 nvc_mount.c:112] mounting /usr/lib/x86_64-linux-gnu/libnvidia-ml.so.470.63.01 at /var/lib/docker/btrfs/subvolumes/be11006c908fb293162fe6b4ded3bdacc0858a9f4f82a98372c000d5e769f6e0/usr/lib/x86_64-linux-gnu/libnvidia-ml.so.470.63.01
I0913 20:33:29.884117 1004 nvc_mount.c:112] mounting /usr/lib/x86_64-linux-gnu/libnvidia-cfg.so.470.63.01 at /var/lib/docker/btrfs/subvolumes/be11006c908fb293162fe6b4ded3bdacc0858a9f4f82a98372c000d5e769f6e0/usr/lib/x86_64-linux-gnu/libnvidia-cfg.so.470.63.01
I0913 20:33:29.884449 1004 nvc_mount.c:112] mounting /usr/lib/x86_64-linux-gnu/libcuda.so.470.63.01 at /var/lib/docker/btrfs/subvolumes/be11006c908fb293162fe6b4ded3bdacc0858a9f4f82a98372c000d5e769f6e0/usr/lib/x86_64-linux-gnu/libcuda.so.470.63.01
I0913 20:33:29.884795 1004 nvc_mount.c:112] mounting /usr/lib/x86_64-linux-gnu/libnvidia-opencl.so.470.63.01 at /var/lib/docker/btrfs/subvolumes/be11006c908fb293162fe6b4ded3bdacc0858a9f4f82a98372c000d5e769f6e0/usr/lib/x86_64-linux-gnu/libnvidia-opencl.so.470.63.01
I0913 20:33:29.885117 1004 nvc_mount.c:112] mounting /usr/lib/x86_64-linux-gnu/libnvidia-ptxjitcompiler.so.470.63.01 at /var/lib/docker/btrfs/subvolumes/be11006c908fb293162fe6b4ded3bdacc0858a9f4f82a98372c000d5e769f6e0/usr/lib/x86_64-linux-gnu/libnvidia-ptxjitcompiler.so.470.63.01
I0913 20:33:29.885396 1004 nvc_mount.c:112] mounting /usr/lib/x86_64-linux-gnu/libnvidia-allocator.so.470.63.01 at /var/lib/docker/btrfs/subvolumes/be11006c908fb293162fe6b4ded3bdacc0858a9f4f82a98372c000d5e769f6e0/usr/lib/x86_64-linux-gnu/libnvidia-allocator.so.470.63.01
I0913 20:33:29.885710 1004 nvc_mount.c:112] mounting /usr/lib/x86_64-linux-gnu/libnvidia-compiler.so.470.63.01 at /var/lib/docker/btrfs/subvolumes/be11006c908fb293162fe6b4ded3bdacc0858a9f4f82a98372c000d5e769f6e0/usr/lib/x86_64-linux-gnu/libnvidia-compiler.so.470.63.01
I0913 20:33:29.885866 1004 nvc_mount.c:524] creating symlink /var/lib/docker/btrfs/subvolumes/be11006c908fb293162fe6b4ded3bdacc0858a9f4f82a98372c000d5e769f6e0/usr/lib/x86_64-linux-gnu/libcuda.so -> libcuda.so.1
I0913 20:33:29.886609 1004 nvc_mount.c:63] mounting /lib/firmware/nvidia/470.63.01 at /var/lib/docker/btrfs/subvolumes/be11006c908fb293162fe6b4ded3bdacc0858a9f4f82a98372c000d5e769f6e0/usr/lib/firmware/nvidia/470.63.01
I0913 20:33:29.886913 1004 nvc_mount.c:208] mounting /dev/nvidiactl at /var/lib/docker/btrfs/subvolumes/be11006c908fb293162fe6b4ded3bdacc0858a9f4f82a98372c000d5e769f6e0/dev/nvidiactl
I0913 20:33:29.887090 1004 nvc_mount.c:499] whitelisting device node 195:255
I0913 20:33:29.889227 1004 nvc.c:423] shutting down library context
I0913 20:33:29.890167 1010 driver.c:163] terminating driver service
I0913 20:33:29.891254 1004 driver.c:203] driver service terminated successfully
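One thing that may be worth checking here: the write that fails (`/sys/fs/cgroup/devices/<id>/devices.allow`) only exists on a cgroup v1 hierarchy; on cgroup v2 there is no `devices` controller directory at all, and device filtering is done differently. A small sketch to see which hierarchy the LXD host and the nested guest each run (the helper name is our own, not part of any NVIDIA tooling):

```shell
# Report which cgroup hierarchy is mounted at /sys/fs/cgroup.
# cgroup v2 exposes a unified cgroup.controllers file at the mount root;
# cgroup v1 instead has per-controller directories such as devices/.
cgroup_version() {
    if [ -f /sys/fs/cgroup/cgroup.controllers ]; then
        echo "v2"
    else
        echo "v1"
    fi
}

cgroup_version
```

If the guest reports v1 but LXD has not delegated write access to the `devices` controller, that would be consistent with the "operation not permitted" error seen here.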
Mon Sep 13 20:41:24 2021
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.63.01    Driver Version: 470.63.01    CUDA Version: 11.4     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  Off  | 00000000:01:00.0 Off |                  N/A |
| 16%   27C    P8    16W / 250W |    178MiB / 11175MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
+-----------------------------------------------------------------------------+
docker: Error response from daemon: OCI runtime create failed: container_linux.go:380: starting container process caused: process_linux.go:545: container init caused: Running hook #0:: error running hook: exit status 1, stdout: , stderr: nvidia-container-cli: mount error: write error: /sys/fs/cgroup/devices/docker/9e199f6f3e7e69766ce196d617b7e623f506c186b371fd732250ef8d1f1f0631/devices.allow: operation not permitted: unknown.
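For what it's worth, one workaround that has helped in other nested/unprivileged setups where the hook cannot write to the devices cgroup is to tell `nvidia-container-cli` to skip cgroup setup entirely and pass the device nodes to Docker by hand. Untested in this exact LXD configuration, so treat it as a sketch rather than a confirmed fix:

```shell
# In /etc/nvidia-container-runtime/config.toml, disable cgroup management
# so libnvidia-container never writes to devices.allow:
#
#   [nvidia-container-cli]
#   no-cgroups = true
#
# With no-cgroups enabled the hook no longer whitelists the device nodes,
# so they have to be exposed to the container explicitly:
docker run --runtime=nvidia -e NVIDIA_VISIBLE_DEVICES=0 \
  --device /dev/nvidia0 --device /dev/nvidiactl \
  --device /dev/nvidia-uvm --device /dev/nvidia-uvm-tools \
  nvidia/cuda:11.0-base nvidia-smi
```

The trade-off is that device isolation is then handled by Docker's `--device` flags rather than by the NVIDIA hook, so `NVIDIA_VISIBLE_DEVICES` no longer controls which nodes the container can reach.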