Open revanthsenthil opened 2 years ago
Having the same issue on Leap 15.3
Kernel: 5.19.8-lp153.2.g0330383-default
Error is returned after updating:
libnvidia-container1 1.10.0-1 -> 1.11.0-1
libnvidia-container-tools 1.10.0-1 -> 1.11.0-1
nvidia-container-toolkit 1.10.0-1 -> 1.11.0-1
nvidia-docker2 version: 2.11.0-1
Docker version 20.10.17-ce, build a89b84221c85
@riddlecp there seems to be an issue with the v1.11.0 package that means that upgrading from 1.10.0 to 1.11.0 may not work as expected. Could you try to remove nvidia-container-toolkit entirely and reinstall the v1.11.0 version?
See https://github.com/NVIDIA/nvidia-docker/issues/1682#issuecomment-1250952249 for more context.
Thanks Elezar, I saw that thread this morning and was attempting the fix when you replied. Removing nvidia-container-toolkit and installing it again fixed the issue. I did notice that the removal also attempts to uninstall nvidia-docker2, so I reinstalled that as well.
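The remove-and-reinstall workaround described above can be sketched as a small script. This is a minimal sketch, not the maintainers' documented procedure: the function name `reinstall_cmds` is hypothetical, the `apt-get` and `zypper` invocations are assumptions about how the packages named in this thread would be removed and installed on Ubuntu and openSUSE Leap, and the final Docker restart is an assumed extra step that the thread does not mention. The script only prints the commands so they can be reviewed before anything is run.

```shell
#!/usr/bin/env bash
# Sketch: print the remove-and-reinstall sequence discussed in this thread.
# Nothing is executed; the output is meant to be reviewed and run by hand.
set -euo pipefail

# Package names as they appear in the thread. nvidia-docker2 is included
# because removing the toolkit reportedly uninstalls it as well.
PKGS="nvidia-container-toolkit nvidia-docker2"

# $1: package manager family, "apt" (Ubuntu) or "zypper" (openSUSE Leap).
reinstall_cmds() {
    case "$1" in
        apt)
            echo "sudo apt-get remove -y nvidia-container-toolkit"
            echo "sudo apt-get install -y $PKGS"
            ;;
        zypper)
            echo "sudo zypper --non-interactive remove nvidia-container-toolkit"
            echo "sudo zypper --non-interactive install $PKGS"
            ;;
        *)
            echo "unsupported package manager: $1" >&2
            return 1
            ;;
    esac
    # Assumption: restart the daemon so it picks up the reinstalled runtime.
    echo "sudo systemctl restart docker"
}

# Default to apt when no argument is given.
reinstall_cmds "${1:-apt}"
```

Saved under a name of your choice, the script takes `zypper` as its first argument to print the openSUSE variant instead.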
1. Issues

Issue 1:
The above error is raised when I run
docker compose up
on the Docker containers pulled from the following repository: https://github.com/jgoppert/auav_f22

Issue 2:
An error that also occurs when the first error is not raised is:
This also comes from running the same command,
docker compose up
for the same containers.

2. Steps to reproduce the issue
I followed the instructions in the repository linked above to set up the Docker containers. One notable deviation: I used
aptitude
instead of
apt
to make sure dependencies were installed as necessary, because I previously had to use Synaptic to find the dependencies that had to be installed or removed for the required NVIDIA drivers.

I am running Ubuntu 22.04 on, as indicated below, an x86_64 system, so the
:i386
packages should technically not be installed, but they do exist.

3. Information to attach
[ ] Some nvidia-container information:
nvidia-container-cli -k -d /dev/tty info
I0907 19:23:08.499383 476595 nvc.c:376] initializing library context (version=1.10.0, build=395fd41701117121f1fd04ada01e1d7e006a37ae)
I0907 19:23:08.499523 476595 nvc.c:350] using root /
I0907 19:23:08.499547 476595 nvc.c:351] using ldcache /etc/ld.so.cache
I0907 19:23:08.499572 476595 nvc.c:352] using unprivileged user 1000:1000
I0907 19:23:08.499634 476595 nvc.c:393] attempting to load dxcore to see if we are running under Windows Subsystem for Linux (WSL)
I0907 19:23:08.500224 476595 nvc.c:395] dxcore initialization failed, continuing assuming a non-WSL environment
W0907 19:23:12.926908 476604 nvc.c:273] failed to set inheritable capabilities
W0907 19:23:12.926942 476604 nvc.c:274] skipping kernel modules load due to failure
I0907 19:23:12.927351 476605 rpc.c:71] starting driver rpc service
I0907 19:23:12.933072 476606 rpc.c:71] starting nvcgo rpc service
I0907 19:23:12.933754 476595 nvc_info.c:766] requesting driver information with ''
I0907 19:23:12.935676 476595 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libnvoptix.so.515.65.01
I0907 19:23:12.935798 476595 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libnvidia-tls.so.515.65.01
I0907 19:23:12.935876 476595 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libnvidia-rtcore.so.515.65.01
I0907 19:23:12.935955 476595 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libnvidia-ptxjitcompiler.so.515.65.01
I0907 19:23:12.936044 476595 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libnvidia-opencl.so.515.65.01
I0907 19:23:12.936106 476595 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libnvidia-ngx.so.515.65.01
I0907 19:23:12.936160 476595 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libnvidia-ml.so.515.65.01
I0907 19:23:12.936218 476595 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libnvidia-glvkspirv.so.515.65.01
I0907 19:23:12.936294 476595 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libnvidia-glsi.so.515.65.01
I0907 19:23:12.936347 476595 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libnvidia-glcore.so.515.65.01
I0907 19:23:12.936417 476595 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libnvidia-eglcore.so.515.65.01
I0907 19:23:12.936526 476595 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libnvidia-compiler.so.515.65.01
I0907 19:23:12.936791 476595 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libcuda.so.515.65.01
I0907 19:23:12.937040 476595 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libGLX_nvidia.so.515.65.01
I0907 19:23:12.937135 476595 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libGLESv2_nvidia.so.515.65.01
I0907 19:23:12.937257 476595 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libGLESv1_CM_nvidia.so.515.65.01
I0907 19:23:12.937322 476595 nvc_info.c:173] selecting /usr/lib/x86_64-linux-gnu/libEGL_nvidia.so.515.65.01
W0907 19:23:12.937418 476595 nvc_info.c:399] missing library libnvidia-cfg.so
W0907 19:23:12.937427 476595 nvc_info.c:399] missing library libnvidia-nscq.so
W0907 19:23:12.937432 476595 nvc_info.c:399] missing library libcudadebugger.so
W0907 19:23:12.937437 476595 nvc_info.c:399] missing library libnvidia-fatbinaryloader.so
W0907 19:23:12.937444 476595 nvc_info.c:399] missing library libnvidia-allocator.so
W0907 19:23:12.937451 476595 nvc_info.c:399] missing library libnvidia-pkcs11.so
W0907 19:23:12.937459 476595 nvc_info.c:399] missing library libvdpau_nvidia.so
W0907 19:23:12.937466 476595 nvc_info.c:399] missing library libnvidia-encode.so
W0907 19:23:12.937473 476595 nvc_info.c:399] missing library libnvidia-opticalflow.so
W0907 19:23:12.937480 476595 nvc_info.c:399] missing library libnvcuvid.so
W0907 19:23:12.937487 476595 nvc_info.c:399] missing library libnvidia-fbc.so
W0907 19:23:12.937494 476595 nvc_info.c:399] missing library libnvidia-ifr.so
W0907 19:23:12.937501 476595 nvc_info.c:399] missing library libnvidia-cbl.so
W0907 19:23:12.937520 476595 nvc_info.c:403] missing compat32 library libnvidia-ml.so
W0907 19:23:12.937546 476595 nvc_info.c:403] missing compat32 library libnvidia-cfg.so
W0907 19:23:12.937552 476595 nvc_info.c:403] missing compat32 library libnvidia-nscq.so
W0907 19:23:12.937558 476595 nvc_info.c:403] missing compat32 library libcuda.so
W0907 19:23:12.937564 476595 nvc_info.c:403] missing compat32 library libcudadebugger.so
W0907 19:23:12.937572 476595 nvc_info.c:403] missing compat32 library libnvidia-opencl.so
W0907 19:23:12.937579 476595 nvc_info.c:403] missing compat32 library libnvidia-ptxjitcompiler.so
W0907 19:23:12.937585 476595 nvc_info.c:403] missing compat32 library libnvidia-fatbinaryloader.so
W0907 19:23:12.937591 476595 nvc_info.c:403] missing compat32 library libnvidia-allocator.so
W0907 19:23:12.937599 476595 nvc_info.c:403] missing compat32 library libnvidia-compiler.so
W0907 19:23:12.937605 476595 nvc_info.c:403] missing compat32 library libnvidia-pkcs11.so
W0907 19:23:12.937612 476595 nvc_info.c:403] missing compat32 library libnvidia-ngx.so
W0907 19:23:12.937619 476595 nvc_info.c:403] missing compat32 library libvdpau_nvidia.so
W0907 19:23:12.937626 476595 nvc_info.c:403] missing compat32 library libnvidia-encode.so
W0907 19:23:12.937633 476595 nvc_info.c:403] missing compat32 library libnvidia-opticalflow.so
W0907 19:23:12.937640 476595 nvc_info.c:403] missing compat32 library libnvcuvid.so
W0907 19:23:12.937647 476595 nvc_info.c:403] missing compat32 library libnvidia-eglcore.so
W0907 19:23:12.937654 476595 nvc_info.c:403] missing compat32 library libnvidia-glcore.so
W0907 19:23:12.937661 476595 nvc_info.c:403] missing compat32 library libnvidia-tls.so
W0907 19:23:12.937668 476595 nvc_info.c:403] missing compat32 library libnvidia-glsi.so
W0907 19:23:12.937676 476595 nvc_info.c:403] missing compat32 library libnvidia-fbc.so
W0907 19:23:12.937683 476595 nvc_info.c:403] missing compat32 library libnvidia-ifr.so
W0907 19:23:12.937690 476595 nvc_info.c:403] missing compat32 library libnvidia-rtcore.so
W0907 19:23:12.937697 476595 nvc_info.c:403] missing compat32 library libnvoptix.so
W0907 19:23:12.937704 476595 nvc_info.c:403] missing compat32 library libGLX_nvidia.so
W0907 19:23:12.937711 476595 nvc_info.c:403] missing compat32 library libEGL_nvidia.so
W0907 19:23:12.937718 476595 nvc_info.c:403] missing compat32 library libGLESv2_nvidia.so
W0907 19:23:12.937725 476595 nvc_info.c:403] missing compat32 library libGLESv1_CM_nvidia.so
W0907 19:23:12.937732 476595 nvc_info.c:403] missing compat32 library libnvidia-glvkspirv.so
W0907 19:23:12.937739 476595 nvc_info.c:403] missing compat32 library libnvidia-cbl.so
I0907 19:23:12.938079 476595 nvc_info.c:299] selecting /usr/bin/nvidia-smi
I0907 19:23:12.938120 476595 nvc_info.c:299] selecting /usr/bin/nvidia-debugdump
W0907 19:23:12.938590 476595 nvc_info.c:425] missing binary nvidia-persistenced
W0907 19:23:12.938596 476595 nvc_info.c:425] missing binary nv-fabricmanager
W0907 19:23:12.938600 476595 nvc_info.c:425] missing binary nvidia-cuda-mps-control
W0907 19:23:12.938604 476595 nvc_info.c:425] missing binary nvidia-cuda-mps-server
W0907 19:23:12.938653 476595 nvc_info.c:349] missing firmware path /lib/firmware/nvidia/515.65.01/gsp.bin
I0907 19:23:12.938696 476595 nvc_info.c:529] listing device /dev/nvidiactl
I0907 19:23:12.938700 476595 nvc_info.c:529] listing device /dev/nvidia-uvm
I0907 19:23:12.938705 476595 nvc_info.c:529] listing device /dev/nvidia-uvm-tools
I0907 19:23:12.938709 476595 nvc_info.c:529] listing device /dev/nvidia-modeset
W0907 19:23:12.938768 476595 nvc_info.c:349] missing ipc path /var/run/nvidia-persistenced/socket
W0907 19:23:12.938811 476595 nvc_info.c:349] missing ipc path /var/run/nvidia-fabricmanager/socket
W0907 19:23:12.938829 476595 nvc_info.c:349] missing ipc path /tmp/nvidia-mps
I0907 19:23:12.938834 476595 nvc_info.c:822] requesting device information with ''
I0907 19:23:12.947131 476595 nvc_info.c:713] listing device /dev/nvidia0 (GPU-21e3065c-8a1a-6cf7-b0fd-41d6c51f726e at 00000000:01:00.0)
NVRM version: 515.65.01
CUDA version: 11.7

Device Index: 0
Device Minor: 0
Model: NVIDIA GeForce GTX 1650
Brand: GeForce
GPU UUID: GPU-21e3065c-8a1a-6cf7-b0fd-41d6c51f726e
Bus Location: 00000000:01:00.0
Architecture: 7.5
I0907 19:23:12.947333 476595 nvc.c:434] shutting down library context
I0907 19:23:12.947490 476606 rpc.c:95] terminating nvcgo rpc service
I0907 19:23:12.949168 476595 rpc.c:135] nvcgo rpc service terminated successfully
I0907 19:23:12.954386 476605 rpc.c:95] terminating driver rpc service
I0907 19:23:12.954810 476595 rpc.c:135] driver rpc service terminated successfully
Linux revanth-XPS-15-7590 5.15.0-47-generic #51-Ubuntu SMP Thu Aug 11 07:51:15 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
==============NVSMI LOG==============
Timestamp : Wed Sep 7 15:26:42 2022
Driver Version : 515.65.01
CUDA Version : 11.7
Attached GPUs : 1
GPU 00000000:01:00.0
    Product Name : NVIDIA GeForce GTX 1650
    Product Brand : GeForce
    Product Architecture : Turing
    Display Mode : Disabled
    Display Active : Disabled
    Persistence Mode : Disabled
    MIG Mode
        Current : N/A
        Pending : N/A
    Accounting Mode : Disabled
    Accounting Mode Buffer Size : 4000
    Driver Model
        Current : N/A
        Pending : N/A
    Serial Number : N/A
    GPU UUID : GPU-21e3065c-8a1a-6cf7-b0fd-41d6c51f726e
    Minor Number : 0
    VBIOS Version : 90.17.1C.40.4B
    MultiGPU Board : No
    Board ID : 0x100
    GPU Part Number : N/A
    Module ID : 0
    Inforom Version
        Image Version : G001.0000.02.04
        OEM Object : 1.1
        ECC Object : N/A
        Power Management Object : N/A
    GPU Operation Mode
        Current : N/A
        Pending : N/A
    GSP Firmware Version : N/A
    GPU Virtualization Mode
        Virtualization Mode : None
        Host VGPU Mode : N/A
    IBMNPU
        Relaxed Ordering Mode : N/A
    PCI
        Bus : 0x01
        Device : 0x00
        Domain : 0x0000
        Device Id : 0x1F9110DE
        Bus Id : 00000000:01:00.0
        Sub System Id : 0x8601103C
        GPU Link Info
            PCIe Generation
                Max : 3
                Current : 3
            Link Width
                Max : 16x
                Current : 16x
        Bridge Chip
            Type : N/A
            Firmware : N/A
        Replays Since Reset : 0
        Replay Number Rollovers : 0
        Tx Throughput : 0 KB/s
        Rx Throughput : 0 KB/s
    Fan Speed : N/A
    Performance State : P3
    Clocks Throttle Reasons
        Idle : Not Active
        Applications Clocks Setting : Not Active
        SW Power Cap : Active
        HW Slowdown : Not Active
            HW Thermal Slowdown : Not Active
            HW Power Brake Slowdown : Not Active
        Sync Boost : Not Active
        SW Thermal Slowdown : Active
        Display Clock Setting : Not Active
    FB Memory Usage
        Total : 4096 MiB
        Reserved : 181 MiB
        Used : 10 MiB
        Free : 3904 MiB
    BAR1 Memory Usage
        Total : 256 MiB
        Used : 3 MiB
        Free : 253 MiB
    Compute Mode : Default
    Utilization
        Gpu : 0 %
        Memory : 0 %
        Encoder : 0 %
        Decoder : 0 %
    Encoder Stats
        Active Sessions : 0
        Average FPS : 0
        Average Latency : 0
    FBC Stats
        Active Sessions : 0
        Average FPS : 0
        Average Latency : 0
    Ecc Mode
        Current : N/A
        Pending : N/A
    ECC Errors
        Volatile
            SRAM Correctable : N/A
            SRAM Uncorrectable : N/A
            DRAM Correctable : N/A
            DRAM Uncorrectable : N/A
        Aggregate
            SRAM Correctable : N/A
            SRAM Uncorrectable : N/A
            DRAM Correctable : N/A
            DRAM Uncorrectable : N/A
    Retired Pages
        Single Bit ECC : N/A
        Double Bit ECC : N/A
        Pending Page Blacklist : N/A
    Remapped Rows : N/A
    Temperature
        GPU Current Temp : 43 C
        GPU Shutdown Temp : 102 C
        GPU Slowdown Temp : 97 C
        GPU Max Operating Temp : 75 C
        GPU Target Temperature : N/A
        Memory Current Temp : N/A
        Memory Max Operating Temp : N/A
    Power Readings
        Power Management : N/A
        Power Draw : 11.96 W
        Power Limit : N/A
        Default Power Limit : N/A
        Enforced Power Limit : N/A
        Min Power Limit : N/A
        Max Power Limit : N/A
    Clocks
        Graphics : 1395 MHz
        SM : 1395 MHz
        Memory : 3500 MHz
        Video : 1290 MHz
    Applications Clocks
        Graphics : N/A
        Memory : N/A
    Default Applications Clocks
        Graphics : N/A
        Memory : N/A
    Max Clocks
        Graphics : 2100 MHz
        SM : 2100 MHz
        Memory : 4001 MHz
        Video : 1950 MHz
    Max Customer Boost Clocks
        Graphics : N/A
    Clock Policy
        Auto Boost : N/A
        Auto Boost Default : N/A
    Voltage
        Graphics : N/A
    Processes
        GPU instance ID : N/A
        Compute instance ID : N/A
        Process ID : 3541
            Type : G
            Name : /usr/lib/xorg/Xorg
            Used GPU Memory : 4 MiB
        GPU instance ID : N/A
        Compute instance ID : N/A
        Process ID : 468410
            Type : C+G
            Name : /opt/google/chrome/chrome --type=gpu-process --enable-crashpad --crashpad-handler-pid=5316 --enable-crash-reporter=e3964464-0402-4a0b-9245-6fd60cb8f256, --change-stack-guard-on-fork=enable --gpu-preferences=WAAAAAAAAAAgAAAIAAAAAAAAAAAAAAAAAABgAAEAAAA4AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAACAAAAAAAAAAIAAAAAAAAAABAAAAAAAAAAgAAAAAAAAACAAAAAAAAAAIAAAAAAAAAA== --shared-files --field-trial-handle=0,i,16466788088194899339,13002346894531801217,131072
            Used GPU Memory : 4 MiB
20.10.17
Client: Docker Engine - Community
 Cloud integration: v1.0.28
 Version: 20.10.17
 API version: 1.41
 Go version: go1.17.11
 Git commit: 100c701
 Built: Mon Jun 6 23:02:46 2022
 OS/Arch: linux/amd64
 Context: desktop-linux
 Experimental: true

Server: Docker Desktop 4.11.0 (83626)
 Engine:
  Version: 20.10.17
  API version: 1.41 (minimum version 1.12)
  Go version: go1.17.11
  Git commit: a89b842
  Built: Mon Jun 6 23:01:23 2022
  OS/Arch: linux/amd64
  Experimental: false
 containerd:
  Version: 1.6.6
  GitCommit: ****
 runc:
  Version: 1.1.2
  GitCommit: v1.1.2-0-ga916309
 docker-init:
  Version: 0.19.0
  GitCommit: de40ad0
Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
|/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
||/ Name Version Architecture Description
+++-===================================-==========================-============-====================>
un libgldispatch0-nvidia (no description avai>
un libnvidia-common (no description avai>
ii libnvidia-common-515-server 515.65.01-0ubuntu0.22.04.1 all Shared files used by>
un libnvidia-compute (no description avai>
ii libnvidia-compute-515:amd64 515.65.01-0ubuntu0.22.04.1 amd64 NVIDIA libcompute pa>
ii libnvidia-container-tools 1.10.0-1 amd64 NVIDIA container run>
ii libnvidia-container1:amd64 1.10.0-1 amd64 NVIDIA container run>
ii libnvidia-egl-wayland1:amd64 1:1.1.9-1.1 amd64 Wayland EGL External>
un libnvidia-encode1 (no description avai>
un libnvidia-gl (no description avai>
un libnvidia-gl-390 (no description avai>
un libnvidia-gl-410 (no description avai>
ii libnvidia-gl-515-server:amd64 515.65.01-0ubuntu0.22.04.1 amd64 NVIDIA OpenGL/GLX/EG>
un libnvidia-legacy-390xx-egl-wayland1 (no description avai>
un libnvidia-ml1 (no description avai>
un nvidia-384 (no description avai>
un nvidia-390 (no description avai>
un nvidia-common (no description avai>
un nvidia-compute-utils (no description avai>
rc nvidia-compute-utils-515 515.65.01-0ubuntu0.22.04.1 amd64 NVIDIA compute utili>
un nvidia-container-runtime (no description avai>
un nvidia-container-runtime-hook (no description avai>
ii nvidia-container-toolkit 1.10.0-1 amd64 NVIDIA container run>
rc nvidia-dkms-515 515.65.01-0ubuntu0.22.04.1 amd64 NVIDIA DKMS package
un nvidia-dkms-kernel (no description avai>
un nvidia-docker (no description avai>
ii nvidia-docker2 2.11.0-1 all nvidia-docker CLI wr>
un nvidia-driver-515 (no description avai>
un nvidia-egl-wayland-common (no description avai>
un nvidia-kernel-common (no description avai>
rc nvidia-kernel-common-515 515.65.01-0ubuntu0.22.04.1 amd64 Shared files used wi>
un nvidia-kernel-source-515 (no description avai>
un nvidia-libopencl1-dev (no description avai>
un nvidia-opencl-icd (no description avai>
un nvidia-persistenced (no description avai>
rc nvidia-prime 0.8.17.1 all Tools to enable NVID>
ii nvidia-settings 510.47.03-0ubuntu1 amd64 Tool for configuring>
un nvidia-settings-binary (no description avai>
un nvidia-smi (no description avai>
un nvidia-utils (no description avai>
ii nvidia-utils-515 515.65.01-0ubuntu0.22.04.1 amd64 NVIDIA driver suppor>
cli-version: 1.10.0
lib-version: 1.10.0
build date: 2022-06-13T10:39+00:00
build revision: 395fd41701117121f1fd04ada01e1d7e006a37ae
build compiler: x86_64-linux-gnu-gcc-7 7.5.0
build platform: x86_64
build flags: -D_GNU_SOURCE -D_FORTIFY_SOURCE=2 -DNDEBUG -std=gnu11 -O2 -g -fdata-sections -ffunction-sections -fplan9-extensions -fstack-protector -fno-strict-aliasing -fvisibility=hidden -Wall -Wextra -Wcast-align -Wpointer-arith -Wmissing-prototypes -Wnonnull -Wwrite-strings -Wlogical-op -Wformat=2 -Wmissing-format-attribute -Winit-self -Wshadow -Wstrict-prototypes -Wunreachable-code -Wconversion -Wsign-conversion -Wno-unknown-warning-option -Wno-format-extra-args -Wno-gnu-alignof-expression -Wl,-zrelro -Wl,-znow -Wl,-zdefs -Wl,--gc-sections
[+] Running 3/0
 ⠿ Container px4 Created 0.0s
 ⠿ Container gcs Created 0.0s
 ⠿ Container onboard Created 0.0s
Attaching to gcs, onboard, px4, sim
px4 | Unable to init server: Could not connect: Connection refused
px4 | Unable to init server: Could not connect: Connection refused
px4 | You need to run terminator in an X environment. Make sure $DISPLAY is properly set
Error response from daemon: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error running hook #0: error running hook: exit status 1, stdout: , stderr: Auto-detected mode as 'legacy'
nvidia-container-cli: initialization error: load library failed: libnvidia-ml.so.1: cannot open shared object file: no such file or directory: unknown
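The final error says that nvidia-container-cli could not load libnvidia-ml.so.1, and the debug log earlier in this report shows the CLI reading /etc/ld.so.cache. Below is a minimal sketch, assuming a glibc-based Linux system with `ldconfig` available, of one way to check whether the dynamic linker cache lists that library. The helper name `cache_lists_lib` is hypothetical. The result only describes the environment in which the script runs, which may differ from the one in which the container hook executes.

```shell
#!/usr/bin/env bash
# Sketch: report whether the dynamic linker cache lists the library named
# in the error message. It inspects the cache only and changes nothing.
set -euo pipefail

LIB="libnvidia-ml.so.1"

# Reads `ldconfig -p` style text on stdin and succeeds if the library
# named in $1 appears as an entry ("<name> (<flags>) => <path>").
cache_lists_lib() {
    grep -qF -- "$1 ("
}

if command -v ldconfig >/dev/null 2>&1; then
    # Capture the listing first; tolerate a failing ldconfig call.
    cache="$(ldconfig -p 2>/dev/null || true)"
    if cache_lists_lib "$LIB" <<<"$cache"; then
        echo "$LIB is listed in the linker cache"
    else
        echo "$LIB is NOT listed in the linker cache"
    fi
else
    echo "ldconfig not found on PATH; cannot check $LIB"
fi
```

If the library is reported as missing here while nvidia-smi works, comparing the installed driver packages against the dpkg listing above would be a reasonable next step.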