NVIDIA / nvidia-container-toolkit

Build and run containers leveraging NVIDIA GPUs
Apache License 2.0

Docker Desktop - Vulkan Drivers not found #233

Open PowerOfNames opened 1 year ago

PowerOfNames commented 1 year ago

1. Issue or feature description

Trying to run the Docker image (https://hub.docker.com/r/scenerygraphics/nvidia-vulkan/tags)

docker run -it --gpus=all scenerygraphics/nvidia-vulkan:1.3.216.0-ubuntu20.04-updatedmodels bash

-> working and produces container

-> container: nvidia-smi output:

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 520.61.03    Driver Version: 522.06       CUDA Version: 11.8     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  On   | 00000000:01:00.0 Off |                  N/A |
| N/A   46C    P8    13W /  N/A |     39MiB /  8192MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A        23      G   /Xwayland                       N/A      |
|    0   N/A  N/A        23      G   /Xwayland                       N/A      |
|    0   N/A  N/A        27      G   /Xwayland                       N/A      |
+-----------------------------------------------------------------------------+

-> vulkaninfo:

ERROR: [Loader Message] Code 0 : libGLX_nvidia.so.0: cannot open shared object file: No such file or directory
Cannot create Vulkan instance.
This problem is often caused by a faulty installation of the Vulkan driver or attempting to use a GPU that does not support Vulkan.
/build/vulkan-tools-KEbD_A/vulkan-tools-1.2.131.1+dfsg1/vulkaninfo/vulkaninfo.h:371: failed with ERROR_INCOMPATIBLE_DRIVER
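The loader message above says that libGLX_nvidia.so.0 cannot be opened inside the container. A minimal sketch for checking this from within the container, assuming Python 3 is available there (the image may not ship it), is to ask the dynamic loader directly. The helper name can_load is hypothetical, and libvulkan.so.1 is included only on the assumption that it is the Vulkan loader's file name on this system.

```python
import ctypes


def can_load(library_name: str) -> bool:
    """Return True if the dynamic loader can open the named shared library."""
    try:
        ctypes.CDLL(library_name)
    except OSError:
        # Raised when the library is not found or cannot be loaded.
        return False
    return True


if __name__ == "__main__":
    # libGLX_nvidia.so.0 is the library named in the vulkaninfo error;
    # libvulkan.so.1 is an assumed name for the Vulkan loader itself.
    for name in ("libGLX_nvidia.so.0", "libvulkan.so.1"):
        status = "found" if can_load(name) else "NOT found"
        print(f"{name}: {status}")
```

If the NVIDIA GLX library is reported as not found while nvidia-smi works, that would be consistent with only part of the driver's user-space libraries being visible in the container.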

Virtualization is active and Ubuntu 20.04 is installed.

2. Steps to reproduce the issue

System: Windows 11 Home Version 22H2 (Build 22621.525)
CPU: Intel(R) Core(TM) i7-10875H CPU @ 2.3GHz
GPU: NVIDIA GeForce 3080 Laptop GPU
Docker Version: Docker version 20.10.12, build 20.10.12-0ubuntu4
Ubuntu version: 22.04

3. Information to attach (optional if deemed irrelevant)

-- WARNING, the following logs are for debugging purposes only --

I1121 12:43:59.442591 255 nvc.c:376] initializing library context (version=1.11.0, build=c8f267be0bac1c654d59ad4ea5df907141149977)
I1121 12:43:59.442653 255 nvc.c:350] using root /
I1121 12:43:59.442656 255 nvc.c:351] using ldcache /etc/ld.so.cache
I1121 12:43:59.442658 255 nvc.c:352] using unprivileged user 1000:1001
I1121 12:43:59.442667 255 nvc.c:393] attempting to load dxcore to see if we are running under Windows Subsystem for Linux (WSL)
I1121 12:43:59.444186 255 dxcore.c:227] Creating a new WDDM Adapter for hAdapter:40000000 luid:23c99f
I1121 12:43:59.447190 255 dxcore.c:268] Adding new adapter via dxcore hAdapter:40000000 luid:23c99f wddm version:3100
I1121 12:43:59.447209 255 dxcore.c:227] Creating a new WDDM Adapter for hAdapter:40000040 luid:23cab6
I1121 12:43:59.448889 255 dxcore.c:210] Core Nvidia component libcuda.so.1.1 not found in /usr/lib/wsl/drivers/iigd_dch.inf_amd64_51f685305808e3a5
I1121 12:43:59.449530 255 dxcore.c:210] Core Nvidia component libcuda_loader.so not found in /usr/lib/wsl/drivers/iigd_dch.inf_amd64_51f685305808e3a5
I1121 12:43:59.450201 255 dxcore.c:210] Core Nvidia component libnvidia-ptxjitcompiler.so.1 not found in /usr/lib/wsl/drivers/iigd_dch.inf_amd64_51f685305808e3a5
I1121 12:43:59.450806 255 dxcore.c:210] Core Nvidia component libnvidia-ml.so.1 not found in /usr/lib/wsl/drivers/iigd_dch.inf_amd64_51f685305808e3a5
I1121 12:43:59.451402 255 dxcore.c:210] Core Nvidia component libnvidia-ml_loader.so not found in /usr/lib/wsl/drivers/iigd_dch.inf_amd64_51f685305808e3a5
I1121 12:43:59.452033 255 dxcore.c:210] Core Nvidia component nvidia-smi not found in /usr/lib/wsl/drivers/iigd_dch.inf_amd64_51f685305808e3a5
I1121 12:43:59.452074 255 dxcore.c:215] No Nvidia component found in /usr/lib/wsl/drivers/iigd_dch.inf_amd64_51f685305808e3a5
E1121 12:43:59.452077 255 dxcore.c:261] Failed to query the core Nvidia libraries for the adapter. Skipping it.
I1121 12:43:59.452079 255 dxcore.c:325] dxcore layer initialized successfully
W1121 12:43:59.452388 255 nvc.c:405] skipping kernel modules load on WSL
I1121 12:43:59.452602 256 rpc.c:71] starting driver rpc service
I1121 12:43:59.478826 257 rpc.c:71] starting nvcgo rpc service
I1121 12:43:59.483373 255 nvc_info.c:766] requesting driver information with ''
I1121 12:43:59.556089 255 nvc_info.c:199] selecting /usr/lib/wsl/lib/libnvidia-opticalflow.so.1
I1121 12:43:59.556154 255 nvc_info.c:199] selecting /usr/lib/wsl/lib/libnvidia-ml.so.1
I1121 12:43:59.556180 255 nvc_info.c:199] selecting /usr/lib/wsl/lib/libnvidia-encode.so.1
I1121 12:43:59.556205 255 nvc_info.c:199] selecting /usr/lib/wsl/lib/libnvcuvid.so.1
I1121 12:43:59.556240 255 nvc_info.c:199] selecting /usr/lib/wsl/lib/libdxcore.so
I1121 12:43:59.556268 255 nvc_info.c:199] selecting /usr/lib/wsl/lib/libcuda.so.1
W1121 12:43:59.556374 255 nvc_info.c:399] missing library libnvidia-cfg.so
W1121 12:43:59.556395 255 nvc_info.c:399] missing library libnvidia-nscq.so
W1121 12:43:59.556398 255 nvc_info.c:399] missing library libcudadebugger.so
W1121 12:43:59.556400 255 nvc_info.c:399] missing library libnvidia-opencl.so
W1121 12:43:59.556419 255 nvc_info.c:399] missing library libnvidia-ptxjitcompiler.so
W1121 12:43:59.556424 255 nvc_info.c:399] missing library libnvidia-fatbinaryloader.so
W1121 12:43:59.556426 255 nvc_info.c:399] missing library libnvidia-allocator.so
W1121 12:43:59.556444 255 nvc_info.c:399] missing library libnvidia-compiler.so
W1121 12:43:59.556447 255 nvc_info.c:399] missing library libnvidia-pkcs11.so
W1121 12:43:59.556449 255 nvc_info.c:399] missing library libnvidia-ngx.so
W1121 12:43:59.556451 255 nvc_info.c:399] missing library libvdpau_nvidia.so
W1121 12:43:59.556453 255 nvc_info.c:399] missing library libnvidia-eglcore.so
W1121 12:43:59.556456 255 nvc_info.c:399] missing library libnvidia-glcore.so
W1121 12:43:59.556458 255 nvc_info.c:399] missing library libnvidia-tls.so
W1121 12:43:59.556460 255 nvc_info.c:399] missing library libnvidia-glsi.so
W1121 12:43:59.556462 255 nvc_info.c:399] missing library libnvidia-fbc.so
W1121 12:43:59.556493 255 nvc_info.c:399] missing library libnvidia-ifr.so
W1121 12:43:59.556532 255 nvc_info.c:399] missing library libnvidia-rtcore.so
W1121 12:43:59.556536 255 nvc_info.c:399] missing library libnvoptix.so
W1121 12:43:59.556538 255 nvc_info.c:399] missing library libGLX_nvidia.so
W1121 12:43:59.556540 255 nvc_info.c:399] missing library libEGL_nvidia.so
W1121 12:43:59.556542 255 nvc_info.c:399] missing library libGLESv2_nvidia.so
W1121 12:43:59.556543 255 nvc_info.c:399] missing library libGLESv1_CM_nvidia.so
W1121 12:43:59.556545 255 nvc_info.c:399] missing library libnvidia-glvkspirv.so
W1121 12:43:59.556547 255 nvc_info.c:399] missing library libnvidia-cbl.so
W1121 12:43:59.556549 255 nvc_info.c:403] missing compat32 library libnvidia-ml.so
W1121 12:43:59.556551 255 nvc_info.c:403] missing compat32 library libnvidia-cfg.so
W1121 12:43:59.556554 255 nvc_info.c:403] missing compat32 library libnvidia-nscq.so
W1121 12:43:59.556556 255 nvc_info.c:403] missing compat32 library libcuda.so
W1121 12:43:59.556558 255 nvc_info.c:403] missing compat32 library libcudadebugger.so
W1121 12:43:59.556591 255 nvc_info.c:403] missing compat32 library libnvidia-opencl.so
W1121 12:43:59.556594 255 nvc_info.c:403] missing compat32 library libnvidia-ptxjitcompiler.so
W1121 12:43:59.556597 255 nvc_info.c:403] missing compat32 library libnvidia-fatbinaryloader.so
W1121 12:43:59.556599 255 nvc_info.c:403] missing compat32 library libnvidia-allocator.so
W1121 12:43:59.556601 255 nvc_info.c:403] missing compat32 library libnvidia-compiler.so
W1121 12:43:59.556602 255 nvc_info.c:403] missing compat32 library libnvidia-pkcs11.so
W1121 12:43:59.556604 255 nvc_info.c:403] missing compat32 library libnvidia-ngx.so
W1121 12:43:59.556606 255 nvc_info.c:403] missing compat32 library libvdpau_nvidia.so
W1121 12:43:59.556608 255 nvc_info.c:403] missing compat32 library libnvidia-encode.so
W1121 12:43:59.556610 255 nvc_info.c:403] missing compat32 library libnvidia-opticalflow.so
W1121 12:43:59.556612 255 nvc_info.c:403] missing compat32 library libnvcuvid.so
W1121 12:43:59.556614 255 nvc_info.c:403] missing compat32 library libnvidia-eglcore.so
W1121 12:43:59.556616 255 nvc_info.c:403] missing compat32 library libnvidia-glcore.so
W1121 12:43:59.556619 255 nvc_info.c:403] missing compat32 library libnvidia-tls.so
W1121 12:43:59.556621 255 nvc_info.c:403] missing compat32 library libnvidia-glsi.so
W1121 12:43:59.556624 255 nvc_info.c:403] missing compat32 library libnvidia-fbc.so
W1121 12:43:59.556626 255 nvc_info.c:403] missing compat32 library libnvidia-ifr.so
W1121 12:43:59.556628 255 nvc_info.c:403] missing compat32 library libnvidia-rtcore.so
W1121 12:43:59.556631 255 nvc_info.c:403] missing compat32 library libnvoptix.so
W1121 12:43:59.556635 255 nvc_info.c:403] missing compat32 library libGLX_nvidia.so
W1121 12:43:59.556636 255 nvc_info.c:403] missing compat32 library libEGL_nvidia.so
W1121 12:43:59.556661 255 nvc_info.c:403] missing compat32 library libGLESv2_nvidia.so
W1121 12:43:59.556663 255 nvc_info.c:403] missing compat32 library libGLESv1_CM_nvidia.so
W1121 12:43:59.556664 255 nvc_info.c:403] missing compat32 library libnvidia-glvkspirv.so
W1121 12:43:59.556666 255 nvc_info.c:403] missing compat32 library libnvidia-cbl.so
W1121 12:43:59.556668 255 nvc_info.c:403] missing compat32 library libdxcore.so
I1121 12:43:59.558231 255 nvc_info.c:279] selecting /usr/lib/wsl/drivers/nvrzui.inf_amd64_678ba5fc2cb5319f/nvidia-smi
W1121 12:43:59.940614 255 nvc_info.c:425] missing binary nvidia-debugdump
W1121 12:43:59.940664 255 nvc_info.c:425] missing binary nvidia-persistenced
W1121 12:43:59.940667 255 nvc_info.c:425] missing binary nv-fabricmanager
W1121 12:43:59.940668 255 nvc_info.c:425] missing binary nvidia-cuda-mps-control
W1121 12:43:59.940670 255 nvc_info.c:425] missing binary nvidia-cuda-mps-server
I1121 12:43:59.940671 255 nvc_info.c:441] skipping path lookup for dxcore
I1121 12:43:59.940676 255 nvc_info.c:529] listing device /dev/dxg
W1121 12:43:59.940687 255 nvc_info.c:349] missing ipc path /var/run/nvidia-persistenced/socket
W1121 12:43:59.940734 255 nvc_info.c:349] missing ipc path /var/run/nvidia-fabricmanager/socket
W1121 12:43:59.940774 255 nvc_info.c:349] missing ipc path /tmp/nvidia-mps
I1121 12:43:59.940830 255 nvc_info.c:822] requesting device information with ''
I1121 12:43:59.953624 255 nvc_info.c:694] listing dxcore adapter 0 (GPU-cb92694d-7fb1-fb5b-c3bc-be4c54f8424e at 00000000:01:00.0)
NVRM version: 522.06
CUDA version: 11.8

Device Index: 0
Device Minor: 0
Model: NVIDIA GeForce RTX 3080 Laptop GPU
Brand: GeForce
GPU UUID: GPU-cb92694d-7fb1-fb5b-c3bc-be4c54f8424e
Bus Location: 00000000:01:00.0
Architecture: 8.6
I1121 12:43:59.953694 255 nvc.c:434] shutting down library context
I1121 12:43:59.953783 257 rpc.c:95] terminating nvcgo rpc service
I1121 12:43:59.954280 255 rpc.c:135] nvcgo rpc service terminated successfully
I1121 12:43:59.956440 256 rpc.c:95] terminating driver rpc service
I1121 12:43:59.958507 255 rpc.c:135] driver rpc service terminated successfully
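The debug log above is long, and the relevant detail is easy to miss: libGLX_nvidia.so, the library named in the vulkaninfo error, appears among the "missing library" warnings. A rough sketch for summarizing those warnings is shown here, assuming the log has been saved as text. The function name missing_libraries and the regular expression are illustrative choices, not part of nvidia-container-cli.

```python
import re

# Matches entries such as:
#   W1121 12:43:59.556538 255 nvc_info.c:399] missing library libGLX_nvidia.so
#   W1121 12:43:59.556635 255 nvc_info.c:403] missing compat32 library libGLX_nvidia.so
MISSING_RE = re.compile(r"missing (compat32 )?library (\S+)")


def missing_libraries(log_text: str) -> dict:
    """Split 'missing library' warnings into native and compat32 name lists."""
    result = {"native": [], "compat32": []}
    for match in MISSING_RE.finditer(log_text):
        kind = "compat32" if match.group(1) else "native"
        result[kind].append(match.group(2))
    return result


if __name__ == "__main__":
    # Three entries copied from the log above; the search works whether or
    # not each entry sits on its own line.
    sample = (
        "W1121 12:43:59.556538 255 nvc_info.c:399] missing library libGLX_nvidia.so "
        "W1121 12:43:59.556540 255 nvc_info.c:399] missing library libEGL_nvidia.so "
        "W1121 12:43:59.556635 255 nvc_info.c:403] missing compat32 library libGLX_nvidia.so"
    )
    found = missing_libraries(sample)
    print("native:", found["native"])
    print("compat32:", found["compat32"])
```

The compat32 entries refer to 32-bit variants and are probably less relevant here than the native list.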

Linux PowerOfNames 5.15.74.2-microsoft-standard-WSL2 #1 SMP Wed Nov 2 19:50:29 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux

cli-version: 1.11.0
lib-version: 1.11.0
build date: 2022-09-06T09:21+00:00
build revision: c8f267be0bac1c654d59ad4ea5df907141149977
build compiler: x86_64-linux-gnu-gcc-7 7.5.0
build platform: x86_64
build flags: -D_GNU_SOURCE -D_FORTIFY_SOURCE=2 -DNDEBUG -std=gnu11 -O2 -g -fdata-sections -ffunction-sections -fplan9-extensions -fstack-protector -fno-strict-aliasing -fvisibility=hidden -Wall -Wextra -Wcast-align -Wpointer-arith -Wmissing-prototypes -Wnonnull -Wwrite-strings -Wlogical-op -Wformat=2 -Wmissing-format-attribute -Winit-self -Wshadow -Wstrict-prototypes -Wunreachable-code -Wconversion -Wsign-conversion -Wno-unknown-warning-option -Wno-format-extra-args -Wno-gnu-alignof-expression -Wl,-zrelro -Wl,-znow -Wl,-zdefs -Wl,--gc-sections

Please ask if you need more information, and include specific instructions, as I'm new to this. Thanks!

SheldonWBM commented 11 months ago

I get a similar error when using Docker Desktop instead of Docker Engine. Docker Desktop does not support GPUs at the moment. If you are currently using Docker Desktop (as I am on one of my machines), the GPU is accessible when you run the container as root (e.g. sudo docker compose up), but that uses the daemon directly, bypassing Docker Desktop. Running a Docker container as root is not advisable. It is a shame, because Docker Desktop has nice extensions and I prefer having easier control over the resources used.

Note: You must have nvidia-container-runtime installed.
Note: Add runtime: nvidia to your docker-compose.yml (or pass it as a command-line argument).

The configuration for Docker Desktop is not the same as that of the Docker daemon. Therefore, create the file /etc/docker/daemon.json if it does not exist, and add the following:

{
  "runtimes": {
    "nvidia": {
      "path": "/usr/bin/nvidia-container-runtime",
      "runtimeArgs": []
    }
  }
}
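
If /etc/docker/daemon.json already exists, pasting the snippet over it would discard any other settings in the file. A minimal sketch of merging the runtime entry into an existing configuration is shown here, assuming the file's content has already been parsed into a Python dict. The function name with_nvidia_runtime and the sample "log-driver" setting are illustrative only.

```python
import copy
import json

# Runtime entry taken from the daemon.json snippet above.
NVIDIA_RUNTIME = {
    "path": "/usr/bin/nvidia-container-runtime",
    "runtimeArgs": [],
}


def with_nvidia_runtime(config: dict) -> dict:
    """Return a copy of a daemon.json mapping with the nvidia runtime registered."""
    merged = copy.deepcopy(config)
    # Keep any runtimes that are already configured and add (or overwrite) "nvidia".
    merged.setdefault("runtimes", {})["nvidia"] = copy.deepcopy(NVIDIA_RUNTIME)
    return merged


if __name__ == "__main__":
    # Hypothetical existing configuration; a real file may hold other keys.
    existing = {"log-driver": "json-file"}
    print(json.dumps(with_nvidia_runtime(existing), indent=2))
```

Writing the result back to /etc/docker/daemon.json requires root, and the Docker daemon generally needs a restart before it picks up the change.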

johnhelt commented 9 months ago

I have the same issue. Running the container with sudo in WSL2 doesn't help. What gives?