NVIDIA / nvidia-container-toolkit

Build and run containers leveraging NVIDIA GPUs
Apache License 2.0

Applications not using GPU inside docker container #406

Open kmr1wz opened 6 months ago

kmr1wz commented 6 months ago

Hi everyone, I've been trying to get my GPU used by graphical applications running inside a Docker container, but with no success so far. I'm posting this as an issue because I've followed the instructions here to the letter and still failed to make progress.

My setup:

Ubuntu on the host: 20.04.6 LTS
Nvidia driver version: 525.147.05
CUDA version: 12.0
kernel version: 5.15.0-97-generic
nvidia-container-toolkit version: 1.15.0-rc.3
docker version: 24.0.4
nvidia-docker2 version: 2.14.0-1

I've been trying to run the container as follows:

docker run -it --rm --privileged -e DISPLAY=$DISPLAY --runtime=nvidia --gpus all -v /tmp/.X11-unix:/tmp/.X11-unix nvidia/cuda:11.6.2-base-ubuntu20.04 bash

after which, when I run nvidia-smi I do get the expected output:

Tue Mar 12 15:10:54 2024       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.147.05   Driver Version: 525.147.05   CUDA Version: 12.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA RTX A300...  Off  | 00000000:01:00.0 Off |                  N/A |
| N/A   53C    P8    13W /  80W |    552MiB /  6144MiB |     17%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
+-----------------------------------------------------------------------------+

However, when I install glmark2 and run it, it does not utilize the GPU at all. Moreover, after installing nvidia-settings and nvidia-prime, I do not see an option to switch NVIDIA PRIME to performance mode in nvidia-settings (I should note that after switching that on my host, the GPU started being utilized there).

Does anyone have any ideas on what is going on and what I might be doing wrong? I'd appreciate any help, I'm running out of ideas here.

Thanks in advance, Michal

Queequeg92 commented 6 months ago

@kmr1wz same issue. Have you found the solution? https://github.com/NVIDIA/nvidia-container-toolkit/issues/426

kmr1wz commented 6 months ago

Unfortunately no fix yet - will post here if I figure anything out. Would appreciate anyone else having any advice too.

elezar commented 6 months ago

Note that the graphics or display libraries are only injected if NVIDIA_DRIVER_CAPABILITIES includes display and / or graphics. Could you try running the container with -e NVIDIA_DRIVER_CAPABILITIES=all?
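
To see whether this applies to a given container, the value the container was started with can be inspected from inside it. The following is only a sketch: caps_include_graphics is a hypothetical helper, not part of the toolkit, and it assumes the value is a plain comma-separated list without spaces. As far as I know the nvidia/cuda base images set the variable to compute,utility, which would explain nvidia-smi working while OpenGL applications do not.

```shell
#!/bin/sh
# Hypothetical helper (not part of nvidia-container-toolkit): succeeds if a
# comma-separated NVIDIA_DRIVER_CAPABILITIES value contains "graphics",
# "display", or "all".
caps_include_graphics() {
  case ",$1," in
    *,all,*|*,graphics,*|*,display,*) return 0 ;;
    *) return 1 ;;
  esac
}

# Run inside the container: report what the current value implies.
if caps_include_graphics "${NVIDIA_DRIVER_CAPABILITIES:-}"; then
  echo "graphics/display capabilities requested"
else
  echo "graphics/display capabilities NOT requested"
fi
```

If the second message is printed, restarting the container with graphics and / or display (or all) in the variable is the thing to try.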

kmr1wz commented 6 months ago

@elezar Wow, that was literally it. Now glmark2 is using the driver from within the container. Thanks so much man!

For reference, full command I'm running (with success):

docker run -it --rm --privileged -e DISPLAY=$DISPLAY -e NVIDIA_DRIVER_CAPABILITIES=all --runtime=nvidia --gpus all -v /tmp/.X11-unix:/tmp/.X11-unix nvidia/cuda:11.6.2-base-ubuntu20.04 bash
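
For anyone who would rather not request every capability, a narrower list may be enough. This is an untested sketch, not something confirmed in this thread: it assumes graphics,display,utility,compute covers what glmark2 needs. print_gpu_run_cmd is a hypothetical helper that only prints the command so it can be reviewed before running, and the :0 fallback for DISPLAY is likewise an assumption.

```shell
#!/bin/sh
# Assumption: a narrower list than "all"; not verified in this thread.
CAPS="graphics,display,utility,compute"

# Hypothetical helper: prints the docker command for review instead of
# running it. Falls back to DISPLAY=:0 if DISPLAY is unset (an assumption).
print_gpu_run_cmd() {
  echo "docker run -it --rm --privileged" \
    "-e DISPLAY=${DISPLAY:-:0} -e NVIDIA_DRIVER_CAPABILITIES=$CAPS" \
    "--runtime=nvidia --gpus all" \
    "-v /tmp/.X11-unix:/tmp/.X11-unix" \
    "nvidia/cuda:11.6.2-base-ubuntu20.04 bash"
}

print_gpu_run_cmd
```

If glmark2 still falls back to software rendering with this list, going back to NVIDIA_DRIVER_CAPABILITIES=all as above is the known-good option.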