mviereck / x11docker

Run GUI applications and desktops in docker and podman containers. Focus on security.
MIT License

Support for Docker's --gpus option #180

Closed: schra closed this issue 5 years ago

schra commented 5 years ago

Docker 19.03 introduced the --gpus option (docker/cli#1714), which x11docker could make use of. Currently you have to pass the --gpus option manually:

$ x11docker --gpu --quiet -- nvidia/cuda:9.0-base nvidia-smi
[FATAL tini (97)] exec nvidia-smi failed: No such file or directory

$ x11docker --quiet -- --gpus all -- nvidia/cuda:9.0-base nvidia-smi
Wed Aug 21 14:52:16 2019       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 430.40       Driver Version: 430.40       CUDA Version: 10.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 980 Ti  Off  | 00000000:04:00.0  On |                  N/A |
|  0%   52C    P8    19W / 250W |    849MiB /  6080MiB |      1%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
+-----------------------------------------------------------------------------+

Would it be possible to extend the --gpu option to check whether the --gpus option works and then use it, or otherwise fall back to the current solution?
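
A minimal sketch of the first half of such a check, assuming that "Docker is 19.03 or newer" is a usable first test for whether --gpus exists at all. The function name docker_has_gpus_option is hypothetical and not part of x11docker; it only parses a version string and does not call docker itself.

```shell
# Hypothetical helper (not part of x11docker): succeeds if the given Docker
# version string is 19.03 or newer, the release that introduced --gpus.
docker_has_gpus_option() {
  version="$1"
  # Require at least "MAJOR.MINOR"; anything else counts as unsupported.
  case "$version" in
    *.*) ;;
    *) return 1 ;;
  esac
  major="${version%%.*}"
  rest="${version#*.}"
  minor="${rest%%.*}"
  # Cut a non-numeric suffix such as "-ce" off the minor component.
  minor="${minor%%[!0-9]*}"
  # Reject empty or non-numeric components.
  case "$major" in
    ''|*[!0-9]*) return 1 ;;
  esac
  [ -n "$minor" ] || return 1
  # Drop one leading zero so "03" compares as 3; a plain "0" stays 0.
  minor="${minor#0}"
  [ -n "$minor" ] || minor=0
  [ "$major" -gt 19 ] && return 0
  [ "$major" -eq 19 ] && [ "$minor" -ge 3 ]
}
```

A caller could pass in the daemon version, for example docker_has_gpus_option "$(docker version --format '{{.Server.Version}}')". A new enough Docker only means the option is recognized. It does not show that a GPU device driver is available to the daemon, so a fallback to the current --gpu behaviour would still be needed when docker run --gpus all fails.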

I'm btw not really sure why nvidia-smi isn't found when using x11docker's --gpu option, but somehow it's missing:

$ diff <(x11docker --gpu --quiet -- nvidia/cuda:9.0-base ls /usr/bin) \
       <(x11docker --quiet -- --gpus all -- nvidia/cuda:9.0-base ls /usr/bin)
rm: cannot remove '/tmp/x11docker_parsererror': No such file or directory
126a127,131
> nvidia-cuda-mps-control
> nvidia-cuda-mps-server
> nvidia-debugdump
> nvidia-persistenced
> nvidia-smi

Also see https://github.com/NVIDIA/nvidia-docker#quickstart

mviereck commented 5 years ago

Thank you for your suggestion! I was not aware of this new docker option.

> Would it be possible to extend the --gpu option to check whether the --gpus option works and then use it, or otherwise fall back to the current solution?

In general it would be possible. There is one quite annoying point, though: the --gpus option works for NVIDIA hardware only. It looks like ugly and unethical vendor lock-in.

So far the new option is worth less than the previous --runtime=nvidia solution.

With my AMD GPU docker fails with:

docker: Error response from daemon: could not select device driver "" with capabilities: [[gpu]].

To be a serious option, docker must support other vendors, too.


> I'm btw not really sure why nvidia-smi isn't found when using x11docker's --gpu option, but somehow it's missing:

It seems the nvidia-smi command is not part of the nvidia/cuda:9.0-base image itself; the --gpus all option adds it to the container at runtime. Without that option, plain docker fails as well:

$ docker run --rm nvidia/cuda:9.0-base nvidia-smi
docker: Error response from daemon: OCI runtime create failed: container_linux.go:345: starting container process caused "exec: \"nvidia-smi\": executable file not found in $PATH": unknown.

To compare with x11docker, use the image nvidia/cuda instead:

$ x11docker --gpu --runtime=nvidia -- nvidia/cuda nvidia-smi

mviereck commented 5 years ago

I have opened a new ticket at docker to discuss general GPU support by --gpus: https://github.com/docker/cli/issues/2063

mviereck commented 5 years ago

I don't see progress in the ticket mentioned above. As long as --gpus adds no value to the current solutions, x11docker won't use it.

Current solutions for using the closed-source NVIDIA driver with x11docker are described in the wiki: https://github.com/mviereck/x11docker/wiki/NVIDIA-driver-support-for-docker-container