Open ereslibre opened 3 weeks ago
cc/ @chmanie @benxiao
Identified the problem and submitted https://github.com/moby/moby/pull/48541 upstream.
I'll create a PR on NixOS to include this patch in the meantime and gather feedback. I can confirm I am able to use NVIDIA GPUs with CDI in rootless mode:
❯ DOCKER_HOST=unix:///run/user/1000/docker.sock docker run --rm --device=nvidia.com/gpu=all -it ubuntu:latest nvidia-smi -L
GPU 0: NVIDIA GeForce RTX 4090 (UUID: GPU-c475e08b-0cc5-f5aa-4326-99699429b449)
GPU 1: NVIDIA GeForce RTX 2080 SUPER (UUID: GPU-5cca1a6f-7cee-b649-40f0-2d3ecb0aa207)
Describe the bug
When Docker is run in rootless mode, CDI devices are not exposed to the container.
Steps To Reproduce
Steps to reproduce the behavior:
nvidia-container-toolkit
:nvidia-container-toolkit
in a container:
Additional context
Podman works as expected and reads the JSON at /var/run/cdi/nvidia-container-toolkit.json generated by the nvidia-container-toolkit when hardware.nvidia-container-toolkit.enable = true; is set.
This issue has two main problems at this time:
1. virtualisation.docker.rootless.daemon.settings.features.cdi = true; is not set by hardware.nvidia-container-toolkit.enable = true; when virtualisation.docker.rootless.enable is true.
2. Docker rootless upstream does not read CDI specifications from /etc/cdi nor /var/run/cdi as expected. This was first reported at https://github.com/NVIDIA/nvidia-container-toolkit/issues/434 and then at https://github.com/moby/moby/issues/47676. It does read from $HOME/.docker/run/cdi though.
This issue was split from https://github.com/NixOS/nixpkgs/issues/337873#issuecomment-2332332343.
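For reference, the CDI feature flag that is not set automatically can be enabled by hand. This is a minimal sketch of the relevant NixOS configuration, using only the options named in this issue. It assumes the NVIDIA driver itself is already configured elsewhere, and it only covers the missing feature flag, not the upstream spec-directory problem:

```nix
{
  # Generates the CDI specification for the NVIDIA devices.
  hardware.nvidia-container-toolkit.enable = true;

  # Run the Docker daemon in rootless mode.
  virtualisation.docker.rootless.enable = true;

  # Not implied by the toolkit option at the time of this issue,
  # so the CDI feature is enabled explicitly for the rootless daemon.
  virtualisation.docker.rootless.daemon.settings.features.cdi = true;
}
```

With only this change the rootless daemon has CDI enabled, but devices may still not be exposed until the spec directories are readable by it.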
Add a :+1: reaction to issues you find important.