brokeDude2901 opened this issue 2 years ago (status: Open)
@brokeDude2901 sorry for the delay in getting to you. The mechanism that WSL2 uses to include devices into the container is not the same as for native Linux systems. There is only a single device node (`/dev/dxg`) that is included, and the traditional `NVIDIA_VISIBLE_DEVICES`-based filtering does not work as expected.
We are looking to address this limitation at some point in the future but I don't have a timeline for you.
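To make the mechanism concrete, here is roughly what device enumeration looks like inside a typical WSL2 guest with GPU-P enabled (a sketch; output abbreviated):

```console
# All GPU access in WSL2 goes through a single paravirtualized node,
# no matter how many physical GPUs the Windows host has.
$ ls /dev/dxg
/dev/dxg

# The per-GPU nodes that NVIDIA_VISIBLE_DEVICES filtering relies on
# (/dev/nvidia0, /dev/nvidia1, ...) do not exist here.
$ ls /dev/nvidia* 2>/dev/null || echo "no per-GPU device nodes"
no per-GPU device nodes
```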
Any progress on this? Can containers requiring GPU now be used with Podman + WSL2?
@User-3090 the `v1.13.0-rc.3` release of the NVIDIA Container Toolkit includes support for generating a CDI specification for NVIDIA devices under WSL2. We should be promoting this version to GA in the next day or two.
To use this:

1. Install the `nvidia-container-toolkit-base` package on your WSL2 distribution.
2. Run `sudo nvidia-ctk cdi generate --output=/etc/cdi/nvidia.yaml` (this should auto-detect that you're on a WSL2 system).
3. Run `podman run --device=nvidia.com/gpu=all ubuntu nvidia-smi -L`.
Note that only the `nvidia.com/gpu=all` device is currently available. Once this restriction is lifted, the tooling to generate CDI specifications will be updated to include individual devices.
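For context, the generated file is a standard CDI specification. A trimmed, illustrative sketch of what `/etc/cdi/nvidia.yaml` can look like on WSL2 (exact library paths and mount options are assumptions here and vary per system):

```yaml
# Illustrative sketch only; the real file is produced by
# `nvidia-ctk cdi generate` and differs per driver version.
cdiVersion: 0.3.0
kind: nvidia.com/gpu
devices:
  - name: all              # currently the only device name on WSL2
    containerEdits:
      deviceNodes:
        - path: /dev/dxg   # the single paravirtualized GPU node
      mounts:
        - hostPath: /usr/lib/wsl/lib/libcuda.so.1
          containerPath: /usr/lib/wsl/lib/libcuda.so.1
          options: ["ro", "nosuid", "nodev", "bind"]
```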
Any news on this? Sorry if this is considered necro-bumping, but it's driving me insane on Windows. That's one reason why I can't run Windows properly on one of my higher-end workstation rigs with multiple GPUs without having to configure a ton of things differently.
Also wondering about the current state of this, especially when it comes to filtering for specific devices, as mentioned here:

> Note that only the `nvidia.com/gpu=all` device is currently available. Once this restriction is lifted, the tooling to generate CDI specifications will be updated to include individual devices.
@elezar Do you have any insights here maybe?
On WSL2 with GPU-P, setting the `NVIDIA_VISIBLE_DEVICES` value doesn't work on a system with multiple GPUs (2x RTX A5000).
Command:

```
podman run -it --rm -e NVIDIA_VISIBLE_DEVICES=1 tensorflow/tensorflow:latest-gpu-jupyter nvidia-smi
```
Output:

```
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 510.52       Driver Version: 511.79       CUDA Version: 11.6     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA RTX A5000    On   | 00000000:03:00.0  On |                  Off |
|100%   35C    P8    26W / 207W |   1659MiB / 24564MiB |     12%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   1  NVIDIA RTX A5000    On   | 00000000:04:00.0 Off |                  Off |
|100%   33C    P8    15W / 207W |      0MiB / 24564MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
```
Expected output: only 1 GPU should be shown in `nvidia-smi`.
Not sure whether this is an nvidia-container-runtime problem or a Microsoft WSL2 problem.
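For comparison, on a native Linux host the CDI tooling already enumerates GPUs individually, so per-device selection works there; a hedged sketch (image name and output are illustrative):

```
# Native Linux: the generated spec contains one entry per GPU,
# addressable by index or UUID.
$ sudo nvidia-ctk cdi generate --output=/etc/cdi/nvidia.yaml
$ podman run --rm --device=nvidia.com/gpu=1 ubuntu nvidia-smi -L
GPU 0: NVIDIA RTX A5000 (UUID: GPU-...)   # only the selected GPU is visible

# On WSL2 the spec currently exposes only nvidia.com/gpu=all, so the
# equivalent per-index selection is not yet possible there.
```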