NVIDIA / nvidia-container-toolkit

Build and run containers leveraging NVIDIA GPUs
Apache License 2.0

WSL2 with GPU-P, NVIDIA_VISIBLE_DEVICES value doesn't work #70

Open brokeDude2901 opened 2 years ago

brokeDude2901 commented 2 years ago

On WSL2 with GPU-P, setting the NVIDIA_VISIBLE_DEVICES value has no effect on a system with multiple GPUs (2x RTX A5000).

Command: podman run -it --rm -e NVIDIA_VISIBLE_DEVICES=1 tensorflow/tensorflow:latest-gpu-jupyter nvidia-smi

Output:

    +-----------------------------------------------------------------------------+
    | NVIDIA-SMI 510.52       Driver Version: 511.79       CUDA Version: 11.6     |
    |-------------------------------+----------------------+----------------------+
    | GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
    | Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
    |                               |                      |               MIG M. |
    |===============================+======================+======================|
    |   0  NVIDIA RTX A5000    On   | 00000000:03:00.0  On |                  Off |
    |100%   35C    P8    26W / 207W |   1659MiB / 24564MiB |     12%      Default |
    |                               |                      |                  N/A |
    +-------------------------------+----------------------+----------------------+
    |   1  NVIDIA RTX A5000    On   | 00000000:04:00.0 Off |                  Off |
    |100%   33C    P8    15W / 207W |      0MiB / 24564MiB |      0%      Default |
    |                               |                      |                  N/A |
    +-------------------------------+----------------------+----------------------+

    +-----------------------------------------------------------------------------+
    | Processes:                                                                  |
    |  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
    |        ID   ID                                                   Usage      |
    |=============================================================================|
    |  No running processes found                                                 |
    +-----------------------------------------------------------------------------+

Expected output: nvidia-smi should show only 1 GPU.

Not sure whether this is an nvidia-container-runtime problem or a Microsoft WSL2 problem.

elezar commented 2 years ago

@brokeDude2901 sorry for the delay in getting back to you. The mechanism that WSL2 uses to expose devices to the container is not the same as on native Linux systems. Only a single device node (/dev/dxg) is included, so the traditional NVIDIA_VISIBLE_DEVICES-based filtering does not work as expected.
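
A quick way to see this is to check which device nodes actually exist inside the container. This is only an illustrative sketch reusing the image from the report above, not an official diagnostic:

    # On native Linux, per-GPU nodes such as /dev/nvidia0 and /dev/nvidia1 are mounted
    # into the container, and that is what NVIDIA_VISIBLE_DEVICES filters on.
    podman run --rm -e NVIDIA_VISIBLE_DEVICES=1 \
        tensorflow/tensorflow:latest-gpu-jupyter \
        ls /dev/dxg /dev/nvidia0 /dev/nvidia1
    # Expected on WSL2: only /dev/dxg exists; the per-GPU /dev/nvidiaN nodes are missing,
    # so there is nothing for the visible-devices filter to act on and nvidia-smi still
    # reports both GPUs.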

We are looking to address this limitation at some point in the future, but I don't have a timeline for you.

User-3090 commented 1 year ago

Any progress on this? Can containers requiring a GPU now be used with Podman + WSL2?

elezar commented 1 year ago

> Any progress on this? Can containers requiring a GPU now be used with Podman + WSL2?

@User-3090 v1.13.0-rc.3 of the NVIDIA Container Toolkit includes support for generating a CDI specification for NVIDIA devices under WSL2. We should be promoting this version to GA in the next day or two.

To use this:

  1. Install the nvidia-container-toolkit-base package on your WSL2 distribution.
  2. Run sudo nvidia-ctk cdi generate --output=/etc/cdi/nvidia.yaml (this should auto-detect that you're on a WSL2 system).
  3. Request the available device(s) under podman: podman run --device=nvidia.com/gpu=all ubuntu nvidia-smi -L

Note that only the nvidia.com/gpu=all device is currently available. Once this restriction is lifted, the tooling to generate CDI specifications will be updated to include individual devices.
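
Put together, the steps above look roughly like this on a Debian/Ubuntu-based WSL2 distribution (the apt-get install line assumes the NVIDIA package repository is already configured; adjust it for your distribution):

    # 1. Install the base toolkit package
    sudo apt-get install -y nvidia-container-toolkit-base

    # 2. Generate the CDI specification; WSL2 should be auto-detected
    sudo nvidia-ctk cdi generate --output=/etc/cdi/nvidia.yaml

    # 3. Request the GPUs via CDI (only the composite "all" device is available for now)
    podman run --rm --device=nvidia.com/gpu=all ubuntu nvidia-smi -L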

Tsubajashi commented 1 month ago

Any news on this? Sorry if this is considered necro-bumping, but it's driving me insane on Windows. It's one reason why I can't properly use Windows on one of my higher-end workstation rigs with multiple GPUs without having to configure a ton of things differently.