docker / for-win

Bug reports for Docker Desktop for Windows
https://www.docker.com/products/docker#/windows

Cannot run docker images with GPU support after upgrading to 4.17.1 (101757) #13333

Closed peterbandi closed 1 year ago

peterbandi commented 1 year ago

Actual behavior

Upgrading to Docker Desktop 4.17.1 (101757) breaks NVIDIA GPU support on WSL2. After the upgrade from 4.17.0 to 4.17.1, starting a container with the --gpus=all option makes the container hang. It is not possible to interact with it or exit it. The container is not shown in Docker Desktop or by the docker container list or docker ps commands.

The symptoms are the same when the container is started with docker compose: it gets stuck in a state where none of the docker commands mentioned above list it. Docker Desktop shows the container in orange in the swarm, in the "Created" state, even after the swarm is shut down. It cannot be shut down or deleted without restarting Docker Desktop.

Expected behavior

Containers started with attached GPUs (--gpus option) should run as they did before the upgrade.

Information

Output of & "C:\Program Files\Docker\Docker\resources\com.docker.diagnose.exe" check

[2023-03-24T15:09:35.913405700Z][com.docker.diagnose.exe][I] set path configuration to OnHost
Starting diagnostics

[PASS] DD0027: is there available disk space on the host?
[PASS] DD0028: is there available VM disk space?
[PASS] DD0002: does the bootloader have virtualization enabled?
[SKIP] DD0018: does the host support virtualization?
[PASS] DD0001: is the application running?
[PASS] DD0022: is the Virtual Machine Platform Windows Feature enabled?
[PASS] DD0021: is the WSL 2 Windows Feature enabled?
[PASS] DD0024: is WSL installed?
[PASS] DD0025: are WSL distros installed?
[PASS] DD0026: is the WSL LxssManager service running?
[PASS] DD0029: is the WSL 2 Linux filesystem corrupt?
[PASS] DD0035: is the VM time synchronized?
[PASS] DD0017: can a VM be started?
[PASS] DD0016: is the LinuxKit VM running?
[PASS] DD0011: are the LinuxKit services running?
[PASS] DD0004: is the Docker engine running?
[PASS] DD0015: are the binary symlinks installed?
[PASS] DD0031: does the Docker API work?
[PASS] DD0013: is the $PATH ok?
[PASS] DD0003: is the Docker CLI working?
[PASS] DD0005: is the user in the docker-users group?
[PASS] DD0038: is the connection to Docker working?
[PASS] DD0014: are the backend processes running?
[PASS] DD0007: is the backend responding?
[PASS] DD0008: is the native API responding?
[PASS] DD0009: is the vpnkit API responding?
[PASS] DD0010: is the Docker API proxy responding?
[PASS] DD0006: is the Docker Desktop Service responding?
[SKIP] DD0030: is the image access management authorized?
[PASS] DD0033: does the host have Internet access?
[PASS] DD0002: does the bootloader have virtualization enabled?
[PASS] DD0018: does the host support virtualization?
[PASS] DD0001: is the application running?
[PASS] DD0022: is the Virtual Machine Platform Windows Feature enabled?
[PASS] DD0021: is the WSL 2 Windows Feature enabled?
[PASS] DD0024: is WSL installed?
[PASS] DD0025: are WSL distros installed?
[PASS] DD0026: is the WSL LxssManager service running?
[PASS] DD0029: is the WSL 2 Linux filesystem corrupt?
[PASS] DD0035: is the VM time synchronized?
[PASS] DD0017: can a VM be started?
[PASS] DD0016: is the LinuxKit VM running?
[PASS] DD0011: are the LinuxKit services running?
[PASS] DD0004: is the Docker engine running?
[PASS] DD0015: are the binary symlinks installed?
[PASS] DD0031: does the Docker API work?
[PASS] DD0032: do Docker networks overlap with host IPs?
No fatal errors detected.
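
Since every check above reports PASS or SKIP, the diagnostics do not point at the GPU problem. For longer outputs, a small script can pull out the checks that did not pass. The following is a minimal sketch, assuming the output keeps the `[STATUS] DDnnnn: question` line shape seen above; the names `CHECK_LINE`, `SAMPLE`, and `summarize` are illustrative and not part of any Docker tooling.

```python
import re
from collections import Counter

# Assumed line shape, taken from the output above: "[PASS] DD0027: question?"
# The timestamped log line starts with "[2023-..." and does not match.
CHECK_LINE = re.compile(r"^\[([A-Z]+)\]\s+(DD\d+):\s+(.*)$")

# A shortened sample copied from the diagnostics output in this report.
SAMPLE = """\
[2023-03-24T15:09:35.913405700Z][com.docker.diagnose.exe][I] set path configuration to OnHost
Starting diagnostics

[PASS] DD0027: is there available disk space on the host?
[SKIP] DD0018: does the host support virtualization?
[PASS] DD0001: is the application running?
No fatal errors detected.
"""


def summarize(output):
    """Return (status counts, list of checks whose status is not PASS)."""
    checks = []
    for line in output.splitlines():
        match = CHECK_LINE.match(line.strip())
        if match:
            checks.append(match.groups())
    counts = Counter(status for status, _, _ in checks)
    not_passed = [check for check in checks if check[0] != "PASS"]
    return counts, not_passed
```

Calling `summarize(SAMPLE)` yields the per-status counts and the single skipped check, DD0018.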

Steps to reproduce the behavior

  1. Run: docker pull nvidia/cuda:11.8.0-base-ubuntu20.04
  2. Run: docker run --interactive --tty --rm nvidia/cuda:11.8.0-base-ubuntu20.04 echo "Hello"
  3. See that it outputs Hello and exits
  4. Run: docker run --interactive --tty --rm --gpus=all nvidia/cuda:11.8.0-base-ubuntu20.04 echo "Hello"
  5. See that
    • It hangs, and it is no longer possible to interact with the container
    • It cannot be stopped with Ctrl+C
    • No running container is listed in Docker Desktop
    • No running container is listed with docker container list command
    • The only way to close it is to close the terminal window that it is started from
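
Because the failing command never returns, the steps above can also be scripted with a timeout so the check ends on its own. This is a rough sketch, not a verified reproduction: it drops --interactive and --tty because a script has no terminal to attach, and it assumes the hang still occurs without them. The helper names `build_command` and `run_with_timeout` are hypothetical.

```python
import subprocess

# Image tag taken from the reproduction steps above.
IMAGE = "nvidia/cuda:11.8.0-base-ubuntu20.04"


def build_command(gpus=None):
    """Build the docker run command from the steps, with or without --gpus."""
    cmd = ["docker", "run", "--rm"]
    if gpus is not None:
        cmd.append(f"--gpus={gpus}")
    cmd += [IMAGE, "echo", "Hello"]
    return cmd


def run_with_timeout(cmd, seconds):
    """Run cmd and classify the outcome as 'ok', 'failed', or 'hang'.

    On timeout, subprocess.run kills only the local client process;
    it does not clean up whatever state the Docker daemon is left in.
    """
    try:
        result = subprocess.run(
            cmd, capture_output=True, text=True, timeout=seconds
        )
    except subprocess.TimeoutExpired:
        return "hang"
    except OSError:
        # For example, the docker executable is not on PATH.
        return "failed"
    return "ok" if result.returncode == 0 else "failed"
```

With Docker available, `run_with_timeout(build_command(), 60)` corresponds to step 2 and `run_with_timeout(build_command("all"), 60)` to step 4; the 60-second limit is an arbitrary choice. On an affected installation the second call would be expected to return "hang".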

peterbandi commented 1 year ago

There is already an existing ticket for this issue: https://github.com/docker/for-win/issues/13324

docker-robott commented 1 year ago

Closed issues are locked after 30 days of inactivity. This helps our team focus on active issues.

If you have found a problem that seems similar to this, please open a new issue.

/lifecycle locked