docker / for-win

Bug reports for Docker Desktop for Windows
https://www.docker.com/products/docker#/windows

Stuck process waiting on Fuse #13849

Open lexandera opened 8 months ago

lexandera commented 8 months ago

Description

After switching to a new computer I've been experiencing containerized processes getting stuck a few times per day. The most affected process appears to be Vite/NodeJS.

The output of cat /proc/$PID/stack always looks like this:

[<0>] request_wait_answer+0x15b/0x2b0
[<0>] fuse_simple_request+0x18f/0x2b0
[<0>] fuse_dentry_revalidate+0x138/0x350
[<0>] lookup_fast+0x71/0xe0
[<0>] walk_component+0x1f/0x150
[<0>] link_path_walk.part.0.constprop.0+0x246/0x380
[<0>] path_lookupat+0x3e/0x190
[<0>] filename_lookup+0xed/0x1f0
[<0>] user_path_at_empty+0x3a/0x60
[<0>] do_faccessat+0x11c/0x320
[<0>] do_syscall_64+0x5c/0x90
[<0>] entry_SYSCALL_64_after_hwframe+0x6e/0xd8
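For anyone trying to confirm they are hitting the same hang, here is a minimal sketch that parses a /proc/<pid>/stack dump and checks whether the innermost frame is the FUSE wait shown above. The helper names (parse_frames, is_waiting_on_fuse, stuck_pids) are made up for illustration, and reading /proc/<pid>/stack normally requires root inside the VM or container.

```python
import os
import re

# Matches one frame such as "[<0>] request_wait_answer+0x15b/0x2b0".
FRAME_RE = re.compile(
    r"\[<[0-9a-fx]+>\]\s+([A-Za-z0-9_.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
)


def parse_frames(stack_text):
    """Return the function names in a stack dump, innermost frame first."""
    return FRAME_RE.findall(stack_text)


def is_waiting_on_fuse(stack_text):
    """True if the innermost frame is the FUSE request wait from this report."""
    frames = parse_frames(stack_text)
    return bool(frames) and frames[0] == "request_wait_answer"


def stuck_pids(proc_root="/proc"):
    """List PIDs whose kernel stack currently ends in the FUSE request wait."""
    pids = []
    try:
        entries = os.listdir(proc_root)
    except OSError:
        return pids  # no procfs available (for example, not running on Linux)
    for name in entries:
        if not name.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, name, "stack")) as fh:
                text = fh.read()
        except OSError:
            continue  # process exited, or insufficient privileges
        if is_waiting_on_fuse(text):
            pids.append(int(name))
    return pids
```

Calling stuck_pids() with no arguments scans the live /proc. A process that shows up once may just be in a normal short wait, so it is worth checking that the same PID is still listed a few seconds later.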

I'm using Docker in Hyper-V mode since I need working inotify events inside host mounts (they have not been implemented in WSL2).

Usually the process stays stuck until I restart the VM, but I've seen one case where it got unstuck after a couple of minutes while I was investigating.

Reproduce

So far I can't reproduce the issue on demand. It appears to happen only after a period of inactivity (e.g. immediately after a call).

I've tried disabling various services that might kick in during times of low activity (power saving, Windows antivirus, cloud backup), but have seen no improvement. There is also nothing obvious in the Docker logs, Windows Event Viewer, or the VM's dmesg output.
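Since the hang can't be triggered on demand, one way to catch it is to probe the host mount periodically from inside the container and record when a lookup stops returning. The sketch below is only an illustration: it issues the same access() call that appears in the stack trace (do_faccessat) from a daemon thread and gives up after a timeout. The function name probe_access and the default timeout are assumptions, and a probe thread that does hang stays blocked until the VM recovers.

```python
import os
import threading
import time


def probe_access(path, timeout_s=5.0):
    """Run os.access(path) in a daemon thread.

    Returns the elapsed time in seconds, or None if the call did not
    return within timeout_s (which suggests the lookup is stuck).
    """
    result = {}

    def worker():
        start = time.monotonic()
        os.access(path, os.F_OK)  # same syscall family as do_faccessat
        result["elapsed"] = time.monotonic() - start

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    thread.join(timeout_s)
    return result.get("elapsed")
```

Calling this in a loop with a path on the bind mount (for example a hypothetical /app/package.json) and logging a timestamp whenever it returns None would narrow down when the stall begins relative to the idle period.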

Expected behavior

Filesystem operations should not cause the process to get stuck.

docker version

Client:
 Cloud integration: v1.0.35+desktop.5
 Version:           24.0.7
 API version:       1.43
 Go version:        go1.20.10
 Git commit:        afdd53b
 Built:             Thu Oct 26 09:08:44 2023
 OS/Arch:           windows/amd64
 Context:           default

Server: Docker Desktop 4.26.1 (131620)
 Engine:
  Version:          24.0.7
  API version:      1.43 (minimum version 1.12)
  Go version:       go1.20.10
  Git commit:       311b9ff
  Built:            Thu Oct 26 09:08:02 2023
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.6.25
  GitCommit:        d8f198a4ed8892c764191ef7b3b06d8a2eeb5c7f
 runc:
  Version:          1.1.10
  GitCommit:        v1.1.10-0-g18a0cb0
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0

docker info

Client:
 Version:    24.0.7
 Context:    default
 Debug Mode: false
 Plugins:
  buildx: Docker Buildx (Docker Inc.)
    Version:  v0.12.0-desktop.2
    Path:     C:\Program Files\Docker\cli-plugins\docker-buildx.exe
  compose: Docker Compose (Docker Inc.)
    Version:  v2.23.3-desktop.2
    Path:     C:\Program Files\Docker\cli-plugins\docker-compose.exe
  dev: Docker Dev Environments (Docker Inc.)
    Version:  v0.1.0
    Path:     C:\Program Files\Docker\cli-plugins\docker-dev.exe
  extension: Manages Docker extensions (Docker Inc.)
    Version:  v0.2.21
    Path:     C:\Program Files\Docker\cli-plugins\docker-extension.exe
  feedback: Provide feedback, right in your terminal! (Docker Inc.)
    Version:  0.1
    Path:     C:\Program Files\Docker\cli-plugins\docker-feedback.exe
  init: Creates Docker-related starter files for your project (Docker Inc.)
    Version:  v0.1.0-beta.10
    Path:     C:\Program Files\Docker\cli-plugins\docker-init.exe
  sbom: View the packaged-based Software Bill Of Materials (SBOM) for an image (Anchore Inc.)
    Version:  0.6.0
    Path:     C:\Program Files\Docker\cli-plugins\docker-sbom.exe
  scan: Docker Scan (Docker Inc.)
    Version:  v0.26.0
    Path:     C:\Program Files\Docker\cli-plugins\docker-scan.exe
  scout: Docker Scout (Docker Inc.)
    Version:  v1.2.0
    Path:     C:\Program Files\Docker\cli-plugins\docker-scout.exe

Server:
 Containers: 12
  Running: 12
  Paused: 0
  Stopped: 0
 Images: 12
 Server Version: 24.0.7
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Using metacopy: false
  Native Overlay Diff: true
  userxattr: false
 Logging Driver: json-file
 Cgroup Driver: cgroupfs
 Cgroup Version: 2
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runc.v2 runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: d8f198a4ed8892c764191ef7b3b06d8a2eeb5c7f
 runc version: v1.1.10-0-g18a0cb0
 init version: de40ad0
 Security Options:
  seccomp
   Profile: unconfined
  cgroupns
 Kernel Version: 6.5.11-linuxkit
 Operating System: Docker Desktop
 OSType: linux
 Architecture: x86_64
 CPUs: 4
 Total Memory: 7.764GiB
 Name: docker-desktop
 ID: 3e084de6-adda-45ff-9410-84da6caf13f7
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 HTTP Proxy: http.docker.internal:3128
 HTTPS Proxy: http.docker.internal:3128
 No Proxy: hubproxy.docker.internal
 Experimental: false
 Insecure Registries:
  hubproxy.docker.internal:5555
  127.0.0.0/8
 Live Restore Enabled: false

WARNING: daemon is not using the default seccomp profile

Diagnostics ID

BF8191C3-E265-4D64-AEDE-A7BB6B51A388/20231218212635

Additional Info

The host machine is running Win10 Pro; I've tried assigning the VM various combinations of 4 to 16 cores and 8 to 16GB of RAM. I've also purged all the data once through the Troubleshooting pane, and disabled all scanning and experimental features.

lexandera commented 8 months ago

After trying everything I could possibly think of, I went through my backups and checked the latest version I had on my old computer - it was 4.25.2.

So I downgraded my current machine to that version and I haven't experienced a single issue so far.

Whatever the reason behind the FUSE failures may be, it appears to have been introduced in 4.26.x.
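To check whether a given install falls in the suspected range, here is a small sketch that pulls the Docker Desktop version out of the "Server:" line of docker version output. The 4.26.0 threshold is only the guess from this comment, not a confirmed regression point, and the function names are made up for illustration.

```python
import re

# Assumption taken from this thread: 4.25.2 worked, 4.26.1 did not.
SUSPECT_FROM = (4, 26, 0)


def desktop_version(docker_version_output):
    """Extract (major, minor, patch) from a 'Docker Desktop X.Y.Z' string."""
    match = re.search(r"Docker Desktop (\d+)\.(\d+)\.(\d+)", docker_version_output)
    return tuple(int(part) for part in match.groups()) if match else None


def in_suspect_range(docker_version_output):
    """True if the reported Docker Desktop version is 4.26.0 or later."""
    version = desktop_version(docker_version_output)
    return version is not None and version >= SUSPECT_FROM
```

Feeding it the line from the report, "Server: Docker Desktop 4.26.1 (131620)", puts that install inside the suspected range, while a string containing "Docker Desktop 4.25.2" falls outside it.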