docker / for-mac

Bug reports for Docker Desktop for Mac
https://www.docker.com/products/docker#/mac

Docker desktop and engine hang with high CPU usage after computer wakes up from sleep #6655

Open sknop opened 1 year ago

sknop commented 1 year ago

Expected behaviour

Docker desktop does not deadlock when the computer wakes up from a sleep state, as is common for a laptop.

Actual behaviour

When the laptop wakes up from sleep (moving a mouse, pressing a key), Docker starts consuming 100%-200% CPU and is utterly unresponsive. All docker commands fail, the docker desktop itself is unresponsive and cannot be quit either. The only way I could get the Docker Desktop app to stop is to use Force Quit on my Mac.

Information

This problem appeared after the latest Docker update. I ran various diagnostics while Docker was in that state. When the diagnostics tool hung, I stopped it with Control-C, which in this instance actually woke Docker from its deadlock. I therefore assume that a single thread is deadlocked and that some activity can unblock it, but I have not been able to verify this consistently.

The problem is consistently reproducible: if I force quit the Docker app, restart it, and put the laptop to sleep, Docker hangs every time I reawaken the laptop. I did not notice this at first, until I heard the fan spinning and saw my CPU meter showing a sustained load.
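Because every docker command blocks while Docker is in this state, any check for the hang needs its own time limit. The following is a minimal sketch, not part of the original report: `run_with_timeout` and `docker_responsive` are hypothetical helper names, and the 10-second default is an arbitrary assumption.

```shell
#!/bin/sh
# Hypothetical helper: run a command, but give up after $1 seconds.
# Written in plain sh because macOS does not ship a `timeout` utility by default.
run_with_timeout() {
    limit=$1
    shift
    "$@" >/dev/null 2>&1 &
    cmd_pid=$!
    # Watchdog: kill the command if it is still running after $limit seconds.
    ( sleep "$limit"; kill "$cmd_pid" 2>/dev/null ) >/dev/null 2>&1 &
    watchdog_pid=$!
    wait "$cmd_pid"
    status=$?
    # The command has ended (or was killed); the watchdog is no longer needed.
    kill "$watchdog_pid" 2>/dev/null
    wait "$watchdog_pid" 2>/dev/null
    return "$status"
}

# Hypothetical probe: succeeds only if `docker info` answers within the limit.
docker_responsive() {
    run_with_timeout "${1:-10}" docker info
}
```

After waking the laptop, `docker_responsive 10 || echo "Docker looks hung"` would report the state without blocking the terminal indefinitely.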

Output of /Applications/Docker.app/Contents/MacOS/com.docker.diagnose check

Starting diagnostics

[PASS] DD0027: is there available disk space on the host?
[PASS] DD0028: is there available VM disk space?
[PASS] DD0018: does the host support virtualization?
[PASS] DD0001: is the application running?
[PASS] DD0017: can a VM be started?
[PASS] DD0016: is the LinuxKit VM running?
[PASS] DD0011: are the LinuxKit services running?
[PASS] DD0004: is the Docker engine running?
[PASS] DD0015: are the binary symlinks installed?
[PASS] DD0031: does the Docker API work?
[PASS] DD0013: is the $PATH ok?
[PASS] DD0003: is the Docker CLI working?
[PASS] DD0014: are the backend processes running?
[PASS] DD0007: is the backend responding?
[PASS] DD0008: is the native API responding?
[PASS] DD0009: is the vpnkit API responding?
[PASS] DD0010: is the Docker API proxy responding?
[FAIL] DD0012: is the VM networking working? network checks failed: failed to ping host: exit status 1
[2022-12-30T19:56:55.112506000Z][com.docker.diagnose][I] ipc.NewClient: 35900a4e-diagnose-network -> diagnosticd.sock diagnosticsd
[common/pkg/diagkit/gather/diagnose.runIsVMNetworkingOK()
[ common/pkg/diagkit/gather/diagnose/network.go:34 +0xd9
[common/pkg/diagkit/gather/diagnose.(*test).GetResult(0x100d5bbc0)
[ common/pkg/diagkit/gather/diagnose/test.go:46 +0x43
[common/pkg/diagkit/gather/diagnose.Run.func1(0x100d5bbc0)
[ common/pkg/diagkit/gather/diagnose/run.go:17 +0x5a
[common/pkg/diagkit/gather/diagnose.walkOnce.func1(0x2?, 0x100d5bbc0)
[ common/pkg/diagkit/gather/diagnose/run.go:142 +0x77
[common/pkg/diagkit/gather/diagnose.walkDepthFirst(0x1, 0x100d5bbc0, 0xc0005d3720)
[ common/pkg/diagkit/gather/diagnose/run.go:151 +0x87
[common/pkg/diagkit/gather/diagnose.walkDepthFirst(0x0, 0x100d5bd40, 0xc0005d3720)
[ common/pkg/diagkit/gather/diagnose/run.go:148 +0x52
[common/pkg/diagkit/gather/diagnose.walkOnce(0x10070dba0?, 0xc00059f888)
[ common/pkg/diagkit/gather/diagnose/run.go:137 +0xcc
[common/pkg/diagkit/gather/diagnose.Run(0x100d5bd40, 0x600?, {0xc00059fb18, 0x1, 0x1})
[ common/pkg/diagkit/gather/diagnose/run.go:16 +0x1d4
[main.checkCmd({0xc0001b8010?, 0x6?, 0x4?}, {0x0, 0x0})
[ common/cmd/com.docker.diagnose/main.go:133 +0x105
[main.main()
[ common/cmd/com.docker.diagnose/main.go:99 +0x2a7
[2022-12-30T19:56:55.113557000Z][com.docker.diagnose][I] (99e8e891) 35900a4e-diagnose-network C->S diagnosticsd POST /check-network-connectivity: {"ips":["192.168.2.82","192.168.2.85"]}
[2022-12-30T19:56:55.633103000Z][com.docker.diagnose][W] (99e8e891) 35900a4e-diagnose-network C<-S a967aa94-diagnosticsd POST /check-network-connectivity (519.737145ms): failed to ping host: exit status 1

[SKIP] DD0030: is the image access management authorized?
[PASS] DD0019: is the com.docker.vmnetd process responding?
[PASS] DD0033: does the host have Internet access?
[PASS] DD0018: does the host support virtualization?
[PASS] DD0001: is the application running?
[PASS] DD0017: can a VM be started?
[PASS] DD0016: is the LinuxKit VM running?
[PASS] DD0011: are the LinuxKit services running?
[PASS] DD0004: is the Docker engine running?
[PASS] DD0015: are the binary symlinks installed?
[PASS] DD0031: does the Docker API work?
[PASS] DD0032: do Docker networks overlap with host IPs?

Please investigate the following 1 issue:

1 : The test: is the VM networking working? Failed with: network checks failed: failed to ping host: exit status 1

VM seems to have a network connectivity issue. Check your host firewall and anti-virus settings in case they are blocking the VM.

(I have hidden network mode - blocking ICMP calls - enabled as enforced by my employer)
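Since the diagnose output is long and only one check fails here, a small filter can pull out the failing checks. This is a sketch under the assumption that the tool prints one `[STATUS] DDnnnn: ...` result per line; `failed_checks` is a hypothetical name, not part of Docker's tooling.

```shell
#!/bin/sh
# Print only the failing checks from diagnose output supplied on stdin.
# Assumption: each result starts a line as "[FAIL] DDnnnn: ...".
failed_checks() {
    # `|| true` keeps the exit status at 0 when nothing failed.
    grep -E '^\[FAIL\] DD[0-9]+:' || true
}

# Example invocation (not run here):
#   /Applications/Docker.app/Contents/MacOS/com.docker.diagnose check 2>&1 | failed_checks
```

For the output above this would leave only the DD0012 line, which the blocked ICMP traffic would plausibly explain on its own.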

Steps to reproduce the behaviour

kneden commented 1 year ago

I have seen this several times in the past week or so. Initially with DD 4.15.0 & macOS 12.6.1, but now with DD 4.16.1 & macOS 12.6.2 (updated both in the hopes of making the issue go away). The last time it happened, I had no containers running.

When it occurs, the 'Virtual Machine Service' process consumes anywhere from 75% - 225% CPU (Activity Monitor).
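The same observation can be made from a terminal instead of Activity Monitor. This is a rough sketch: `busy_processes` is a hypothetical helper, and the pattern `Virtual` is only a guess at how the process name appears in `ps` output.

```shell
#!/bin/sh
# Filter "pcpu command" lines on stdin, keeping lines that contain the
# text in $1 and whose first column (CPU %) is at least $2.
busy_processes() {
    awk -v pat="$1" -v min="$2" '
        index($0, pat) && ($1 + 0) >= (min + 0) { print }
    '
}

# Example invocation (not run here); the pattern is an assumption:
#   ps -Ao pcpu,comm | busy_processes Virtual 75
```

The `ps` header line is dropped automatically, because its first column is not a number and therefore compares as 0.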

I am able to get out of the state and back to sane using these steps:

  1. 'Restart' from DD menu bar menu.
  2. Wait a few minutes for the CPU usage to drop. Haven't thoroughly tested, but I think the Virtual Machine Service process exits, rather than just consuming less CPU.
  3. 'Quit' from the DD menu bar menu.
  4. Re-launch DD application.

raulfragoso commented 1 year ago

I've been experiencing the same issue for the last few weeks, now with MacOS Ventura 13.1, which I upgraded from Monterey with the hope of correcting the problem that was also present before.

The solution is always the same: quit DD from the menu bar, kill the docker process in the terminal with 'pkill docker', then restart DD and start the container/s that were already running when the computer went to sleep.
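The sequence above could be scripted roughly as follows. This is only a sketch: the `osascript` and `open` calls assume the app is registered under the name "Docker", `run` and `restart_docker_desktop` are hypothetical names, and the script defaults to a dry run that just prints the commands.

```shell
#!/bin/sh
# Dry run by default: set DRY_RUN=0 to actually execute the commands.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

restart_docker_desktop() {
    # Ask the app to quit (this may have no effect while it is hung).
    run osascript -e 'quit app "Docker"'
    # Kill remaining processes whose name matches "docker", as in the comment above.
    run pkill docker
    # Relaunch Docker Desktop (assumes the app name is "Docker").
    run open -a Docker
}
```

Containers that were running before the sleep still have to be started again by hand afterwards.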

olance commented 1 year ago

I've had the same issue and found that it does not happen anymore by disabling the virtualization framework.

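So it's worth a try:

  • Go to Docker settings
  • Set the file sharing to FuseFS
  • uncheck the virtualization framework just above
  • apply & restart Docker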

In my case, Docker would completely hang after even a few seconds of system sleep. After changing those settings it does not happen anymore!

raulfragoso commented 1 year ago

> I've had the same issue and found that it does not happen anymore by disabling the virtualization framework.
>
> So it's worth a try:
>
>   • Go to Docker settings
>   • Set the file sharing to FuseFS
>   • uncheck the virtualization framework just above
>   • apply & restart Docker
>
> In my case, Docker would completely hang after even a few seconds of system sleep. After changing those settings it does not happen anymore!

That fixed the issue! After changing the file sharing to gRPC FUSE and disabling the Virtualization framework, I can now put my Mac to sleep and the container will be properly running when it's awake again. Thank you!

olance commented 1 year ago

> That fixed the issue! After changing the file sharing to gRPC FUSE and disabling the Virtualization framework, I can now put my Mac to sleep and the container will be properly running when it's awake again. Thank you!

Great! 😁 You're very welcome, this was driving me crazy as well!

sallespro commented 1 year ago

I was having this issue with Docker version 20.10.23, build 7155243, running on macOS Ventura 13.2.1.

rkettelerij commented 1 year ago

Can confirm: I had this issue on macOS Monterey on an Intel Mac after a recent Docker upgrade (to 20.10.23, commit 7155243). The workaround described in this issue (disabling the virtualisation framework) works.

sneko commented 1 year ago

Tried the workaround with gRPC FUSE and disabling the virtualization framework... no improvement, it's even worse.

Now the process com.docker.hyperkit has the high CPU usage :( . Hope to find a solution one day... it worked at one point, but I don't remember the exact combination.

sneko commented 1 year ago

Radical choice adopted! I had issues with Docker having unstable performance on Windows years ago, and randomly on macOS too since then. I'm done 😄

Just decided to go with Podman (already used with success on some servers to benefit from the rootless feature):

  1. Install "Podman Desktop"
  2. Init & Start
  3. Install podman-compose
  4. Set aliases to keep all your commands and tools:
    alias docker=podman
    alias docker-compose=podman-compose

I can't say yet whether it will be fully compatible in all situations, but at least CPU usage is decent (which is required for a smooth development experience 😉 ).
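A guarded variant of the `docker` alias from the list above might look like this. It is a sketch, not what the commenter used: `use_podman_as_docker` is a hypothetical name, and it defines a shell function rather than an alias, because bash does not expand aliases in non-interactive shells by default.

```shell
#!/bin/sh
# Route `docker` to podman only when podman is actually available.
use_podman_as_docker() {
    if command -v podman >/dev/null 2>&1; then
        # Forward every argument unchanged to podman.
        docker() { podman "$@"; }
    else
        echo "podman not found; leaving docker untouched" >&2
        return 1
    fi
}
```

Calling `use_podman_as_docker` from a shell startup file would have roughly the effect of `alias docker=podman`, and the `docker-compose` alias can stay as written above.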

jorgesisco commented 9 months ago

My issue occurs when I build my container using docker-compose: tasks like com.docker.b, docker-scout and com.apple.Vi keep running and take all the CPU. Restarting Docker fixes it. Setting file sharing to gRPC FUSE does not help; the issue remains. What could it be?

tpoxa commented 9 months ago

Same here.

  • MacBook M1
  • macOS 13.4.1 (22F82)
  • Docker Desktop 4.26.1 (131620)
  • Kubernetes is on
  • Docker's dashboard shows NaN% / 200% (2 cores available)
  • Activity Monitor shows constant 200-230% CPU usage by Docker

Restarting Docker helps until the next morning, so I restart it on a daily basis.

InvisibleProgrammer commented 8 months ago

I have the same issue. Docker restart is the only solution.