docker / for-mac

Bug reports for Docker Desktop for Mac
https://www.docker.com/products/docker#/mac

Operations hang after starting system prune #2501

Open acdha opened 6 years ago

acdha commented 6 years ago

I ran docker system prune to clear out old dev images. That appears to have hung with no feedback: commands like docker ps hang and there's no UI indicating what might be the problem. There were periods where Hyperkit was at 140% CPU, but the running average is close to idle and disk I/O has fluctuated between modest (tens of operations per second) and idle.

It's possible that everything is working correctly but it'd be nice to know whether that's the case and whether I should just leave it running (after about 10 minutes) or if restarting it will be required.

Information

Docker for Mac: version: 17.12.0-ce-mac49 (d1778b704353fa5b79142a2055a2c11c8b48a653)
macOS: version 10.13.3 (build: 17D47)
logs: /tmp/82040D1C-9491-46F7-A12E-B917B287132F/20180124-132812.tar.gz
failure: docker ps failed: (Failure "docker ps: timeout after 10.00s")
[OK]     db.git
[OK]     vmnetd
[OK]     dns
[OK]     driver.amd64-linux
[OK]     virtualization VT-X
[OK]     app
[OK]     moby
[OK]     system
[OK]     moby-syslog
[OK]     kubernetes
[OK]     env
[OK]     virtualization kern.hv_support
[OK]     slirp
[OK]     osxfs
[OK]     moby-console
[OK]     logs
[ERROR]  docker-cli
         docker ps failed
[OK]     menubar
[OK]     disk

Steps to reproduce the behavior

  1. docker system prune
  2. docker ps

acdha commented 6 years ago

I tried to do a factory reset but that also hung. Killing the Docker process caused it to restart with what appears to be a clean install. Again, no UI feedback on what the last step was – the progress bar was over on the 100% side but it stayed there without feedback for at least 10 minutes.

noahd1 commented 6 years ago

I also experienced this behavior. Attempting to use Diagnose & Feedback while docker system prune was running hung as well. Eventually, docker system prune returned at which point the diagnose and feedback results (below) were immediately output.

(Note this was my second attempt at docker system prune -- the first one I gave up on and ended up having to restart Docker)

Docker for Mac: version: 17.12.0-ce-mac55 (18467c0ae7afb7a736e304f991ccc1a61d67a4ab)
macOS: version 10.13.3 (build: 17D102)
logs: /tmp/F4061EC1-D38E-44DE-8F3E-50BE0111AF00/20180309-103211.tar.gz
[OK]     vpnkit
[OK]     vmnetd
[OK]     dns
[OK]     driver.amd64-linux
[OK]     app
[OK]     virtualization VT-X
[OK]     moby
[OK]     system
[OK]     moby-syslog
[OK]     kubernetes
[OK]     env
[OK]     virtualization kern.hv_support
[OK]     moby-console
[OK]     osxfs
[OK]     logs
[OK]     docker-cli
[OK]     disk

calebtote commented 6 years ago

Fyi, mine looked like it was hanging for a while -- I googled around for answers, found this, waited some more, and it finally completed about 20-30 minutes after I initiated the command.

YRM64 commented 6 years ago

A lot of the material I've read indicates one must be very careful using the docker prune commands. In order to remove unused images, "The client and daemon API must both be at least 1.25 to use this command." There are also prune options that might be a better fit than the default, i.e., --all, -a (for removing all unused images, not just dangling ones). This is significant because by default, docker image prune cleans up dangling images only.
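A minimal sketch of checking that API requirement before pruning. The function names api_at_least and check_prune_support are made up for illustration; the sketch assumes API versions are plain major.minor strings and reads them via docker version --format.

```shell
# Return success if version $1 (e.g. 1.40) is >= version $2 (e.g. 1.25).
# Assumes both are "major.minor" with numeric parts.
api_at_least() {
  local a_major="${1%%.*}" a_minor="${1#*.}"
  local r_major="${2%%.*}" r_minor="${2#*.}"
  [ "$a_major" -gt "$r_major" ] ||
    { [ "$a_major" -eq "$r_major" ] && [ "$a_minor" -ge "$r_minor" ]; }
}

# Hypothetical helper: report whether client and daemon both meet API 1.25.
check_prune_support() {
  local client server
  client=$(docker version --format '{{.Client.APIVersion}}') || return 1
  server=$(docker version --format '{{.Server.APIVersion}}') || return 1
  if api_at_least "$client" 1.25 && api_at_least "$server" 1.25; then
    echo "prune supported (client $client, server $server)"
  else
    echo "prune needs API 1.25+ (client $client, server $server)"
    return 1
  fi
}
```

Calling check_prune_support prints one line and returns non-zero when either side is too old. Note that if the daemon is already unresponsive, the docker version call itself may hang.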

kamil-kielczewski commented 6 years ago

MacOS HighSierra (Docker version 18.03.1-ce, build 9ee9f40):

  1. I ran docker images and saw 37 images with a repository and 52 with a <none> repository (so-called "dangling" images): 89 images in total, with an average image size of 500 MB.

  2. To clean this mess up I ran docker system prune -a. After 2 hours of it working (without any console message written by docker), I saw that Hyperkit had read 9 GB and written 8 GB, with CPU at 99% almost all the time.

  3. After those 2 hours (during which I was unable to run docker ps or docker images) I decided to press ctrl+z, but ps and images still didn't work, so after that I restarted docker (from its icon menu in the macOS top bar).

  4. Then I ran docker images and saw that 90% of my images had been deleted (which was great, but why so long! :/ ). I deleted the last 9 images with docker rmi xxxxx.

My proposal: let's make docker system prune write some feedback, like a percentage, during execution.

For now, I think it is a good idea to interrupt docker system prune (and reset) after each hour and check the status using docker images.

jleppert commented 6 years ago

Isn't it great when you're stuck waiting at the command line, wondering if this thing has become broken again? Struggling to understand how traversing a tree and deleting some files could be taking so long... Then you check your running processes and wonder to yourself...if the lost soul who wrote this thing couldn't be bothered to output some status text to the user, did he correctly implement signals to clean up and not leave itself in an inconsistent state on restart? The world will never know.

ghost commented 6 years ago

This is awful. I've wasted half a day waiting for it to complete. I eventually cancelled it so hopefully docker will clean up ok.

mwcm commented 6 years ago

ran into this today too, system prune and image prune both resulted in the same behavior

briancaffey commented 6 years ago

If you have lots of untagged images, run the following command:

docker rmi $(docker images -f "dangling=true" -q)

There could potentially be lots of these images, and when you run:

docker system prune

It doesn't show the images as they are being untagged and deleted, so it seems like the docker command is hanging.

From this docker forum post

nickgrealy commented 6 years ago

Just an update to @briancaffey's solution...

Sounds silly, but for me things "seemed" to work better, by copying and pasting a list of docker rmi commands into my console.

I used this to generate the list:

docker images -f "dangling=true" -q | xargs -I {} echo docker rmi -f {}

(I got visual feedback on progress, I could see which image was holding things up, etc etc)
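The same idea can be wrapped in a small function that removes dangling images one at a time and prints a counter after each, so a slow image is visible immediately. This is only a sketch; the name prune_dangling_with_progress is made up for illustration, and it assumes image IDs contain no whitespace.

```shell
# Remove dangling images one by one, printing "[n/total]" before each removal.
prune_dangling_with_progress() {
  local ids total count=0 id
  ids=$(docker images -f "dangling=true" -q)
  if [ -z "$ids" ]; then
    echo "No dangling images found"
    return 0
  fi
  # wc pads its output with spaces on macOS, so strip them.
  total=$(printf '%s\n' "$ids" | wc -l | tr -d ' ')
  for id in $ids; do
    count=$((count + 1))
    echo "[$count/$total] removing $id"
    docker rmi "$id" || echo "  could not remove $id (skipping)"
  done
}
```

Run prune_dangling_with_progress after pasting the function into a shell. Unlike the echo variant above, this executes each docker rmi directly and does not pass -f, so images still in use are reported and skipped.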

docker-robott commented 5 years ago

Issues go stale after 90d of inactivity. Mark the issue as fresh with /remove-lifecycle stale comment. Stale issues will be closed after an additional 30d of inactivity.

Prevent issues from auto-closing with an /lifecycle frozen comment.

If this issue is safe to close now please do so.

Send feedback to Docker Community Slack channels #docker-for-mac or #docker-for-windows. /lifecycle stale

kamil-kielczewski commented 5 years ago

/remove-lifecycle stale

kamil-kielczewski commented 5 years ago

/lifecycle frozen

Vitiell0 commented 5 years ago

Just wanted to report that I ran into this same problem yesterday and abandoned docker system prune with CTRL + C after letting it run for about an hour with no feedback. This somehow corrupted Docker on my entire system to the point where the Docker application would not open or launch at all.

I spent the good part of a day trying to get Docker to work again; completely uninstalling Docker and reinstalling, restarting, etc. I eventually deleted all the containers and manually removed every file related to Docker and only then did a fresh install get Docker working again.

Janpot commented 5 years ago

For me, this is just really, really slow (running for hours) with no user feedback of whether it's stuck or not. If I run docker system df periodically while the pruning is happening, I see available space slowly increasing. This command also takes a long time to complete while pruning is running. It would be really helpful if it logged some more output, so that I know it's not stuck.
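A rough sketch of that periodic check, meant to be run from a second terminal while the prune is in progress. The function name watch_docker_df and its two arguments (seconds between polls, number of polls) are made up for illustration.

```shell
# Poll "docker system df" a fixed number of times, pausing between polls.
# $1: seconds between polls (default 60); $2: number of polls (default 10).
watch_docker_df() {
  local interval="${1:-60}" rounds="${2:-10}" i=1
  while [ "$i" -le "$rounds" ]; do
    echo "--- poll $i of $rounds at $(date '+%H:%M:%S') ---"
    docker system df || echo "docker system df failed (the daemon may be busy)"
    if [ "$i" -lt "$rounds" ]; then
      sleep "$interval"
    fi
    i=$((i + 1))
  done
}
```

For example, watch_docker_df 300 12 polls every five minutes for about an hour. Since docker system df is itself slow while pruning runs, each poll may take much longer than the interval.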

Bellk17 commented 5 years ago

Same Issue

dehzhas commented 5 years ago

Same issue. It seems to me that it's pretty absurd how long this takes. I have to kill and restart often, often with very few images on my system. Sometimes I will have to do a docker reinstall because it can't recover. I tend to run prune very infrequently because I know it's going to take a long time. There is no way to tell if it is stuck or just slow. As far as I can tell, it does get stuck sometimes.

gokcan commented 5 years ago

I cannot believe that this feature is marked as frozen.

kamil-kielczewski commented 5 years ago

@gokcan read this https://github.com/docker/for-mac/issues/2501#issuecomment-460875712

gokcan commented 5 years ago

@kamil-kielczewski Is this issue under active development? The frozen label makes it a bit confusing.

kamil-kielczewski commented 5 years ago

@gokcan - I don't know (I don't develop docker). I only read the comment in #2501 and marked the issue frozen to prevent it from closing.

Tails commented 5 years ago

Same issue on 18.09. It already mentioned having deleted containers and networks. No idea what it is doing now, if anything.

dehzhas commented 5 years ago

I have some additional information based on more recent observations. Based on a Stack Overflow answer, I have been using the following script instead of the actual prune command:

echo "Pruning containers"
docker rm $(docker ps -f status=exited -aq)

echo "Pruning images"
docker rmi $(docker images -f "dangling=true" -q)

echo "Pruning volumes"
docker volume rm $(docker volume ls -qf dangling=true)

This script will usually complete for me. However, there seem to be a few issues with this approach:

  1. At first, pruning images starts off fast. After a few dozen or so it starts to slow down. It goes in little spurts from then on out, with a pretty slow delete rate overall.

  2. It pegs my CPU at 800% (I have docker configured to be able to use all 8 of my cores and 10GB of my memory. I develop containers that do hefty lifting and want to maximize resources available to docker).

  3. When it eventually finishes, docker stays pegged at 800%. Containers continue to function as normal (if a bit slow) but so far it has always become unresponsive after a short while. I have to do a full stop/start of docker to continue. The stop/start process can take a long time. Approximately 20-30% of the time I have to force kill the com.docker.hyperkit process to get it to shut down.

My guess is that the cause of these behaviors is the same as for the prune command. Running either this script or the prune command is the only time in the normal course of my development that docker becomes unresponsive. I've been running docker on my Mac for over 2 years and the behavior seems pretty consistent across all the versions I've used. I tend to keep docker on the current stable release and have 18.09.2 installed right now.

Hopefully this information is helpful.
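A more defensive variant of the three-step script, as a sketch only: each docker rm, docker rmi, and docker volume rm is skipped when its list is empty (the plain $(...) form errors out when it has no arguments), and items are removed one at a time so the output shows where it slows down. The function names prune_step and prune_all are made up for illustration. This does not address the CPU or unresponsiveness problems described above.

```shell
# Remove each ID in a list with the given docker subcommand.
# $1: label for output; $2: whitespace-separated IDs; rest: docker subcommand.
prune_step() {
  local label="$1" ids="$2" id
  shift 2
  echo "Pruning $label"
  if [ -z "$ids" ]; then
    echo "  nothing to remove"
    return 0
  fi
  for id in $ids; do
    echo "  removing $id"
    docker "$@" "$id" || echo "  failed to remove $id"
  done
}

# Same three steps as the script above: exited containers, dangling images,
# dangling volumes.
prune_all() {
  prune_step containers "$(docker ps -f status=exited -aq)" rm
  prune_step images "$(docker images -f "dangling=true" -q)" rmi
  prune_step volumes "$(docker volume ls -qf dangling=true)" volume rm
}
```

Run prune_all after pasting both functions into a shell.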

louagej commented 5 years ago

Just add --force after the prune command and the action will be over before you know it: docker image prune -a --force

maciej-gurban commented 5 years ago

Experienced the same issue as @Vitiell0. Tried docker system prune --all, no feedback for a long time, terminated the task. Restarted it without --all to see whether it could at least remove something successfully. Hanging again. Tried restarting docker but the whole application hung. Terminated all Docker processes, and tried starting Docker anew but this time the daemon doesn't even start (no icon shows up in top bar on Mac).

I guess it's back to nuking docker from orbit like in the good ole' days ~2 years ago (removing the image file) where this was the only thing that actually guaranteed to get things working without 2h+ debugging. My Docker has access to 3 cores and 6GB RAM.

~Edit: Looks like Docker.qcow2 doesn't exist anymore, so it seems necessary to remove and reinstall Docker nowadays instead.~ Was wrong about this one, for me the file simply wasn't indexed by locate

Edit2: Uninstalling and installing Docker again didn't help. I also needed to purge all Docker files for daemon to start again. Boils down to removing the following directories:

~/Library/Application Scripts/com.docker.helper
~/Library/Caches/com.docker.docker
~/Library/Containers/com.docker.docker
~/Library/Containers/com.docker.helper

ChefAndy commented 5 years ago

😐 Getting some feedback when using this command would be great.

amajedi commented 5 years ago

I've run into this in the past and just ran into it now after running docker system prune.

cannontrodder commented 5 years ago

Surprised this is not being addressed especially when the following just works:

docker rmi $(docker images -q)

dentarg commented 5 years ago

macOS 10.14.6 (18G103) and Docker Desktop 2.1.0.3 (38240) (Engine: 19.03.2).

I ran docker container prune -f, Docker started eating lots of CPU, nothing happened for hours, but I thought I would give it some time. Other docker commands would just hang. After 21h 20m I did ^C. Activity Monitor still reported the com.docker.supervisor process eating ~300% CPU and the Docker process to be "Not Responding". Also the Docker menu bar icon started beachballing (when I hovered over it).

I force-quit the Docker process, and started Docker. Now docker commands work again. The stopped containers have not been removed.

Vitiell0 commented 4 years ago

@mikeparker is docker for mac still supported?

dissolved commented 4 years ago

I can't believe this is still an issue...

maciej-gurban commented 4 years ago

Still an issue. Happened to me just yesterday and requires a full computer restart to calm docker down from using all the available CPU

tvongeldern commented 4 years ago

This still happens for me. I usually just hit ctrl + c a few seconds after it hangs

polymorpher commented 4 years ago

I have the same issue. Although I can see disk space slowly gets freed up, the lack of any indication for when the pruning process will finish is really bothersome. The command docker image prune locks down everything related to docker and prevents any other commands from being executed. It's very disruptive.

bradfox2 commented 4 years ago

Please fix this issue.

fabu-khiarah commented 4 years ago

+1

manuelportela commented 4 years ago

+1

nzcodarnoc commented 4 years ago

I'm having this issue with

macOS 10.15.4

Client: Docker Engine - Community
 Version:           19.03.8
 API version:       1.40
 Go version:        go1.12.17
 Git commit:        afacb8b
 Built:             Wed Mar 11 01:21:11 2020
 OS/Arch:           darwin/amd64
 Experimental:      false

Server: Docker Engine - Community
 Engine:
  Version:          19.03.8
  API version:      1.40 (minimum version 1.12)
  Go version:       go1.12.17
  Git commit:       afacb8b
  Built:            Wed Mar 11 01:29:16 2020
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          v1.2.13
  GitCommit:        7ad184331fa3e55e52b890ea95e65ba581ae3429
 runc:
  Version:          1.0.0-rc10
  GitCommit:        dc9208a3303feef5b3839f4323d9beb36df0a9dd
 docker-init:
  Version:          0.18.0
  GitCommit:        fec3683

samirsss commented 4 years ago

+1 to having more feedback, since on both my Mac and my Linux (Ubuntu 16.04) server it often feels like this has just hung. Thanks!

grzesikluk commented 3 years ago

Same here on macOS 10.15.7 with docker version 3.0.1 (50773): it hangs after docker container prune.

ashkan-pm commented 3 years ago

Ubuntu 20.04 docker system prune reliably and completely crashes the system.

acdha commented 3 years ago

> Ubuntu 20.04 docker system prune reliably and completely crashes the system.

Since this is an issue for the Mac Docker desktop app, that sounds like it should be a new issue — and since it has a reliable crash you should be able to get logs to further narrow down the issue. One of the challenges on this issue is that it doesn't have a simple root-cause indicator.

devondragon commented 3 years ago

MacOS 11.2.3 running the latest version of docker. Ran "docker image prune" this morning, it's been using 100-150% CPU for 6+ hours, no output or feedback. Trying to run any docker command in another window hangs. Docker Desktop App shows no containers or images. I am about to kill the prune task and see if I've lost all my stuff or not.... Sample attached.

This seems like a pretty serious issue to have open for years:( Even if root cause(s) are hard to find, at least adding in some basic progress/output/info to the command seems like a simple improvement...

Sample of com.docker.hyperkit.txt

dickson-tec commented 3 years ago

I just gave it a bit of patience and after 20 minutes i reclaimed 68.6GB of disk space

ChefAndy commented 3 years ago

> I just gave it a bit of patience and after 20 minutes i reclaimed 68.6GB of disk space

Maybe it was silently or unintentionally fixed? Maybe I'll give it another shot.

The several times I tried it, I let it run for several hours before sending it a ctrl-c, and doing that hung the whole system. I doubt I had more than 100GB in excess images on that 500GB laptop hard drive. I wonder if this has something to do with the amount of available disk space? I was probably low-ish on disk space which is why I was running the task to begin with. I could see running out of disk space having an effect like this if they were doing some sort of disk caching or whatever— I don't think I checked.

Either way, I personally wouldn't make a utility that ran for 1 minute, let alone hours, without a status update. Jeez, I even do that for utilities I write for myself just so I know it's working as intended. At the very least there should be a warning that there won't be a status update for the long-running task, if giving real-time updates isn't feasible for some reason.

Also, if it's going to kill your system when you try to exit it, that should be something the user should be informed of and able to opt out of before you pass the point of no return. It's not like you're formatting a partition here— from a user perspective, it seems like it should be about as intensive as emptying the trash on your OS rather than some involved process that takes forever to complete.

ebuildy commented 3 years ago

This is problematic when using a CI tool such as GitLab: it causes a timeout (or we lose the SSH connection).

What about a quick-win solution: "add a command flag to specify how many images/volumes can be deleted" or "add a timeout"?

That way we can run the prune command multiple times without hitting any timeout.
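Until such a flag exists, the "limit how many are deleted" idea can be approximated outside of prune. A minimal sketch, assuming only dangling images matter; the function name prune_batch and its single argument (maximum images per run, default 20) are made up for illustration and are not a docker option.

```shell
# Remove at most $1 dangling images (default 20), then stop.
# Intended to be called repeatedly, e.g. once per CI job.
prune_batch() {
  local limit="${1:-20}" ids id removed=0
  ids=$(docker images -f "dangling=true" -q | head -n "$limit")
  for id in $ids; do
    echo "removing $id"
    if docker rmi "$id" >/dev/null; then
      removed=$((removed + 1))
    fi
  done
  echo "removed $removed image(s) in this batch"
}
```

Each call to prune_batch 10 does a bounded amount of work, so a job that times out loses at most one batch. It does not put a time limit on an individual docker rmi call.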

timakamystery commented 3 years ago

Same for docker system prune -f: it does the job, but as stated above it's not intuitive and takes a long time.

ndep@ndepbuild0:/var/lib/docker$ docker version
Client:
 Version:           20.10.2
 API version:       1.41
 Go version:        go1.13.8
 Git commit:        20.10.2-0ubuntu1~18.04.3
 Built:             Fri Jul 23 21:07:37 2021
 OS/Arch:           linux/amd64
 Context:           default
 Experimental:      true

Server:
 Engine:
  Version:          20.10.2
  API version:      1.41 (minimum version 1.12)
  Go version:       go1.13.8
  Git commit:       20.10.2-0ubuntu1~18.04.3
  Built:            Fri Jul 23 19:36:13 2021
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.5.2-0ubuntu1~18.04.2
  GitCommit:
 runc:
  Version:          1.0.0~rc95-0ubuntu1~18.04.2
  GitCommit:
 docker-init:
  Version:          0.19.0
  GitCommit:

sneko commented 2 years ago

Still the same in 2022 🤔

SakeebHossain commented 1 year ago

This was shared earlier in this thread but including the -f was what finally did it for me.

docker rmi $(docker images -q) -f

emmeowzing commented 1 year ago

I'm gonna bake this issue a cake for its 6th birthday:D