Closed gaetancollaud closed 3 years ago
I agree that this is not a satisfactory situation - out of curiosity, if you do a factory reset, do you get back to a working system?
Thanks @friism for the workaround. It worked! I just lost my settings, but that was expected and it didn't take me long to configure them again.
I saved the hyper-v disk on an external drive if someone is interested in making more investigations (I can send it by dropbox or other).
Next time run docker image prune -f to remove unused images.
@hinell As said in my ticket description, I already prune everything. The -f argument only skips the confirmation prompt.
@gaetancollaud Ohh yep, sorry, I just missed that part. Seems like a memory leak.
I am encountering this limitation as well. For me the issue is that we are working with large Oracle images populated with test data. Why is it not possible to extend the disk on the MobyLinuxVM? If there's no UI for it, at least give us the ability to access the VM to extend the disk!
@vmarinelli I agree. The inability to access the native MobyVM is frustrating, so I'm forced to use a stand-alone virtual machine in particular cases.
Hi @vmarinelli @hinell We do have an open enhancement request around the hard coded MobyLinuxVM size, but I don't have an ETA on when that will make it onto the roadmap. There are a few options we're looking at. For the time being, we do have a manual way to extend the MobyLinuxVM. Manual changes made to these scripts will be overwritten during upgrades or reinstalls though, so please keep that in mind.
Stop Docker
In Windows Explorer, go to C:\Program Files\Docker\Docker\Resources and edit MobyLinux.ps1.
It's in Program Files, so this will require admin privileges to edit. Find line 86:
$global:VhdSize = 60 * 1024 * 1024 * 1024
Change the 60 to the number of GB you want allocated to the drive.
Restart Docker
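For reference, a quick sanity check of the value on that line. The numbers below are just the arithmetic behind the setting, not anything Docker-specific:

```shell
# The VhdSize value on line 86 is a byte count: GB * 1024^3.
# Sketch of the arithmetic for a 100 GB disk:
gb=100
vhd_size=$(( gb * 1024 * 1024 * 1024 ))
echo "$vhd_size"  # 107374182400
```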
@jasonbivins Thanks for the tip.
But it seems like there is a memory leak somewhere. Increasing the disk is no use, because we will reach the physical disk's maximum size at some point.
Are you interested in having the Hyper-V disk that I saved just before my factory reset?
Even after editing MobyLinux.ps1 with a VhdSize of 100GB and restarting Docker, nothing changed. Inspecting the VHD from Hyper-V correctly says "Current File Size 60GB" and "Maximum Disk Size 100GB", but every docker container reports 100% disk usage with a size of 55G and isn't able to create any file. I have also pruned about 2GB of containers, but the space left is always the same. I don't want to reset to factory defaults nor lose my downloaded images. How could anyone use Docker for Windows in production if you need to reset it from time to time to restore leaked space?
@jasonbivins Thanks very much for that workaround. I was able to get the disk size increased, but I had to add one additional step to your instructions to make it work. After restarting Docker, I had to click "Reset to Factory Defaults". Then when the MobyLinuxVM was rebuilt, it came up with the larger volume.
I made a post to the Forums to document this workaround and linked it back to this issue.
@gaetancollaud While I have encountered the "Out of space on device" error repeatedly while trying to restore large Oracle containers to my D4Win install, I haven't encountered the issue of not being able to free up space even after deleting images and containers and running prune. So for me, the workaround provided addresses my issue. I will keep an eye out for this now that I am able to work, and will post here if I encounter the same behavior.
@Jamby93 TIP: use docker save > foo.img and docker load < foo.img to export and import your images, respectively, if you don't want to download them again after resetting.
@hinell Yes, and also back up modified containers, volumes, registry and swarm configuration, networks and so on. That's simply unaffordable in a production environment (let's talk about a CD infrastructure that builds tons of images a day). All for a bug that simply makes no sense. So far nobody has really proposed a reason why this bug could happen, instead of simply working around it and forgetting it. I prefer to understand why an issue is occurring instead of only fixing short-term specific problems that will come back again one day or another. I'm willing to help gather information if that could help address the issue.
I have this as well with larger image content files.
Here is an easy way to retest:
A bash script to create 2x8GB files
#!/bin/bash
dd if=/dev/zero of=file1.dat bs=1M count=8000
dd if=/dev/zero of=file2.dat bs=1M count=8000
Dockerfile:
FROM ubuntu:latest
WORKDIR /tmp
ADD . /tmp
Try to build image:
docker build . -t foo
Output:
Sending build context to Docker daemon 16.78GB
Error response from daemon: Error processing tar file(exit status 1): write /file2.dat: no space left on device
Analysis
Docker version:
Client:
 Version: 17.06.1-ce
 API version: 1.30
 Go version: go1.8.3
 Git commit: 874a737
 Built: Thu Aug 17 22
 OS/Arch: windows/amd64
Server:
 Version: 17.06.1-ce
 API version: 1.30 (minimum
 Go version: go1.8.3
 Git commit: 874a737
 Built: Thu Aug 17 22
 OS/Arch: linux/amd64
 Experimental: true
Docker info:
Containers: 267
 Running: 0
 Paused: 0
 Stopped: 267
Images: 150
Server Version: 17.06.1-ce
Storage Driver: overlay2
 Backing Filesystem: extfs
 Supports d_type: true
 Native Overlay Diff: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
 Volume: local
 Network: bridge host ipvlan macvlan null overlay
 Log: awslogs fluentd gcplogs gelf journald json-file logentries splunk syslog
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 6e23458c129b551d5c9871e5174f6b1b7f6d1170
runc version: 810190ceaa507aa2727d7ae6f4790c76ec150bd2
init version: 949e6fa
Security Options: seccomp
 Profile: default
Kernel Version: 4.9.41-moby
Operating System: Alpine Linux v3.5
OSType: linux
Architecture: x86_64
CPUs: 5
Total Memory: 22.99GiB
Name: moby
ID: MCC6:TLLN:GJGF:FFS7:XBYR:2JPF:OYK3:4CWU:ZHWB:HDQY:S3VP:TCZ2
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): true
File Descriptors: 15
Goroutines: 26
System Time: 2017-09-05T06:05:02.8221668Z
EventsListeners: 0
Registry: https://index.docker.io/v1/
Experimental: true
Insecure Registries: 127.0.0.0/8
Live Restore Enabled: false
I managed to upgrade the disk size in Hyper-V GUI.
@cybertk That did not work properly for me. Sure, the MobyLinuxVM reported a larger "max disk size" in Hyper-V manager afterwards, but the overlay inside of containers that were then created was still 60GB max:
[root@db2server /]# df
Filesystem 1K-blocks Used Available Use% Mounted on
overlay 61664044 3899176 54602808 7% /
It wasn't until I modified MobyLinux.ps1 and "reset to factory default" as described by @jasonbivins and @vmarinelli that the overlay was increased.
@jasonbivins, what's the status of the enhancement mentioned at "https://github.com/docker/for-win/issues/1042#issuecomment-326416863"?
@NdubisiOnuora This has been added to the edge channel. We are also working on some improvements to automatically reclaim space, but I don't have an ETA for those.
Having the same issue on Windows 10. I am using the oracle12c image hosted at https://hub.docker.com/r/sath89/oracle-12c/
Every time I do a commit or start the container again, the MobyVM file keeps increasing in size until it gets above 60GB. After that you cannot commit, save, load or even start a container.
Issues go stale after 90d of inactivity.
Mark the issue as fresh with a /remove-lifecycle stale comment.
Stale issues will be closed after an additional 30d of inactivity.
Prevent issues from auto-closing with a /lifecycle frozen comment.
If this issue is safe to close now please do so.
Send feedback to Docker Community Slack channels #docker-for-mac or #docker-for-windows. /lifecycle stale
/remove-lifecycle stale
problem is still present on:
Client:
Version: 18.03.1-ce
API version: 1.37
Go version: go1.9.5
Git commit: 9ee9f40
Built: Thu Apr 26 07:12:48 2018
OS/Arch: windows/amd64
Experimental: false
Orchestrator: swarm
Server:
Engine:
Version: 18.05.0-ce
API version: 1.37 (minimum version 1.12)
Go version: go1.10.1
Git commit: f150324
Built: Wed May 9 22:20:42 2018
OS/Arch: linux/amd64
Experimental: false
What is the status of stopping the disk leak? This should be a rather high-priority issue given the impact, right?
Still no fix? I'm running version 2.0.0.0-win81 (29211) from "stable". Engine is 18.09.0 I have 61.42GB used of 512GB but I can't even run a docker pull command without getting the "no space left on device" error. Restarting docker gives some relief but not for long.
Actually - Restarting docker has just dropped all of the images! Could this be an indication of corruption? This means I have to rebuild them all which takes a very long time so this is not an adequate workaround.
I'm running Version 2.0.0.0-win81 (29211) from "stable". Engine is 18.09.0.
I just resized from 60GB to 200GB via the "Advanced" settings, and everything looks fine for me. Containers are seeing the new space.
@jasonbivins what is the solution on a cloud9 linux environment where I don't have access to the gui, just bash.
FWIW: I debugged this issue using docker run --net=host --ipc=host --uts=host --pid=host -it --security-opt=seccomp=unconfined --privileged --rm -v /:/host alpine /bin/sh followed by chroot /host. You can then check out the filesystem of the Moby Linux host. All relevant data is in /var/lib/docker. I traced my issue to SQL Server transaction logs that kept filling up the volumes.
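Building on that approach, here is a small sketch for ranking what is actually taking the space once inside the host. The /var/lib/docker default is an assumption about the Moby VM layout; override ROOT for other setups:

```shell
#!/bin/sh
# Rank the five largest entries under the Docker data root.
# ROOT defaults to /var/lib/docker (assumed Moby VM layout).
ROOT="${ROOT:-/var/lib/docker}"
if [ -d "$ROOT" ]; then
  du -sk "$ROOT"/* 2>/dev/null | sort -rn | head -n 5
fi
```

The largest directory is usually overlay2 (image and container layers) or volumes; descend into whichever tops the list and repeat.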
still happening in 2019
+1 still happening
I ran into this on OSX, docker 2.0.0.3. I was able to resolve it by restarting docker and quickly opening preferences, going to disk settings and increasing disk size from 64GB to 320GB and hitting apply. That startup failed after a minute but the next restart worked correctly.
I'm surprised this isn't handled more gracefully. I feel bad for the people that have blown away their images to resolve this issue.
On my Mac, removing unused docker images solved the issue for me: docker image prune -f
+1
What did help with my case was shortening the path of the "disk image location". I changed it to "C:\Docker\DockerDesktop.vhdx". Removing the disk image did not help, changing the path did the trick...
+1 Still Happening
docker system prune -a -f is not fixing it. I am running docker Server Version: 18.09.5, Storage Driver: overlay2, Backing Filesystem: xfs, Supports d_type: true, and I am getting the error "no space left on device":
docker: Error response from daemon: error creating overlay mount to /opt/app/k8s-docker/overlay2/bfbabf0ea0b47485fbabc0d5c43d5b5d72d247b7da42c72d5cd36ead59ac6cae-init/merged: no space left on device. I am using /opt/app for writing docker data and it has a lot of space:
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/app_vg-app_lv 3.1T 352G 2.8T 12% /opt/app
[enabler@zlp44493 etc]$ df -ih /opt/app
Filesystem Inodes IUsed IFree IUse% Mounted on
/dev/mapper/app_vg-app_lv 313M 148K 313M 1% /opt/app
Specifications:
Docker version 17.03.1-ce-rc1, build 3476dbf
Windows 10 Pro
Docker Desktop Version 17.03.1-ce-rc1-win3 (10625)
Channel: edge
Had the same issue with this Dockerfile
FROM joyzoursky/python-chromedriver:3.7-alpine3.8-selenium
RUN apk update && apk add gcc libc-dev g++ libffi-dev libxml2 libffi-dev unixodbc-dev mariadb-dev postgresql-dev freetds-dev vim
RUN pip install requests pyodbc selenium
ADD odbcinst.ini ./etc/
ADD annualinsightapis.py ./home/
ADD setup.py ./home/
ADD crawler.py ./home/
ADD downloader.py ./home/
ADD main.py ./home/
CMD ["python", "./home/main.py"]
Which made no sense, since I had deployed a container 5 minutes ago with the same configuration. The output it gave me was:
ERROR: Failed to create usr/share/vim/vim81/doc/options.txt: No space left on device
ERROR: vim-8.1.1365-r0: No space left on device
Which also made no sense, because I had space for it.
What worked for me was changing the memory in the advanced options of Docker Desktop (before/after screenshots omitted). Docker restarted after that and I built my image successfully. Even after changing the size back to the original value, building the image still worked.
-- Edit: added channel
is this related to https://github.com/docker/for-win/issues/244 ? it also discusses some workarounds
Is there a clear and definitive solution to this?!
Over in issue #244, @ThisIsABug just posted an interesting set of PowerShell commands that shrank my DockerDesktop.vhdx considerably:
docker system prune -a -f
net stop com.docker.service
taskkill /F /IM "Docker Desktop.exe"
stop-vm DockerDesktopVM
Optimize-VHD -Path "C:\Users\Public\Documents\Hyper-V\Virtual hard disks\DockerDesktop.vhdx" -Mode Full
start-vm DockerDesktopVM
start "C:\Program Files\Docker\Docker\Docker Desktop.exe"
net start com.docker.service
I did have to change the path in the Optimize-VHD step, as my DockerDesktop.vhdx is in C:\ProgramData\DockerDesktop\vm-data
Do not forget to execute my solution in an elevated console :) PS: it may not be the finest solution, but I can just execute it once a month or two and that's it, no further worries. PPS: you omitted the last line; it actually had a purpose too: when you just copy & paste this into a console, the last line will not execute automatically, so you have to hit enter one more time when the rest of the script is finished (irrelevant if you put it in a file, I know ;) ). PPPS: you'd better put this in a file. I still have hopes that this will be fixed soon, but who am I talking to? This issue has been tracked for 3 years ...
docker image prune -f
Total reclaimed space: 40.66GB
🤣
A fix would be nice, as someone pointed out. If this is not a serious issue, then what is?
It makes Docker unusable in an enterprise environment.
This issue is still occurring in 2020.
What are the actionable steps to:
Since this issue makes docker for windows unusable, and since it has been open for many years, can any of the tagged contributors please make a comment? Is docker for windows actively maintained? @mikeparker @rn @akimd @samoht @macdja38 @duglin @dave-tucker @akshaybabloo @MagnusS
Links to hundreds of people having this same issue:
The workaround of a "nuclear option" docker wipe every time this issue is encountered cripples productivity and is not a solution.
Recent Diagnostic ID (mine): CE47E764-D69E-4341-8037-4D9F978ECF6A/20200122200602
Other existing Diagnostic IDs:
A similar issue exists for Docker for Mac: https://github.com/docker/for-mac/issues/371#issuecomment-242047368
Let me try to shed some light on this.
A VM disk is like any other disk. If it's full, you can't put anything else on it. You can fill up your disk in many ways, just like any other disk. This isn't a bug with docker; it's simply your containers, images and files/volumes filling up the disk (e.g. see https://github.com/docker/for-win/issues/1042#issuecomment-326609525 "I haven't encountered the issue of not being able to free up space even after deleting images and containers and running prune", https://github.com/docker/for-win/issues/1042#issuecomment-485698417 "I traced my issue to sql server transaction logs that kept filling up the volumes", and https://github.com/docker/for-win/issues/1042#issuecomment-517333153 "remove unused docker images solved the issue for me"). Also see pretty much every accepted answer in the stack overflow questions in the comment above. They are all effectively 'free up disk space by deleting things' or 'make your disk bigger'.
The VM disk can be filled with multiple different types of things, the most likely is images, as per the OP. The 2nd most likely thing is a folder on disk which a container has been spamming large files (e.g. logs) to. This is a 'volume'. Other things on disk include stopped containers themselves and networks.
If you want to clean your disk from the command line without checking what you're deleting, you can use docker system prune; this command also has many options, see https://docs.docker.com/engine/reference/commandline/system_prune/
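As a gentler first step than pruning blindly, the CLI can already break down what is occupying the disk. A sketch (flags as documented for current docker versions; output will vary):

```shell
# Summarize disk usage by images, containers, local volumes and build cache:
docker system df
# Verbose per-image / per-volume breakdown:
docker system df -v
# Then prune selectively; --volumes extends the prune to unused local volumes:
docker system prune --volumes
```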
Right now, we don't have any GUI to easily show you all the various things stored on your VM disk, so it's hard to tell what exactly your system has dumped there. This is something we are exploring and may come in the future (in particular, listing your local images and being able to delete them from the GUI has been discussed a number of times). However, if you are capable with a terminal, it's possible to investigate right now by logging into the VM and exploring the file system like any other Linux file system.
An image we often use to do this directly is docker run --privileged --pid=host -it justincormack/nsenter1; you can then use du -hs on various directories to find out what's taking up all the space. (This comment https://github.com/docker/for-win/issues/1042#issuecomment-485698417 does a similar thing.)
e.g.
/ # du -hs /var/lib/docker
860.0K /var/lib/docker
(fresh install)
/ # du -hs /var/lib/docker
70.9M /var/lib/docker
after pulling ubuntu
It's also possible there is a bug in docker taking up all the hard drive space, but given the high number of people (understandably) struggling with disk space management it's hard to conclude there is any bug here. If anyone has gone through their drive in detail as above and still thinks there is a bug causing the disk to fill up, please let me know.
@cjancsar I don't seem to have this issue at all. What hypervisor are you using? Is it Hyper-V or VirtualBox?
Thanks for the response @mikeparker, we really appreciate your time.
I am really motivated to get to the bottom of this issue, as it has been crippling our team's productivity the past few weeks and is extremely frustrating. I just want to be able to program in peace :( It really seems to be potentially more than just a simple running-out-of-space issue (based on the numerous highly thumbed-up GitHub issues), and if it is a running-out-of-space issue, it is a very bad user experience at the moment: I have a 2 TB HD, why can't I just allocate all of that space to make the problem go away?
Can you please examine the diagnostics (CE47E764-D69E-4341-8037-4D9F978ECF6A/20200122200602) that we included to confirm that the VM drive was indeed at 100% capacity when the diagnostic was generated? I have no way to interpret those files, and have extremely limited information/expertise to reach a conclusion on the issue. The Docker Desktop UI always shows significant remaining capacity when this issue is encountered, and a docker system df also shows a lot of space remaining. Below is an image of the VM drive space when the issue is occurring:
When we encounter this issue every other day, this is the typical workflow that recreates it:
1. docker system prune --force --all --volumes to clear the system.
2. docker-compose build or docker-compose up --build (which take from 40 minutes to 2 hours or more).
3. docker-compose down followed by a docker-compose build [THAT-UPDATED-SERVICE-NAME], then we bring the stack up again with docker-compose up.
4. docker-compose down
5. docker-compose build [SOME-SERVICE]
6. docker-compose up
7. ... we do some work
8. docker-compose down
9. docker-compose build [SOME-SERVICE]
10. docker-compose up -> FAILS: No space left on device error
11. ... we get sad
12. Nuclear option: rebuild everything again -> 1 hour of waiting ...
What could be the cause of this? The majority of the images are unchanged and are just going down and up. We are just using docker and docker-compose in what we assume is a normal way, but this normal way (for just 10 or so services) is causing extreme frustration. Are we using docker wrong? Are we not supposed to bring services down and up frequently? Are these down'd and up'd containers being orphaned or duplicated somehow?
We would also be open to contracting some time for someone with significant experience (like yourself, @mikeparker) to briefly review our stack and workflow to see if it is a usage issue. Let me know if there is expertise available for that.
@cjancsar thanks for the detailed response.
Your diagnostic does indeed show 19.4GB of space left. I have chatted with one of the other engineers, and there is another possibility: your system may have run out of inodes. This would likely result in the same error message.
To confirm if this is the case can you please run the following command and paste the output?
docker run -it --privileged --pid=host justincormack/nsenter1 /bin/df -i /var/lib
If this is indeed the problem, this is due to too many small files on disk, possibly copying lots of small files into images then not deleting them after building.
Interestingly, removing images will fix the problem, so it's possible other users also have the same issue and we have incorrectly diagnosed this as 'images taking up all the space' when in fact the issue could be 'files inside images taking up all the inodes'.
That said, you probably don't want to remove your base images because as you've seen, this will mean redownloading and rebuilding unnecessarily. You want to prune the images with your source code in but nothing earlier in your build pipeline.
With many languages the best practice is to either delete all your source code after compiling, or copy the build artifacts into a new image, ensuring the source code doesn't get copied into the resulting image. You can then delete the source code image, freeing up space and inodes.
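That pattern is what multi-stage builds are for. A minimal sketch in the style of the Dockerfiles earlier in this thread (image names and paths here are illustrative, not taken from anyone's actual setup):

```
# Build stage: source code and toolchain live only here.
FROM node:12 AS build
WORKDIR /src
COPY . .
RUN npm ci && npm run build

# Runtime stage: only the build artifacts are copied forward, so the
# final image carries neither the sources nor the build-time files
# (and their inodes) from the first stage.
FROM node:12-slim
WORKDIR /app
COPY --from=build /src/dist ./dist
CMD ["node", "dist/main.js"]
```

Only the final stage's layers end up in the shipped image; the build stage can be pruned freely.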
@mikeparker Yes, it looks like I am indeed out of inodes when this issue occurs. I have seen this mentioned in a few of the SO issues I read, but I don't know how to convert this into a resolution, or find the reason that we are hitting the inode limit when other developers do not:
I have, however, had a suspicion that the way in which we do our hot-reloading volume mounting contributes to this issue (somehow). Our anonymous volumes don't seem to get cleaned up between up/down cycles, but I don't have the expertise to dig deeper into this. In fact, when I did a docker volume prune, the inode count returned to 35% capacity and I was able to re-up containers successfully.
(df -i screenshots omitted.) After the docker volume prune operation the inode count was low again; after re-upping the containers and across each subsequent down/up cycle it climbed, until an up hit the inode limit. Another round of down/up reproduced the error, and again the inode count could be lowered with docker volume prune.
When I disable our volume mounting strategy, the inode capacity stays consistently at 35%, we do not get the dangling anonymous volumes after each up/down cycle, and consequently we do not encounter the no space left on device error. However, we need to be able to do hot reloading of our packages so that we can develop in the multi-service environment and watch our changes.
Thoughts:
- (... NodeJs services, so I am anticipating that the node_modules folders are the culprit, and again, that would be necessary for the watch/build hot-reloading cycle.)
- (... justincormack (from Docker) about this, but again, it would be more of a band-aid and wouldn't solve the root issue.)
- (... docker and docker-compose CLI commands, but is there a way to ensure that dangling anonymous volumes are garbage-collected when the containers they are attached to are down'd? What is the point in retaining them? Can they somehow be re-attached when you up the same containers?)
As reference, this is the way in which we share our volumes:
some-service:
  container_name: some-service
  # ... other stuff
  volumes:
    - .:/app
    - /app/node_modules
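For the dangling anonymous volumes specifically, compose does have flags aimed at exactly this pattern. A sketch, worth verifying against your compose version (--renew-anon-volumes appeared around compose 1.24):

```shell
# Remove named AND anonymous volumes attached to the services on teardown:
docker-compose down --volumes
# Or leave `down` as-is, but recreate anonymous volumes on the next up
# instead of copying data forward from the previous containers:
docker-compose up --renew-anon-volumes
```

Either should keep each cycle from leaving behind a fresh orphaned copy of /app/node_modules.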
Btw as an aside @cjancsar you can re-size the disk on your Docker host. As I noted above, you can use the Advanced settings in the GUI to increase the disk size. So that "Maximum Disk Size" is actually configurable. You could throw more of your 2TB disk at it.
Expected behavior
Be able to use docker for windows more than one week after the installation.
Actual behavior
I cannot start/build anything since I always get the "No space left on device" message. It seems like my MobyLinux disk is full (60/60 GB used).
Information
I have already run
docker system prune -all
and I tried the commands in #600. Now I have no image and no container left, but I still cannot do anything.
Steps to reproduce the behavior
Use docker on Windows and build a lot of images. It took me less than a week since I installed Docker for Windows. My images can be heavy: between 500MB and 2GB, and I built a lot of them in the last week (maybe 50 to 100). This could match the 60GB of the MobyLinux VM.