Also seeing this issue. We had to revert to 20.10 because it was filling up disks with no way to recover and causing outages.
It seems like it's this change in 23.0 (mentioned in the 23.0 changelog). When I run `docker volume prune` with the `all=true` filter, it works. But `docker system prune` does not accept this filter, so it now seems broken.
Not clear why this default needed to change.
Yes, this sounds like it is related to the mentioned change. The default change allows us to:
1. Differentiate an anonymous volume from a named one
2. Make `docker volume prune` safer to execute, given that named volumes typically are named so they can be easily referenced

It does mean that upgrading causes volumes which were created prior to 23.0.0 not to be considered for pruning, except when specifying `--all`.
However, "anonymous" volumes created after 23.0.0 will be considered for pruning... and of course `--all` again gives the old behavior.
Also, if your client uses an older API version, it will get the old behavior as well.
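For illustration, a rough sketch of the distinction on a 23.0+ daemon (the volume/container names here are made up for the example):

```bash
# A named volume: kept by the new default prune
docker volume create mydata

# An anonymous volume: only a container path is given, so the daemon generates a name
docker run -d --name scratch -v /data alpine sleep 300
docker rm -f scratch

# New default: only anonymous volumes created on 23.0+ are candidates
docker volume prune -f

# Old behavior: unused named (and pre-23.0) volumes are removed as well
docker volume prune -f --filter all=1
```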
Do you suggest deleting everything and recreating every volume (obviously after backing up, to restore after they have been recreated) to get rid of old obsolete configs?
@Emporea `docker volume prune --all` should give the old behavior. I understand `docker system prune` doesn't support this yet (not intentionally).

> Do you suggest deleting everything and recreating every volume

No, I don't think that should be necessary. It would likely be simpler to use `docker volume prune --filter all=1` in addition to `docker system prune` until your older volumes are no longer in use.
You can also use `DOCKER_API_VERSION=1.40 docker system prune --volumes` to get the old behavior.
root@server ~ [125]# docker volume prune --all
unknown flag: --all
See 'docker volume prune --help'.
root@server ~ [125]# docker --version
Docker version 23.0.0, build e92dd87
Sorry, I gave the wrong command in the first line there: `docker volume prune --filter all=1`
Thank you. This works and helps me for now.
> The default change allows us to:
> 1. Differentiate an anonymous volume from a named one
> 2. Make `docker volume prune` safer to execute, given that named volumes typically are named so they can be easily referenced
I guess that's fine reasoning, but the execution of this change was very strange for multiple reasons:
> The CLI help output doesn't explain anything about these distinctions. (Still says "Remove all unused local volumes", nothing about anonymous vs named)
Yeah, the help needs to change to say 'anonymous.'
> The documentation does not explain anything about these distinctions. (Same as above). We had to look in a specific change log to find the information.
Same as the help output, PRs welcome.
> Seems like when you change something like this, "principle of least surprise" should apply. I don't care about anonymous volumes vs named volumes, for one.
I'd like to point out you're a tiny minority there -- for the vast majority of users, "Docker deleted my data after I ran `system prune -a`" has been a sharp edge for years. Most users expect `prune` to 'clean up garbage,' not 'clean up the things I wanted Docker to keep.'
> As mentioned (and acknowledged by you), this change was not actually propagated to other parts of the CLI.
The only part where this possibly needs to propagate is `docker system prune -a`, and we're still not sure what the better behavior is.
> That way, if a Docker user sees a volume isn't getting cleaned up, they can run -h, and hopefully notice mention of named vs anonymous volumes.
Agreed, there should be an example of using the all filter in the help text.
Please keep in mind that this has been a persistent pain for educators, commercial support, and undercaffeinated experts for years. People are here in this thread because they find the behavior change surprising, and yeah, it looks like review missed the docs updates needed (and this is because of the historical incorrect split of client/server logic we are still cleaning up) -- however, please keep in mind that this thread represents the minority of users who find this behavior unexpected or problematic.
We certainly can improve here, and there are a lot of valid points, but the happy path for the majority of users is to stop pruning named volumes by default.
Also, as an aside: the behavior here changed in the daemon (and is dependent on API version) -- an older CLI against a current daemon will see the old behavior, and the new CLI against an older daemon will also see the old behavior.
So as we look at improving the docs & help output, we need to figure out how to explain that difference coherently.
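In the meantime, a quick way to see which side of that API boundary a given setup is on (a sketch; the new prune semantics apply from API 1.42):

```bash
# Show the negotiated client and server API versions
docker version --format 'client API: {{.Client.APIVersion}} / server API: {{.Server.APIVersion}}'

# Pin the client to an older API to get the pre-23.0 prune behavior, as mentioned upthread
DOCKER_API_VERSION=1.40 docker system prune --volumes
```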
This issue is not directly related to Docker Desktop for Linux; probably would've been best in the https://github.com/moby/moby issue tracker, but I can't move it there because that's in a different org.
Let me move this to the docker/cli repository, where another thread was started in https://github.com/docker/cli/issues/4015
Hi,
For some reason anonymous volumes are not deleted by prune on our systems since the change:
[root@stuff~]# docker volume ls
DRIVER VOLUME NAME
[root@stuff~]# cat asd.yml
version: '3'
services:
redis_test:
image: redis:alpine
mysql_test:
image: mysql:8
environment:
MYSQL_ROOT_PASSWORD: test
[root@stuff~]# docker-compose -f asd.yml up -d
Creating network "root_default" with the default driver
Creating root_redis_test_1 ... done
Creating root_mysql_test_1 ... done
[root@stuff~]# docker-compose -f asd.yml down
Stopping root_mysql_test_1 ... done
Stopping root_redis_test_1 ... done
Removing root_mysql_test_1 ... done
Removing root_redis_test_1 ... done
Removing network root_default
[root@stuff~]# docker volume ls
DRIVER VOLUME NAME
local 6cb48bf0d12f5f9ec6ed0fe4a881a88690d17990ebc43acf16b5266b2a2cc7c3
local 6209ed575c411062992e3ea3e66ba496735346945602ff2a02a31566b2d381ed
[root@stuff~]# docker system prune -af --volumes
Deleted Images:
untagged: mysql:8
untagged: mysql@sha256:c7788fdc4c04a64bf02de3541656669b05884146cb3995aa64fa4111932bec0f
deleted: sha256:db2b37ec6181ee1f367363432f841bf3819d4a9f61d26e42ac16e5bd7ff2ec18
[...]
untagged: redis:alpine
untagged: redis@sha256:b7cb70118c9729f8dc019187a4411980418a87e6a837f4846e87130df379e2c8
deleted: sha256:1690b63e207f6651429bebd716ace700be29d0110a0cfefff5038bb2a7fb6fc7
[...]
Total reclaimed space: 577.6MB
[root@stuff~]# docker volume ls
DRIVER VOLUME NAME
local 6cb48bf0d12f5f9ec6ed0fe4a881a88690d17990ebc43acf16b5266b2a2cc7c3
local 6209ed575c411062992e3ea3e66ba496735346945602ff2a02a31566b2d381ed
[root@stuff~]# docker volume prune
WARNING! This will remove all local volumes not used by at least one container.
Are you sure you want to continue? [y/N] y
Total reclaimed space: 0B
[root@stuff~]# docker volume ls
DRIVER VOLUME NAME
local 6cb48bf0d12f5f9ec6ed0fe4a881a88690d17990ebc43acf16b5266b2a2cc7c3
local 6209ed575c411062992e3ea3e66ba496735346945602ff2a02a31566b2d381ed
It doesn't matter what kind of prune I run; the only one that works is the above-mentioned filter one (the one that removes named volumes).
What happens if you manually try to remove it?
What is the result of `docker inspect <volume>`?
Also keep in mind, prune will not prune volumes created before the upgrade (unless you set `--filter all=1`), since there was no way to know if a volume is an "anonymous" volume or not.
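One way to check whether a particular volume predates the upgrade is to look at its creation time and labels; a quick sketch (with `<volume>` as a placeholder):

```bash
# Volumes created before 23.0 carry no anonymity marker, so a default prune skips them;
# the creation time and labels usually make it obvious which case you're in
docker volume inspect --format '{{.Name}}  created={{.CreatedAt}}  labels={{json .Labels}}' <volume>
```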
Hi,
I just created these volumes with docker-compose (as you can see from the first line, there were no volumes before). The system is up to date, latest docker-ce. Here is the inspect output:
[root@stuff~]# docker inspect 1b957379dddae8abc21c3f469c966d48d83ecd54cd389c89bda0324739b18653
[
{
"CreatedAt": "2023-03-11T13:49:09+01:00",
"Driver": "local",
"Labels": null,
"Mountpoint": "/var/lib/docker/volumes/1b957379dddae8abc21c3f469c966d48d83ecd54cd389c89bda0324739b18653/_data",
"Name": "1b957379dddae8abc21c3f469c966d48d83ecd54cd389c89bda0324739b18653",
"Options": null,
"Scope": "local"
}
]
`docker-compose` is likely using an older API. Try creating the volumes with `docker compose` (space rather than dash) instead.
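In other words, something like the following (reusing the asd.yml from above) should exercise the newer API path:

```bash
# Compose v2 plugin (space, not dash) talks to the daemon through the current API
docker compose -f asd.yml up -d
docker compose -f asd.yml down
docker volume prune -f
```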
@sudo-bmitch I'm getting the same behavior as described by @Re4zOon with `docker compose`.
Edit: With plain `docker` too. I can reproduce with `docker run -e MARIADB_ROOT_PASSWORD=root mariadb`, for example. If I stop and remove the container, the volume stays and prune won't delete it unless I specify `--filter all=1`.
$ docker volume inspect b8ef83d3ed925b7d92cfac80a095b047f20e8487749d3a30cd6700c17318ff95
[
{
"CreatedAt": "2023-03-11T13:54:58-08:00",
"Driver": "local",
"Labels": null,
"Mountpoint": "/var/lib/docker/volumes/b8ef83d3ed925b7d92cfac80a095b047f20e8487749d3a30cd6700c17318ff95/_data",
"Name": "b8ef83d3ed925b7d92cfac80a095b047f20e8487749d3a30cd6700c17318ff95",
"Options": null,
"Scope": "local"
}
]
$ docker volume prune -f
Total reclaimed space: 0B
$ docker volume prune -f --filter all=1
Deleted Volumes:
b8ef83d3ed925b7d92cfac80a095b047f20e8487749d3a30cd6700c17318ff95
Total reclaimed space: 156.5MB
Output of `docker system info`:
Client:
Context: default
Debug Mode: false
Plugins:
buildx: Docker Buildx (Docker Inc.)
Version: 0.10.3
Path: /usr/lib/docker/cli-plugins/docker-buildx
compose: Docker Compose (Docker Inc.)
Version: 2.16.0
Path: /usr/lib/docker/cli-plugins/docker-compose
Server:
Containers: 0
Running: 0
Paused: 0
Stopped: 0
Images: 49
Server Version: 23.0.1
Storage Driver: btrfs
Btrfs:
Logging Driver: json-file
Cgroup Driver: systemd
Cgroup Version: 2
Plugins:
Volume: local
Network: bridge host ipvlan macvlan null overlay
Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
Swarm: inactive
Runtimes: io.containerd.runc.v2 runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 1e1ea6e986c6c86565bc33d52e34b81b3e2bc71f.m
runc version:
init version: de40ad0
Security Options:
seccomp
Profile: builtin
cgroupns
Kernel Version: 6.2.2-zen1-1-zen
Operating System: Arch Linux
OSType: linux
Architecture: x86_64
CPUs: 20
Total Memory: 31.03GiB
Name: xenomorph
ID: M6ZE:N2VB:3P6Z:7V55:H6W7:KQJA:QESD:MUPC:T762:4ENW:KUUX:2WU3
Docker Root Dir: /var/lib/docker
Debug Mode: false
Registry: https://index.docker.io/v1/
Experimental: false
Insecure Registries:
127.0.0.0/8
Live Restore Enabled: false
I assume the issue is with the images having built-in volume declarations: (10th layer) https://hub.docker.com/layers/library/redis/latest/images/sha256-f6e3da94f24dabc9569204c774cd7d11ab6eaa3865b108ae07694161251c854c?context=explore (14th layer) https://hub.docker.com/layers/library/mysql/latest/images/sha256-79866c649987750de41276796f7b29a54be80834dd2bc20e435dc9554a33945f?context=explore
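For what it's worth, one way to confirm that is to check the volume declarations baked into each image's config, e.g.:

```bash
# Both images declare volumes in their config, which become anonymous volumes at run time
docker image inspect --format '{{json .Config.Volumes}}' redis:alpine
docker image inspect --format '{{json .Config.Volumes}}' mysql:8
```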
Thanks for the reports. I found the bug and will post a patch momentarily.
https://github.com/moby/moby/pull/45147 should fix these cases.
--- edit ---
To clarify, it should fix the case where a volume is created from the image config.
I have spent days trying to figure out why all of our GitLab runners are suddenly out of disk space, resorting to crude methods of stopping all containers manually in the middle of the night so I can loop through volumes and delete them. I expect a command like `docker volume prune -f` to "Remove all unused local volumes", as the docs say and have said for years (the example usage on the docs page even explicitly has a `my-named-vol` being removed).
Regardless of doc updates being missed (we're humans, it happens), a change with this big an impact should have had deprecation/compatibility warnings for at least one version. "My logs are being polluted, what's going on? Oh, I need to update a command, cool." is a much easier problem to deal with than "Why is the entire fleet of servers all running out of disk at the same time?!"
The changelog has this listed under "bug fixes and enhancements" with the prefix `API:`, and no mention whatsoever of this affecting the CLI. Even the most vigilant changelog readers would have missed this without intimate knowledge of the CLI/API interaction.
I just use `docker volume rm $(docker volume ls -q)` since I can't remember the filter options.
> I just use `docker volume rm $(docker volume ls -q)` since I can't remember the filter options.
~I believe that's different, since it also deletes volumes attached to existing containers, doesn't it? Fine if your use case allows for that, but probably not ideal for most people.~
Edit: Ah, `docker volume rm` doesn't allow the removal of in-use volumes.
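For completeness, the same idea restricted to unused volumes only (a sketch; `-r` assumes GNU xargs):

```bash
# List only volumes not referenced by any container, then remove them
docker volume ls -q --filter dangling=true | xargs -r docker volume rm
```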
Anyway, I did make a PR to at least update the docs: https://github.com/docker/cli/pull/4079
It seems like @7E6D4309-D2A9-4155-9D4F-287B8CDA14C1 and I were both bitten by this in the exact same use case, and I'd love it if no one else in the future had to suffer in the same way.
Same here, I cannot delete volumes as part of running:
`docker compose down && docker system prune -af --volumes && docker compose up -d --build`
@rahulkp220 Please make sure you are on the latest builds of all the things. Either 23.0.6 or 24.0.0.
If you are still having issues on these versions, please give exact details on how to reproduce, with the output of `docker info` and `docker version`.
Hmm, here is the info.
docker info
Client:
Context: desktop-linux
Debug Mode: false
Plugins:
buildx: Docker Buildx (Docker Inc.)
Version: v0.10.4
Path: /Users/milo/.docker/cli-plugins/docker-buildx
compose: Docker Compose (Docker Inc.)
Version: v2.17.3
Path: /Users/milo/.docker/cli-plugins/docker-compose
dev: Docker Dev Environments (Docker Inc.)
Version: v0.1.0
Path: /Users/milo/.docker/cli-plugins/docker-dev
extension: Manages Docker extensions (Docker Inc.)
Version: v0.2.19
Path: /Users/milo/.docker/cli-plugins/docker-extension
init: Creates Docker-related starter files for your project (Docker Inc.)
Version: v0.1.0-beta.4
Path: /Users/milo/.docker/cli-plugins/docker-init
sbom: View the packaged-based Software Bill Of Materials (SBOM) for an image (Anchore Inc.)
Version: 0.6.0
Path: /Users/milo/.docker/cli-plugins/docker-sbom
scan: Docker Scan (Docker Inc.)
Version: v0.26.0
Path: /Users/milo/.docker/cli-plugins/docker-scan
scout: Command line tool for Docker Scout (Docker Inc.)
Version: v0.10.0
Path: /Users/milo/.docker/cli-plugins/docker-scout
Server:
Containers: 5
Running: 2
Paused: 0
Stopped: 3
Images: 5
Server Version: 23.0.5
Storage Driver: overlay2
Backing Filesystem: extfs
Supports d_type: true
Using metacopy: false
Native Overlay Diff: true
userxattr: false
Logging Driver: json-file
Cgroup Driver: cgroupfs
Cgroup Version: 2
Plugins:
Volume: local
Network: bridge host ipvlan macvlan null overlay
Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
Swarm: inactive
Runtimes: io.containerd.runc.v2 runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 2806fc1057397dbaeefbea0e4e17bddfbd388f38
runc version: v1.1.5-0-gf19387a
init version: de40ad0
Security Options:
seccomp
Profile: builtin
cgroupns
Kernel Version: 5.15.49-linuxkit
Operating System: Docker Desktop
OSType: linux
Architecture: aarch64
CPUs: 5
Total Memory: 7.667GiB
Name: docker-desktop
ID: 585c4ded-3e8b-4064-bcb7-6fd5547e553b
Docker Root Dir: /var/lib/docker
Debug Mode: false
HTTP Proxy: http.docker.internal:3128
HTTPS Proxy: http.docker.internal:3128
No Proxy: hubproxy.docker.internal
Registry: https://index.docker.io/v1/
Experimental: false
Insecure Registries:
hubproxy.docker.internal:5555
127.0.0.0/8
Live Restore Enabled: false
and docker version
docker version
Client:
Cloud integration: v1.0.31
Version: 23.0.5
API version: 1.42
Go version: go1.19.8
Git commit: bc4487a
Built: Wed Apr 26 16:12:52 2023
OS/Arch: darwin/arm64
Context: desktop-linux
Server: Docker Desktop 4.19.0 (106363)
Engine:
Version: 23.0.5
API version: 1.42 (minimum version 1.12)
Go version: go1.19.8
Git commit: 94d3ad6
Built: Wed Apr 26 16:17:14 2023
OS/Arch: linux/arm64
Experimental: false
containerd:
Version: 1.6.20
GitCommit: 2806fc1057397dbaeefbea0e4e17bddfbd388f38
runc:
Version: 1.1.5
GitCommit: v1.1.5-0-gf19387a
docker-init:
Version: 0.19.0
GitCommit: de40ad0
What's the best way to upgrade the CLI?
Note that the fix in https://github.com/docker/cli/pull/4229 added a `--all` option to `docker volume prune`, but did (afaik) not change the behaviour of `docker system prune` to remove named volumes.
Does it work if you do a `docker volume prune -af`?
We have not changed the behavior of `system prune -a` to imply `volume prune -a`.
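So on 23.0.5+ / 24.x, a full cleanup that also drops named volumes currently looks roughly like this (a sketch, not an officially blessed one-liner):

```bash
# Containers, images, networks, build cache, and anonymous volumes
docker system prune -af --volumes

# Named volumes have to be pruned explicitly now
docker volume prune -af    # on 23.0.0-23.0.4: docker volume prune -f --filter all=1
```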
Hi!
It's not clear to me why the volumes aren't deleted when using `docker system prune -a --volumes`. I'm using Docker version 24.0.1, build 6802122, on Ubuntu 22.04.
If I remove the volumes manually, I can't rebuild the related containers because I see the error: `Error response from daemon: open /var/lib/docker/volumes/project_volume/_data: no such file or directory`.
Using `docker volume prune --all`, I can rebuild the containers.
As stated in this thread, `system prune -a` no longer prunes named volumes, by design. If you need to prune named volumes, the method to use is currently `volume prune -a`.
`system prune -a` is a command often fired indiscriminately, and has led to much data loss and gnashing of teeth. While having to run two commands is a mild pain, it helps prevent frustration and loss of data for new users copying commands out of tutorials.
We can certainly explore a `system prune --all=with-named-volumes` or something in the future for users who understand exactly what they are doing, but currently the need to run a separate command is by design.
Currently I am still using `docker volume prune --filter all=1` as a "workaround" to make sure everything is deleted.
Is this still necessary, or should `volume prune -a` do the same? (It did not do the same when I opened this issue.)
Thank you @neersighted for the explanation
`volume prune -a` is a convenience alias added in 23.0.5 for `volume prune --filter all=1`.
> `volume prune -a` is a convenience alias added in 23.0.5 for `volume prune --filter all=1`.
Yes, the "filter" is effectively the internal implementation used to prune named volumes. We didn't want that implementation detail to leak into the UX (but missed adding the "porcelain" --all
/ -a
option in the initial 23.0 release). Using docker volume prune --all
/ docker volume prune -a
is the recommended way to do this (for current versions of docker 23.0.x and newer).
Even on 24.0.2, the output of `docker system prune --volumes -a` still says it will remove "all volumes not used by at least one container," which is incorrect, leading me to have to search for answers online to figure out why my orphaned volumes weren't getting pruned. The only command output I could find that actually mentions any sort of distinction between anonymous and named volumes is `docker volume prune --help`. Every other mention of pruning volumes in help text or command output just says "all volumes" instead of "anonymous volumes."
https://github.com/docker/cli/pull/4079 addresses the help/docs issue; please give it a review (or even a local test) and let us know if it covers all the cases you have in mind.
It would be nice if the change were cascaded into `system prune` and `system df`, as raised by others.
`prune` still claims it removes all volumes not linked to any containers if passed `--volumes`.
`df`, similarly, lists those volumes as reclaimable space, which should now be taken with a grain of salt.
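For reference, the per-volume view where that "reclaimable" figure shows up (a sketch):

```bash
# Lists each volume with its size and whether it is in use; the summary "reclaimable"
# number still counts named volumes that a default prune will no longer touch
docker system df -v
```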
https://github.com/docker/cli/pull/4497 was accepted and cherry-picked, which addresses the docs/`--help` issue. It is intentional that `system prune -a` no longer affects named volumes; I think a `--really-prune-everything` flag is out of scope for this issue, but feel free to open a feature request if you think that it is useful in the 90% case. My earlier comments are still quite relevant, I think:
> system prune -a is a command often fired indiscriminately, and has led to much data loss and gnashing of teeth. While having to run two commands is a mild pain, it helps with preventing frustration and loss of data for new users copying commands out of tutorials. We can certainly explore a system prune --all=with-named-volumes or something in the future for users who understand exactly what they are doing, but currently the need to run a separate command is by design.
I'm going to close this for now, as the last set of sharp edges identified here are solved (and will be in the next patch release), but please feel free to continue the discussion or open that follow-up feature request.
@neersighted just to be precise though, from what I know the `-a`/`--all` in `system prune` affects only images. For volumes, there is another option/flag, which is `--volumes`.
Thinking about it, I could definitely see a `--volumes` that defaults to `--volumes anonymous`, and a `--volumes all` that cascades the request to delete named volumes down to the `docker volume prune` command here.
But I also see where this "better safe than sorry" change is coming from.
@neersighted this is the worst update in Docker history.
We've just spent 3 hours debugging duplicate keys in the DB because, for some reason, `docker system prune --volumes` suddenly stopped pruning the volumes.
I guess some people copy `rm -rf /` from Stack Overflow and get an unpleasant surprise, but if the command is basically named `docker remove everything please` and asks you to type `y` if you're sure, you deserve to type your 5 SQL statements again.
It is also a pretty memorable lesson not to host production persistent stores in a container.
I spent half a day today on this; I expected that my volumes were being deleted... Thank you for changing the behavior and not sending a message to the console that it no longer works as before (without negativity).
Every time I have to type two commands to test my deployment I think about this thread 🔥
Most annoying bug ever! At least mark it deprecated or something.
I'm trying to wrap my head around what's going on. Did this behavior get rolled back?
In the environments I manage, there is a machine where our old script seems to be working as originally desired, and I noticed it is running version 24.0.2 / API version 1.43. Of course, the nodes with 23 / API 1.42 still exhibit the need to provide `--filter all=true`.
I'm not seeing anything that specifically mentions this issue in the Engine 24 or API 1.43 release notes.
Seems like things may have got really whacked out, or I am whacked out trying to interpret the docs.
The 1.42 API (as reported upthread here) only considers anonymous volumes for deletion on a `docker volume prune`. To my understanding, this applies to volumes where the first param is omitted from `-v`, so it is no surprise that we are getting a lot of dangling volumes left behind. It also seems that the warning message was left unchanged:
root@fooHostnameDocker23:~# docker version | grep 'API version'
API version: 1.42
API version: 1.42 (minimum version 1.12)
root@fooHostnameDocker23:~# docker volume prune
WARNING! This will remove all local volumes not used by at least one container.
Are you sure you want to continue? [y/N]
On the node with Docker 24/API 1.43, the warning message has been updated to match the behavior of the change rolled out in Docker 23/API 1.42; however, the actual behavior seems to have been rolled back and actually prunes our named volumes as we've desired all along:
root@fooHostnameDocker24:~# docker version | grep API
API version: 1.43
API version: 1.43 (minimum version 1.12)
root@fooHostnameDocker24:~# docker volume prune
WARNING! This will remove anonymous local volumes not used by at least one container.
Are you sure you want to continue? [y/N]
I'm just illustrating the warning message, not the pruning output. I am fully certain that our Docker 24 node still works as we'd like when typing `docker volume prune` to purge our named volumes, despite the purported changes.
I did also manage to find another mention of `--filter all=true` not being needed after 23, over at https://github.com/docker/cli/pull/4218#issuecomment-1516977411
So, is this a bug? Is the "backport" referring to the `-a` option? Rolling back the behavior? Something else?
😵💫
I'm really sorry if I'm missing something here.
@boneitis It definitely has not changed.
Note: The daemon cannot protect volumes that were created before version 23.
Thank you @cpuguy83 for the response; the note is helpful to know.
However, our deployments, including the one on our machine with Docker 24, fire up around a dozen containers over the course of a day.
All of the what-I-understand-to-be-a-variant-of-"named" volumes get purged (on the version 24 node) without the filter-all parameter, as they did pre-23, whereas the machines with Docker 23 still require it.
That is, the mounts whose "Type" is `volume`, whose "Name" is a 64-character hex string, and whose "Source" is `/var/lib/docker/volumes/<hex string>/_data` are the ones that are left dangling, counter to our intended behavior (on version 23). I understand (maybe incorrectly?) that these are named volumes, which are not expected to be purged on 24?
If it's a hex string then it is most likely not a named volume (unless you have something generating strings when you create the volume).
You should be able to inspect the volume and check for the label `com.docker.volume.anonymous`.
If this key exists (the value is unused) then it may be collected by `docker volume prune`.
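For example (a sketch, with `<volume>` as a placeholder):

```bash
# Show the labels of a single volume; the presence of the key is what matters
docker volume inspect --format '{{json .Labels}}' <volume>

# Or list only the volumes the daemon considers anonymous
docker volume ls --filter label=com.docker.volume.anonymous
```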
Please, I just need a bash script to delete everything. No dangling images, "volumes not used by at least one container", or whatever. Everything. What sequence of commands do I need to run to reset Docker to a completely clean slate?
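Not official guidance, but a minimal sketch, assuming you truly want nothing left (all containers, images, networks, build cache, and volumes):

```bash
#!/usr/bin/env bash
# DANGER: destroys every container, image, network, build cache entry, and volume on this host.
set -euo pipefail

docker ps -aq | xargs -r docker rm -f   # remove all containers, running or stopped (GNU xargs assumed)
docker system prune -af --volumes       # images, networks, build cache, anonymous volumes
docker volume prune -af                 # named volumes too (23.0.5+; older: --filter all=1)
```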
Since the update to Docker 23, unused volumes are not deleted anymore with `docker volume prune` nor `docker system prune --volumes`. The answer is always `Total reclaimed space: 0B`. When I delete the volumes manually, I even get this warning when running my docker-compose.yml: `Error response from daemon: open /var/lib/docker/volumes/docker_volume1/_data: no such file or directory`
What's happening?