moby / buildkit

concurrent, cache-efficient, and Dockerfile-agnostic builder toolkit

docker buildx prune --filter="until=xyz" marks unrelated cache layers as "last used" and does not delete parents #5436

Open dimikot opened 1 month ago

dimikot commented 1 month ago

Description

When running e.g.

docker buildx prune --filter="until=25s" 

after some cache layers are deleted, their parents are marked as "last used". This prevents the entire subtree from being pruned: instead, only one leaf layer is removed per prune invocation.

Reproduce

This is actually very hard to reproduce, so I'm providing a screenshot from a real CI run instead. I also built a quick Python tool which renders the results of du and prune as a tree and adds colors.
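
Roughly, the command sequence behind the screenshot is the following (a sketch; it assumes any sufficiently heavy multi-stage Dockerfile whose cache records share parents):

docker buildx build --builder=container .    # populate the build cache
sleep 30                                     # let cache records age past the 25s threshold
docker buildx du --verbose                   # note "Last used" timestamps and parent links
docker buildx prune --filter="until=25s"
docker buildx du --verbose                   # untouched parents now show "Last used: 1 second ago"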

I run du. Look at the cache id=byc4z0pb2ba29tm25nqbrdcpk (underlined in red) and its parent lac46mmewlr8bqd5f7ii95hgd (underlined in green). They were both last used 3 minutes ago.

Then, docker buildx prune --filter="until=25s" removes the old unreferenced caches, including the red cache byc4z0pb2ba29tm25nqbrdcpk (which is correct). For some reason, though, it doesn't remove its green parent lac46mmewlr8bqd5f7ii95hgd (although it theoretically should).

And after pruning, I run du again; look what happened to the green parent lac46mmewlr8bqd5f7ii95hgd (follow the arrows): it is now "Last used 1 second ago"! (Recall that before pruning it was "Last used 3 minutes ago".) I.e. prune updates the timestamp of a cache record it doesn't even touch. I think this may also be the reason why it doesn't delete that green parent: since prune refreshes its timestamp, the record no longer counts as "older than 25s".

[Screenshot: colorized du/prune tree showing the red and green cache records before and after pruning]

Expected behavior

  1. On that screenshot, both caches (the red byc4z0pb2ba29tm25nqbrdcpk and its green parent lac46mmewlr8bqd5f7ii95hgd) should have been pruned, because both were last used more than 25s ago. But only the leaf cache was pruned.
  2. Or, at the very least, the green parent's lac46mmewlr8bqd5f7ii95hgd timestamp should not be modified by pruning.

docker version

Client: Docker Engine - Community
 Version:           27.3.1
 API version:       1.47
 Go version:        go1.22.7
 Git commit:        ce12230
 Built:             Fri Sep 20 11:41:08 2024
 OS/Arch:           linux/arm64
 Context:           default

Server: Docker Engine - Community
 Engine:
  Version:          27.3.1
  API version:      1.47 (minimum version 1.24)
  Go version:       go1.22.7
  Git commit:       41ca978
  Built:            Fri Sep 20 11:41:08 2024
  OS/Arch:          linux/arm64
  Experimental:     false
 containerd:
  Version:          1.7.22
  GitCommit:        7f7fdf5fed64eb6a7caf99b3e12efcf9d60e311c
 runc:
  Version:          1.1.14
  GitCommit:        v1.1.14-0-g2c9f560
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0

docker info

Client: Docker Engine - Community
 Version:    27.3.1
 Context:    default
 Debug Mode: false
 Plugins:
  buildx: Docker Buildx (Docker Inc.)
    Version:  v0.17.1
    Path:     /usr/libexec/docker/cli-plugins/docker-buildx
  compose: Docker Compose (Docker Inc.)
    Version:  v2.29.7
    Path:     /usr/libexec/docker/cli-plugins/docker-compose

Server:
 Containers: 1
  Running: 0
  Paused: 0
  Stopped: 1
 Images: 1
 Server Version: 27.3.1
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Using metacopy: false
  Native Overlay Diff: true
  userxattr: true
 Logging Driver: json-file
 Cgroup Driver: cgroupfs
 Cgroup Version: 2
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runc.v2 runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 7f7fdf5fed64eb6a7caf99b3e12efcf9d60e311c
 runc version: v1.1.14-0-g2c9f560
 init version: de40ad0
 Security Options:
  seccomp
   Profile: builtin
  cgroupns
 Kernel Version: 6.8.0-1016-aws
 Operating System: Ubuntu 22.04.5 LTS
 OSType: linux
 Architecture: aarch64
 CPUs: 16
 Total Memory: 30.75GiB
 Name: bc09d8dcc01d
 ID: 7c8981bc-d183-47c2-8e22-03ad2c38f85e
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false

WARNING: bridge-nf-call-iptables is disabled
WARNING: bridge-nf-call-ip6tables is disabled

Additional Info

What I'm trying to achieve with all this is to keep only the layer caches related to the latest build and prune everything else, i.e. retain only the artifacts of the most recent build. Theoretically, docker buildx prune --filter="until=${until}s" should do it (where until = now() - build_start_timestamp), and in fact it seems to work on e.g. macOS (Docker 27.2.0) with my test Dockerfile. But in practice, probably due to the effect explained above (unrelated caches being marked as "recently used"; observed on Linux with a real, heavy Dockerfile), it doesn't work as expected.
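
In CI, the whole idea boils down to roughly this (a sketch; the variable names are mine):

build_start=$(date +%s)                      # remember when the build started
docker buildx build --builder=container .    # the actual build, reusing/refreshing cache
until=$(( $(date +%s) - build_start ))       # anything not touched by this build is older than this
docker buildx prune --force --filter="until=${until}s"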

I also tried downgrading to 27.2.0 on Linux (both docker-ce and docker-ce-cli); it didn't help, same effect.

dimikot commented 1 month ago

@tonistiigi maybe you have some hints on why this is happening? I tried to look at the source code, at the places that call updateLastUsed(); I found out that it's called from release(), and release() is called from many places... Maybe cacheManager.prune() in cache/manager.go does it unintentionally somewhere?

thaJeztah commented 1 month ago

I also tried downgrading to 27.2.0 on Linux (both docker-ce and docker-ce-cli); it didn't help, same effect.

From your output, I see --builder=container, which means that you're likely using a custom containerised builder (created through docker buildx create). In that case buildx is not using the BuildKit instance that's compiled into the Docker Engine, but a fully separate BuildKit daemon that runs inside a container. Downgrading Docker therefore likely won't downgrade BuildKit (as it's separate). You can use docker buildx inspect --builder=container to get information about the version of BuildKit running in the container.
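
For example (a sketch; the second command assumes the default buildx_buildkit_<builder-name>0 container naming used by the docker-container driver):

docker buildx inspect --builder=container --bootstrap    # the node details include a "Buildkit:" version line
docker inspect --format '{{.Config.Image}}' buildx_buildkit_container0    # or check the builder container's image tag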

Very orthogonal to this ticket, but if the reason you're running a separate builder is to build multi-arch/multi-platform images, and if you have an environment to test on, then it's worth considering enabling the containerd image store;

With the containerd image store enabled, the Docker Engine can store multi-platform images, and data used for build-cache and images is shared between BuildKit and the Docker Engine, which can save storage, as well as improve performance.
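
Enabling it boils down to switching the daemon to the containerd snapshotter and restarting; a minimal sketch for a Linux host (this is the documented "containerd-snapshotter" feature flag; merge it into your existing daemon.json rather than overwriting the file):

cat >/etc/docker/daemon.json <<'EOF'
{
  "features": {
    "containerd-snapshotter": true
  }
}
EOF
sudo systemctl restart docker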

thaJeztah commented 1 month ago

In either case, this looks to be an issue related to BuildKit, so let me transfer it to the BuildKit issue tracker.

dimikot commented 1 month ago

@thaJeztah

if the reason you're running a separate builder is to build multi-arch/multi-platform images

Not only for that. Also to be able to build on dev MacBooks without forcing all devs to manually change their Docker Desktop configs (if I understand correctly how it all works). Plus to have customizable gc settings in buildkitd.toml (again, without anyone needing to tweak their Docker Desktop); a sketch of that config is below, after the create command. Plus there is another reason: the builder, when running as a separate container, stores all its caches in a volume, and we have an fs-based infra in CI which can back up and restore volumes of arbitrary sizes blazingly fast, like 10-20 times faster than --cache-to/--cache-from could even imagine. But I think that's unrelated to this ticket.

The builder container is created with:

docker buildx create --name container \
  --driver=docker-container \
  --buildkitd-config=buildkitd.toml \
  --bootstrap
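
And the gc-related part of buildkitd.toml looks roughly like this (a sketch with placeholder values, not my exact config; field names are from BuildKit's buildkitd.toml reference):

cat >buildkitd.toml <<'EOF'
[worker.oci]
  gc = true

[[worker.oci.gcpolicy]]
  # hypothetical policy: cap the cache at ~10GB, expire records after 48h
  keepBytes = 10000000000
  keepDuration = 172800
EOF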

Also, I've just published an open-source tool, https://github.com/dimikot/docker-buildx-cache/, which works around this behavior and can also print cache layers hierarchically.

dimikot commented 2 weeks ago

Sorry, accidentally closed. Reopening.