Open k1w1m8 opened 8 months ago
> If you have a look into the buildkit cache, there doesn't seem to be a lot of controls over which images to keep and which to prune.

You can filter by property, age, or storage size. Size is the most automatic way, letting the builder decide which cache is least likely to be needed.

If you want to use `docker images` for pinning, then you need to have something that calls `docker pull` to pull a specific set of images into docker's image store. These would then not be released by prune without `--all`, and it is your responsibility to delete them manually when needed. The deletion criteria have not changed. Previous side-effects of the old builder have been fixed: it creates images per user request, as defined by the `-t` build flag. Pinning the images with `docker pull` does not duplicate their storage size if the same layers are also used in some builder cache.
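As a rough sketch, the pruning and pinning controls described above look like this (flag names circa buildx v0.11; the image name is a placeholder):

```shell
# Prune builder cache by age, or by total size; size-based pruning lets
# the builder evict whatever it judges least likely to be reused.
docker buildx prune --filter "until=168h"
docker buildx prune --keep-storage 10GB

# Pin a base image by pulling it into the engine image store.
# Prune without --all will not delete tagged, pulled images.
docker pull myrepo.com/myimage:1.2.3
```

These commands require a running docker daemon, so treat them as an illustration of the flags rather than a copy-paste recipe.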
That might be something we could clarify in the docs (cc @dvdksn)
The issue here is that I don't have direct control over `docker pull` as I am using compose, with a mixture of `image:` tags and Dockerfile `FROM`s.
@ndeloof from compose team has said they need input from buildx how to solve this (see https://github.com/docker/compose/issues/11346).
If compose wants to add something like a `pin` command that, instead of building, would download the images, we could theoretically provide something that gets the image names from the Dockerfile so compose doesn't need to parse it itself. It is tricky though, as there are cases where base images cannot be determined without running at least a partial build.
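To illustrate why parsing the Dockerfile yourself is only a partial answer, here is a deliberately naive sketch that lists base images from `FROM` lines (the sample Dockerfile contents are hypothetical):

```shell
# Sample Dockerfile for illustration (hypothetical contents)
printf 'FROM alpine:3.19\nRUN true\nFROM scratch\n' > Dockerfile.sample

# Naive sketch: list base images from FROM lines. This misses
# ARG-substituted bases and --platform flags (it prints only the second
# field), which is why a real implementation would need buildkit itself
# to resolve the base images.
bases="$(grep -iE '^[[:space:]]*FROM[[:space:]]' Dockerfile.sample \
  | awk '{print $2}' \
  | grep -v -x -i 'scratch')"
echo "$bases"
```

The `scratch` pseudo-image is filtered out because it is not pullable; anything involving build args or multi-stage aliases defeats this kind of text scraping.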
For getting these images after a build, this is already possible with the `--metadata-file` option in buildx today, which captures all the image sources; that list could then be used for pinning or other storage actions. It is worth noting that the format of this output might change in the future, as we are in the process of normalizing to SLSA spec types.
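A minimal sketch of capturing that metadata (image and file names are placeholders; the metadata schema is explicitly not stable yet, per the comment above):

```shell
# Write build result metadata to a JSON file alongside the build.
docker buildx build --metadata-file meta.json -t myimage:latest .

# Inspect the captured metadata; treat the keys as exploratory rather
# than a fixed schema, since the format may move to SLSA spec types.
python3 -m json.tool meta.json
```

The image source entries in that file are what a hypothetical `pin` step would feed into `docker pull`.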
Thanks @tonistiigi, @crazy-max, @ndeloof
It seems the issue is in the details of the interface between the compose and buildx teams.
How best to progress such issues, which require inter-team coordination and possibly coordinated changes on both ends?
IMHO there's some misunderstanding here. Compose does not control the build; it just configures buildx (vendored) to build the required images. The fact that a non-default builder won't store the base image in the moby image store, and that those images must as such be (re)downloaded, is not (yet) a build option we can enable.
@ndeloof I'm not sure what the misunderstanding is that you refer to.
To me this is a docker internal issue where two parts of docker need to work together to fix a regression related to the purging of downloaded images vs built images.
@k1w1m8 that's what I mean: this isn't a docker compose feature, and never has been. You just expect the builder to store the base image in the engine's local cache; this isn't something exposed by the builder API that we can manage.
@ndeloof I understand that.
However, think of it from the perspective of the "customer". I'm a user of docker compose. Compose stopped doing what it used to do, due to changes "under the hood" of compose which I am only vaguely aware of.
Should I then look into the internal implementation of compose to determine where precisely the issue is located and raise separate tickets on the internal compose components in the hope that fixes will eventually show up in my usage of compose?
What if there are API changes between compose and the internal components required? Should I negotiate those API changes with the two teams?
Or should the compose team talk to the internal component teams in order to decide what API changes are required to surface the correct behaviour to the compose customer?
@k1w1m8 this is not specific to compose; you have the same issue with a plain `docker build ...`. This is really internal to the builder.
@ndeloof I take your point, perhaps I should have created this ticket on moby rather than buildx?
However, this issue is not purely about functionality from docker, e.g. `docker build` and Dockerfiles. There are also the `image:` tags inside the compose files. For the new compose to be consistent with the older compose, we'd like control of the downloaded base images to be consistent whether done via compose `image:` tags or via building Dockerfiles.
Which component of docker is responsible for compose `image:` tags?
> There is also the image: tags inside the compose files. (...) we'd like control of the downloaded base images to be consistent no matter if done via compose image tags or via building Dockerfiles.
I don't get it here; the base image is only declared in your Dockerfile using the `FROM` instruction. `image:` tags set in compose are for the image being built (same as `docker build --tag`). Can you please clarify your point? Maybe a minimal example would help.
@ndeloof
Ah yes, I can see where I went wrong explaining that. Sigh.
A compose project can contain a direct image reference in the yaml, or it can contain FROM in a Dockerfile to be built.
https://docs.docker.com/compose/compose-file/compose-file-v3/#image
Example:

```yaml
services:
  mycontainer:
    image: myrepo.com/myimage:1.2.3
```
So therefore any changes to the way that downloaded images are stored would need to be consistent between Dockerfile `FROM` processing and `image:` processing, via compose? Or are you saying that buildx does the downloading for both these scenarios?
- Images downloaded during the Dockerfile build process, via buildkit, are stored in the builder. When using the builder backed by the docker engine, this is actually the same as the docker engine image store.
- Compose image downloads for `image:` entries are the same as `docker pull`, and are stored in the docker engine image store.
The fact that your Dockerfile's base image is tagged and available in the docker image store is a side-effect of using the engine builder; if you run the build with an alternate builder this won't be the case.
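To see which builder is in use, and hence which store the base images land in, something like the following works (the `mybuilder` name is illustrative):

```shell
# List available builders; the "docker" driver is backed by the engine
# image store, while "docker-container" keeps its cache in the builder.
docker buildx ls

# Build with the engine-backed default builder
# (the base-image-in-engine-store side-effect applies).
docker buildx build --builder default -t myapp .

# Build with a container-backed builder
# (the base image stays in the builder cache only).
docker buildx create --name mybuilder --driver docker-container
docker buildx build --builder mybuilder -t myapp .
```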
If you want to enforce the base image to be pulled, you can use this compose hack:

```yaml
services:
  base-image:
    scale: 0
    image: myrepo.com/myimage:1.2.3
```
The service won't actually run a container, but the image will be pulled into the engine image store.
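With that hack in place, the pinned image can be fetched explicitly; the service and image names here are the ones from the example above:

```shell
# Pull only the pinned service's image into the engine image store.
docker compose pull base-image

# Confirm it is now visible to the engine,
# and thus survives builder-cache prunes.
docker image ls myrepo.com/myimage
```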
@ndeloof thanks for the explanation. Nice compose hack too. Would that result in the base image being downloaded twice? That wouldn't be great if you have a slow download link.
This shows exactly my point. The way that base images are allocated to image stores is inconsistent with the current default settings of compose (with buildx as the default). Sometimes they are in the docker store, sometimes in the buildx store, depending on which part of the compose configuration they were referenced in (yaml or Dockerfile), whereas before they were all in the same image store.
I'm happy for compose + buildx to keep the built images in its image store. But I believe compose (regardless of buildx or not) should always allow for base images to be consistently available in the docker image store, as the buildx image store gives very little control over the images inside it. This would be consistent with the past and allows full control of the download process in case it is expensive to obtain the base images.
@k1w1m8 again, this is not related to compose: you will have the exact same issue using a plain `docker build ...`. The base image will be pulled by the builder, and may be available in the docker engine image store depending on which builder you selected.
@ndeloof So you are saying that the two scenarios we have in compose are equivalent to the following docker commands:

- `docker build` (compose Dockerfile in build folder), where the base image will be stored in the buildx image store
- `docker run` (compose `image:` element), where the base image will be stored in the docker image store.

Somehow it seems more obvious in compose, as the inconsistency can be seen as part of a single command, whereas in docker you need separate commands to show the same inconsistency.
Does that mean then that I should create a third ticket for this issue, in moby?
Compose indeed is basically just docker build then docker run. And docker build is implemented by buildx, so this is the right place to report this issue. But as I tried to explain, your assumption regarding base image storage is only true when using the engine builder; you can't rely on this without accepting some limitations.
@ndeloof I understand that the engine builder and buildx are different and use different default image stores.
If buildx were a non-default option, it would be fine for it not to work exactly like the old builder. But when buildx becomes the default, then we have to make sure it works in a way which is backwards compatible with the old builder.
This is a case where the new default behaviour is not backwards compatible with the old default behaviour wrt base images, which affects those who want to manage their base images explicitly, and those who use `docker system prune`.
To make `docker compose`, `docker build` and `docker run` store base images consistently, we want base images downloaded by buildx to appear in the docker image store.
@tonistiigi @crazy-max what do you think?
Description
(originally reported for docker compose in https://github.com/docker/compose/issues/11346; they advised raising it here)
In the days pre-buildkit, all base images were tagged and stored in the main docker images list.
After buildkit, this no longer happens: base images are downloaded into the buildkit cache.
https://github.com/moby/buildkit/issues/1537 https://github.com/docker/compose/issues/10206
This is problematic if you have large images, a slow download link, and you run out of disk space. Say you have a 10 GB image and a 1 MB/s download speed.
Previously, if you did `docker system prune`, the base image would be preserved, and you had easy control over whether to delete it at a later time.
Now, if you do `docker system prune`, it will delete the 10 GB image and you will start swearing as you wait 3 hours to redownload it.
If you have a look into the buildkit cache, there doesn't seem to be a lot of control over which images to keep and which to prune. I don't see a way in `docker buildx du` to "keep base images".
Ideally there would be an option to tell compose/`docker build` to "export" base images into the main docker images list, to be compatible with the previous behaviour.
If that is not possible, how can we clean the build cache while keeping downloaded base images like we did before?
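Until such an option exists, a cautious cleanup sketch might look like this (the image name is a placeholder, and the base images must be pulled into the engine store beforehand):

```shell
# Inspect what is in the builder cache; there is currently no
# per-image "keep" control here.
docker buildx du --verbose

# Workaround: pull large base images into the engine store first so
# they are tagged there, then prune without --all so they survive.
# Their copies in the builder cache may still be evicted.
docker pull myrepo.com/myimage:1.2.3
docker system prune
```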
Expected behaviour
see description
Actual behaviour
see description
Buildx version
github.com/docker/buildx v0.11.2 9872040
Docker info
No response
Builders list
Configuration
n/a
Build logs
No response
Additional info
No response