[BUG] changes to docker system prune with buildkit, now deletes downloaded base images

k1w1m8 commented 8 months ago

Contributing guidelines

[X] I've read the contributing guidelines and wholeheartedly agree

I've found a bug and checked that ...

[X] ... the documentation does not mention anything about my problem
[X] ... there are no open or closed issues that are related to my problem

Description

(originally reported for docker compose https://github.com/docker/compose/issues/11346, they advised raise here)

In the days pre buildkit, all base images were tagged and stored in the main docker images list.

After buildkit, this no longer happens. Base images are downloaded into the buildkit cache.

https://github.com/moby/buildkit/issues/1537 https://github.com/docker/compose/issues/10206

This is problematic if you have large images and a slow download link and you run out of diskspace. Say you have a 10GB image and a 1MB/s download speed.

Previously, if you did docker system prune, the base image would be preserved, and you have easy control over whether to delete it or not at a later time

Now, if you do docker system prune, it will delete the 10GB image and you will start swearing as you wait 3 hours to redownload it.

If you have a look into the buildkit cache, there doesn't seem to be a lot of controls over which images to keep and which to prune. I don't see a way in docker buildx du to "keep base images".

Ideally there would be an option to tell compose/docker build to "export" base images into the main docker images list to be compatible with the previous behaviour.

If that is not possible, how can we clean the build cache while keeping downloaded base images like we did before?

Expected behaviour

see description

Actual behaviour

see description

Buildx version

github.com/docker/buildx v0.11.2 9872040

Docker info

No response

Builders list

NAME/NODE DRIVER/ENDPOINT STATUS  BUILDKIT             PLATFORMS
default * docker                                       
  default default         running v0.11.7+d3e6c1360f6e linux/amd64, linux/amd64/v2, linux/amd64/v3, linux/386

Configuration

n/a

Build logs

No response

Additional info

No response

tonistiigi commented 8 months ago

If you have a look into the buildkit cache, there doesn't seem to be a lot of controls over which images to keep and which to prune.

You can filter by property, age or storage size. Size is the most automatic way, letting the builder decide the cache that is least likely to be needed.

If you want to use docker images for pinning, then you need to have something that calls docker pull to pull a specific set of images to docker's image store. This would then be not released by prune without --all and it is your responsibility to manually delete them when needed. The deletion criteria has not been changed. Previous side-effects of the old builder have been fixed and it creates images per user request and as defined by the -t build flag. Pinning the images with docker pull does not duplicate their storage size if the same layers are also used in some builder cache.

crazy-max commented 8 months ago

If you have a look into the buildkit cache, there doesn't seem to be a lot of controls over which images to keep and which to prune.

You can filter by property, age or storage size. Size is the most automatic way, letting the builder decide the cache that is least likely to be needed.

If you want to use docker images for pinning, then you need to have something that calls docker pull to pull a specific set of images to docker's image store. This would then be not released by prune without --all and it is your responsibility to manually delete them when needed. The deletion criteria has not been changed. Previous side-effects of the old builder have been fixed and it creates images per user request and as defined by the -t build flag. Pinning the images with docker pull does not duplicate their storage size if the same layers are also used in some builder cache.

That might be smth we could clarify in docs (cc @dvdksn)

k1w1m8 commented 8 months ago

The issue here is that I don't have direct control over docker pull as I am using compose, using a mixture of image: tags and Dockerfile FROMs.

@ndeloof from compose team has said they need input from buildx how to solve this (see https://github.com/docker/compose/issues/11346).

tonistiigi commented 8 months ago

If compose wants to add something like pin that instead of building would download the images, we could theoretically provide something that gets the image names from dockerfile so compose doesn't need to parse it itself. It is tricky tough as there are cases where base images can not be determined without running at least a partial build.

For getting these images after build, this is already possible with --metadata-file option in buildx today that captures all the image sources, so that list could be then used for pinning or other storage actions. It is worth noting that the format of this output might change in the future as we are in the process of normalizing to SLSA spec types.

k1w1m8 commented 8 months ago

Thanks @tonistiigi, @crazy-max, @ndeloof

Seems the issue is in the details of the interface between compose and buildx teams.

How best to progress such issues which require intra team co-ordination and possibly coordinated changes on both ends?

ndeloof commented 8 months ago

IMHO there's some misunderstanding here. Compose does not control the build, it just configure buildx (vendored) to build required images. The fact a non-default builder won't store the base image in moby image store, and as such those must be (re)downloaded is not (yet) a build option we can enable

k1w1m8 commented 7 months ago

@ndeloof I'm not sure what the misunderstand is that you refer to.

To me this is a docker internal issue where two parts of docker need to work together to fix a regression related to the purging of downloaded images vs built images.

ndeloof commented 7 months ago

@k1w1m8 that's what I mean: this isn't a docker compose feature, never has been. You just expect the builder to store base image in engine local cache, this isn't something exposed by the builder API we can manage.

k1w1m8 commented 7 months ago

@ndeloof I understand that.

However, think of it from the perspective of the "customer". I'm a user of docker compose. Compose stopped doing what it used to do, due to changes "under the hood" of compose. which I am only vaguely aware of.

Should I then look into the internal implementation of compose to determine where precisely the issue is located and raise separate tickets on the internal compose components in the hope that fixes will eventually show up in my usage of compose?

What if there are API changes between compose and the internal components required? Should I negotiate those API changes with the two teams?

Or should the compose team talk to the internal component teams in order to decide what API changes are required to surface the correct behaviour to the compose customer?

ndeloof commented 7 months ago

@k1w1m8 this is not specific to compose, you have the same issue with a plain docker build... This is really internal to the builder

k1w1m8 commented 7 months ago

@ndeloof I take your point, perhaps I should have created this ticket on moby rather than buildx?

However this issue is not purely about functionality from docker eg docker build and Dockerfiles. There is also the image: tags inside the compose files. For the new compose to be consistent with older compose, we'd like control of the downloaded base images to be consistent no matter if done via compose image tags or via building Dockerfiles.

Which component of docker is responsible for compose image tags?

ndeloof commented 7 months ago

There is also the image: tags inside the compose files. (...) we'd like control of the downloaded base images to be consistent no matter if done via compose image tags or via building Dockerfiles.

I don't get it here, the base image is only declared in your Dockerfile using FROM instruction. tags set in compose is for the image being built (same as docker build --tag), can you please clarify your point ? Maybe a minimal example would help

k1w1m8 commented 7 months ago

@ndeloof

Ah yes, I can see where I went wrong explaining that. Sigh.

A compose project can contain a direct image reference in the yaml, or it can contain FROM in a Dockerfile to be built.

https://docs.docker.com/compose/compose-file/compose-file-v3/#image

Example:

services:
  mycontainer:
    image: myrepo.com/myimage:1.2.3

So therefore any changes to the way that downloaded images are stored would need to be consistent between

image download during the Dockerfile build process, via buildx?
compose image download during the image: processing, via compose?

Or are you saying that buildx does the downloading for both these scenarios?

ndeloof commented 7 months ago

image download during the Dockerfile build process, via buildkit, are store in the builder - when using the one backed by docker engine, this is actually the same as docker engine image store compose image download during the image: are the same as docker pull, and are stored in docker engine image store.

the fact your Dockerfile's base image is tagged and get available in docker image store is a side-effect of using engine builder, if you run build with an alternate builder this won't be the case.

If you want to enforce base image to be pulled, you can use this compose hack:

services:
  base-image:
    scale: 0
    image: myrepo.com/myimage:1.2.3

service won't actually run a container, but image will be pulled into engine image store

k1w1m8 commented 7 months ago

@ndeloof thanks for the explanation. Nice compose hack too. Would that result in the base image downloaded twice? That wouldn't be great if you have a slow download link.

This shows exactly my point. The way that base images are allocated to images stores is inconsistent with the current default settings of compose (with buildx as the default). Sometimes they are in the docker store, sometimes in the buildx store, depending on which part of compose configuration they were referenced in (yaml or Dockerfile). Whereas before they were all in teh same image store.

I'm happy for compose + buildx to keep the built images in its image store. But I believe compose (regardless of buildx or not) should always allow for base images to be consistently available in the docker image store, as the buildx image store gives very little control over the images inside it. This would be consistent with the past and allows full control of the download process in case it is expensive to obtain the base images.

ndeloof commented 7 months ago

@k1w1m8 again, this is not related to compose: you will have the exact same issue using a plain docker build .... Base image will be pulled by builder, and may be available in docker engine image store depending which builder you selected.

k1w1m8 commented 7 months ago

@ndeloof So you are saying that the two scenarios we have in compose are equivalent to the following docker commands

docker build (compose Dockerfile in build folder) where the base image will be stored in buildx image store
docker run (compose image: element) where the base image will be stored in the docker image store.

Somehow it seems more obvious in compose as the inconsistency can be seen as part of a single command, whereas in docker you need separate commands to show the same inconsistency.

Does that mean then that I should create a third ticket for this issue, in moby?

ndeloof commented 7 months ago

Compose indeed is basically just docker build then docker run. And docker build is implemented by buildx, so this is the right place to report this issue. But as I tried to explain you, your assumption regarding base image storage is only true when using the engine builder, you can't rely on this without accepting some limitations

k1w1m8 commented 7 months ago

@ndeloof I understand that the engine builder and buildx are different and use different default image stores.

If buildx was a non default option, its fine if it doesn't work exactly as the old builder. But when buildx becomes the default, then we have to make sure it is working in a way which is backwards compatible with the old builder.

This is a case where the new default behaviour is not backwards compatible with the old default behavior wrt base images, which affects those who want to want to manage their base images explicitly, and those who use docker system prune.

To make docker compose, docker build and docker run store base images consistently, we want base images downloaded by buildx to appear in the docker image store.

@tonistiigi @crazy-max what do you think?

docker / buildx