moby / buildkit

concurrent, cache-efficient, and Dockerfile-agnostic builder toolkit
https://github.com/moby/moby/issues/34227
Apache License 2.0

Cache pushed from one machine can not be reused on another machine #1251

Open kindermax opened 4 years ago

kindermax commented 4 years ago

Hi. Currently, I am trying to set up an infrastructure for reusing caches for local (dev machine) builds.

Here are the steps:

  1. Build an image using docker buildx build in CI (GitLab CI) - Docker Engine version 19.03
  2. Push the image from CI to the GitLab Registry:
docker buildx build . \
      -t registry.my-company-gitlab.com/app:latest \
      -f ./docker/Dockerfile.$IMAGE \
      --cache-from=type=registry,ref=registry.my-company-gitlab.com/app:latest \
      --cache-to=type=registry,ref=registry.my-company-gitlab.com/app:latest,mode=max \
      --push 

The Dockerfiles use multi-stage builds.

  3. Then I run a build on my laptop, expecting to reuse the cache from the registry image (built and pushed in steps 1 and 2):
docker buildx build -t my-local-image -f Dockerfile.app --cache-from=type=registry,ref=registry.my-company-gitlab.com/app:latest --load .

Before any builds, I run

docker builder prune
docker system prune -a
  4. But the new build does not reuse any of the cache and starts building from scratch.
tonistiigi commented 4 years ago

Please post a reproducible testcase that we could run to figure this out.

One thing I noticed is that you are using the same reference for the cache and your image, so unless this is just a mistake in the report, this is definitely wrong. Also, until recently the GitHub registry didn't support manifest lists, which are used in the external cache format and multi-platform images, so I'm surprised you made it that far.

kindermax commented 4 years ago

Hi, thank you for the quick response.

I've set up a test project to reproduce the cache issue.

https://gitlab.com/kindritskiy.m/docker-cache-issue

It is not a multi-stage build. It is GitLab CI (not the GitHub one).

you are using the same reference for cache and your image

I am not sure I understand what you mean. Do you mean these two lines

--cache-from=type=registry,ref=registry.gitlab.com/kindritskiy.m/docker-cache-issue:latest \
      --cache-to=type=registry,ref=registry.gitlab.com/kindritskiy.m/docker-cache-issue:latest,mode=max \

or this

-t registry.gitlab.com/kindritskiy.m/docker-cache-issue:latest
  1. The job that builds and pushes the image: https://gitlab.com/kindritskiy.m/docker-cache-issue/-/jobs/348012533
  2. I am trying to build an image locally with the cache
    docker buildx build -t my-local-image -f Dockerfile --cache-from=type=registry,ref=registry.gitlab.com/kindritskiy.m/docker-cache-issue:latest --load .
    [+] Building 28.2s (11/11) FINISHED                                                                     
    => importing cache manifest from registry.gitlab.com/kindritskiy.m/docker-cache-issue:latest      3.4s
    => [internal] load .dockerignore                                                                  0.1s
    => => transferring context: 2B                                                                    0.0s
    => [internal] load build definition from Dockerfile                                               0.1s
    => => transferring dockerfile: 127B                                                               0.0s
    => [internal] load metadata for docker.io/library/node:12-alpine                                  1.7s
    => [internal] load build context                                                                  0.1s
    => => transferring context: 462B                                                                  0.0s
    => [1/5] FROM docker.io/library/node:12-alpine@sha256:50ce309a948aaad30ee876fb07ccf35b62833b27de  0.0s
    => CACHED [2/5] WORKDIR /app                                                                     20.8s
    => => pulling sha256:e7c96db7181be991f19a9fb6975cdbbd73c65f4a2681348e63a141a2192a5f10             2.7s
    => => pulling sha256:7b373bfb6ac5ffc0602bd1033666f9138fc137d68f67c4d4726cd6ac0c6bc9ac            18.9s
    => => pulling sha256:fd38342e03373b2ef6a3c7f354adf8d165454628b1de8c8329e70b3ef6325710             1.3s
    => => pulling sha256:5269cc77d334b68485968973d8e40df41f5d712a0cc66580bf9d925e5da6b923             1.4s
    => => pulling sha256:eafd5e4882da62f3d333317fc4fa6322755c84c6cb978dc3b962180455760a98             0.6s
    => [3/5] COPY package.json .                                                                      0.9s
    => [4/5] COPY requirements.txt .                                                                  0.1s
    => [5/5] RUN npm i                                                                                2.8s
    => exporting to image                                                                             0.2s
    => => exporting layers                                                                            0.1s
    => => writing image sha256:32fbc089de4f0b350a0362b5866345f019032d2b69e8e29092f9ce1639f02c34       0.0s 

On CI I use the https://github.com/docker/buildx/releases/download/v0.3.1/buildx-v0.3.1.linux-amd64 binary. On my laptop I have the bundled version of buildx, which is the same version as on CI:

docker buildx version
github.com/docker/buildx v0.3.1 6db68d029599c6710a32aa7adcba8e5a344795a7 
tonistiigi commented 4 years ago

Do you mean these two lines

You can't use the same ref for --cache-* and -t because they are different objects and are pushed separately. (The exception would be inline cache, which does not push a separate object but appends metadata to the image config.)
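In practice that means something like the following sketch (the registry and tag names here are placeholders, not taken from this issue):

```shell
# Separate references: the image and the cache are distinct registry objects,
# so push them under different tags.
docker buildx build . \
  -t registry.example.com/app:latest \
  --cache-to=type=registry,ref=registry.example.com/app:buildcache,mode=max \
  --cache-from=type=registry,ref=registry.example.com/app:buildcache \
  --push

# Alternative: inline cache embeds the cache metadata in the image config,
# so the image ref and cache ref may be the same (note: inline cache does
# not support mode=max).
docker buildx build . \
  -t registry.example.com/app:latest \
  --cache-to=type=inline \
  --cache-from=type=registry,ref=registry.example.com/app:latest \
  --push
```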

tonistiigi commented 4 years ago

I can reproduce your case, but if I push my own image with `docker buildx build --cache-to type=registry,ref=tonistiigi/build-cache-issue:latest,mode=max .` and then run `docker buildx build --cache-from tonistiigi/build-cache-issue:latest .`, it seems to work fine for the whole build.

kindermax commented 4 years ago

Thank you, I now know that the tag and the cache can't be the same. I didn't find any docs about that, so it's good to know it from you. I will try to fix my builds. But one question is still bothering me: if I build and push an image from my laptop to CI, then I can reuse all the cache. It really works. But what about building on one machine (let's say a CI server) and using that cache on another machine? Will it work?

tonistiigi commented 4 years ago

@kindritskyiMax Yes, it should work if you switch machines. Did you try whether my cache works for you? So are you saying that --cache-to works for you (even when switching machines) but does not work if you export from a specific machine?

kindermax commented 4 years ago

If the image was built and pushed from my machine and used only on my machine, then the cache works. But when using different machines, the cache does not work.

kindermax commented 4 years ago

I will try your image when I am at my laptop and will let you know. Thank you.

tonistiigi commented 4 years ago

@kindritskyiMax And that is even if you remove your local cache to make sure the remote cache is used? In https://gitlab.com/kindritskiy.m/docker-cache-issue/-/jobs/348012533 I also see that the remote cache was used, so is it that cache exported in CI only works when imported on CI machines (with a fresh state)?

Or does it have something to do with exporting a cache that has already been imported, as happens in https://gitlab.com/kindritskiy.m/docker-cache-issue/-/jobs/348012533 ?

kindermax commented 4 years ago

And that is even if you remove your local cache to make sure remote cache is used?

Yes, I am removing everything to be sure I use only remote cache.

I have noticed as well that CI servers can reuse cache built on CI servers, but my laptop cannot reuse that cache.

kindermax commented 4 years ago

@tonistiigi I've tried building an image from your cache and it didn't work.

docker buildx build . -t my-from-cache -f Dockerfile --cache-from tonistiigi/build-cache-issue:latest
[+] Building 10.9s (11/11) FINISHED                                                                     
 => importing cache manifest from tonistiigi/build-cache-issue:latest                              2.7s
 => [internal] load .dockerignore                                                                  0.1s
 => => transferring context: 2B                                                                    0.0s
 => [internal] load build definition from Dockerfile                                               0.1s
 => => transferring dockerfile: 127B                                                               0.0s
 => [internal] load metadata for docker.io/library/node:12-alpine                                  2.0s
 => [1/5] FROM docker.io/library/node:12-alpine@sha256:50ce309a948aaad30ee876fb07ccf35b62833b27de  0.0s
 => [internal] load build context                                                                  0.1s
 => => transferring context: 462B                                                                  0.0s
 => CACHED [2/5] WORKDIR /app                                                                      4.8s
 => => pulling sha256:e7c96db7181be991f19a9fb6975cdbbd73c65f4a2681348e63a141a2192a5f10             0.9s
 => => pulling sha256:7b373bfb6ac5ffc0602bd1033666f9138fc137d68f67c4d4726cd6ac0c6bc9ac             3.4s
 => => pulling sha256:fd38342e03373b2ef6a3c7f354adf8d165454628b1de8c8329e70b3ef6325710             0.8s
 => => pulling sha256:5269cc77d334b68485968973d8e40df41f5d712a0cc66580bf9d925e5da6b923             0.2s
 => => pulling sha256:841db4349b44a385cef1a07272aa772b5c7dd4ec84638734f9b9190a82d07125             0.3s
 => [3/5] COPY package.json .                                                                      0.8s
 => [4/5] COPY requirements.txt .                                                                  0.1s
 => [5/5] RUN npm i                                                                                2.2s
 => exporting to image                                                                             0.1s
 => => exporting layers                                                                            0.1s
 => => writing image sha256:3f4e7189d3ac2931b0d9b72bd8b802c1833cfdd33fc8e6c8d57c458faeea3c0d       0.0s 
kindermax commented 4 years ago

I've updated the build command on CI: https://gitlab.com/kindritskiy.m/docker-cache-issue/blob/master/build_cache.sh#L31

Here is the new job - https://gitlab.evo.dev/m.kindritskiy/docker-cache/-/jobs/3160699. Still, I cannot reuse the cache on my laptop when building with this command (buildx is committed to the repo):

./docker-buildx build -f Dockerfile --cache-from registry.gitlab.com/kindritskiy.m/docker-cache-issue:latest --load .
#####
[+] Building 11.8s (11/11) FINISHED
 => importing cache manifest from registry.gitlab.com/kindritskiy.m/docker-cache-issue:latest      3.5s
 => [internal] load .dockerignore                                                                  0.1s
 => => transferring context: 2B                                                                    0.0s
 => [internal] load build definition from Dockerfile                                               0.1s
 => => transferring dockerfile: 127B                                                               0.0s
 => [internal] load metadata for docker.io/library/node:12-alpine                                  1.3s
 => [1/5] FROM docker.io/library/node:12-alpine@sha256:50ce309a948aaad30ee876fb07ccf35b62833b27de4d3a818295982efb04ce6b  0.0s
 => [internal] load build context                                                                  0.1s
 => => transferring context: 462B                                                                  0.0s
 => CACHED [2/5] WORKDIR /app                                                                      4.9s
 => => pulling sha256:e7c96db7181be991f19a9fb6975cdbbd73c65f4a2681348e63a141a2192a5f10             1.1s
 => => pulling sha256:7b373bfb6ac5ffc0602bd1033666f9138fc137d68f67c4d4726cd6ac0c6bc9ac             3.0s
 => => pulling sha256:fd38342e03373b2ef6a3c7f354adf8d165454628b1de8c8329e70b3ef6325710             1.4s
 => => pulling sha256:5269cc77d334b68485968973d8e40df41f5d712a0cc66580bf9d925e5da6b923             0.5s
 => => pulling sha256:1cb335f2ff4b7a9042d20028e41736ea2ccd5e5961d308cd1f780ff65b2c1956             0.8s
 => [3/5] COPY package.json .                                                                      1.2s
 => [4/5] COPY requirements.txt .                                                                  0.1s
 => [5/5] RUN npm i                                                                                1.8s
 => exporting to image                                                                             0.1s
 => => exporting layers                                                                            0.1s
 => => writing image sha256:2fd26692991a805e65ace943a3f03ac029f6111e4dc62329cb6c3a5b5006f729
kindermax commented 4 years ago

@tonistiigi Hi. Do you have any ideas/suggestions on how to fix this reusing cache thing between machines?

satazor commented 4 years ago

I'm having the exact same problem in a CI/CD pipeline on GitLab when mixing shared runners and private runners. The cache is only reused if a private runner uses --cache-from pointing to an image made by a private runner as well. The same applies to shared runners: the cache is only reused if a shared runner uses --cache-from pointing to an image made by a shared runner.

@kindritskyiMax did you manage to get around this?

kindermax commented 4 years ago

I didn't solve this exact problem with the built-in cache reuse, but I found another solution. It's not perfect, but at least it reduces the amount of building on the dev machine.

Workaround.

I've created a Python script that calculates a checksum of the files the Dockerfile depends on. For example, for a Python project, I have requirements.txt and requirements-dev.txt.

There is a file called checksum-deps.txt that contains two lines: requirements.txt and requirements-dev.txt.

Using make, each time I run a command like make run, it calculates the checksum, and if I do not have a local image tagged with that checksum, it tries to pull myimage:<checksum>.

Also, I have a periodic job in GitLab (running every 30 minutes) that calculates the checksums and pushes images, so I almost always have an image for the new checksum when some of the dependency files have changed.

There is quite a lot of machinery, but checksum misses are almost zero.
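The checksum part of this workaround could look roughly like the sketch below. The checksum-deps.txt file name follows the description above; the exact hashing scheme, the myimage name, and the 12-character tag are assumptions for illustration.

```python
# Sketch of the checksum script described above (hashing scheme is an assumption).
import hashlib
from pathlib import Path

def deps_checksum(deps_list: str = "checksum-deps.txt") -> str:
    """Hash the names and contents of the dependency files listed in deps_list."""
    h = hashlib.sha256()
    for line in Path(deps_list).read_text().splitlines():
        name = line.strip()
        if not name:
            continue
        h.update(name.encode())            # the file name participates in the hash
        h.update(Path(name).read_bytes())  # as does the file's content
    return h.hexdigest()[:12]

# A make target would then try `docker pull myimage:<checksum>` when no local
# image with that tag exists, falling back to a full build on a cache miss.
```

The periodic CI job would compute the same checksum and push `myimage:<checksum>` so that pulls on dev machines almost always hit.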