moby / buildkit

concurrent, cache-efficient, and Dockerfile-agnostic builder toolkit
https://github.com/moby/moby/issues/34227
Apache License 2.0

[feature] no cache export for specific copy statements/layers #1817

Open Patrick-Remy opened 3 years ago

Patrick-Remy commented 3 years ago

It would be great if there were an option to prevent caching (or exporting) specific instructions in the Dockerfile. We use BuildKit in GitLab CI and import/export the cache to a folder which is automatically extracted by CI.

buildctl build \
          --frontend=dockerfile.v0 \
          --local context=$CI_PROJECT_DIR \
          --local dockerfile=$CI_PROJECT_DIR \
          --opt filename=Dockerfile \
          --opt target=$TARGET \
          --output type=image,name=$CI_REGISTRY_IMAGE/$DESTINATION:$TAG,push=true \
          --export-cache mode=max,type=local,dest=.buildkit.cache  \
          --import-cache type=local,src=.buildkit.cache

Like probably many others, our Dockerfile ends each target with steps like these, which copy built artefacts, node_modules, or something similar from previous stages. Here they copy the PHP sources and the Composer-installed dependencies.

# Copy built dependencies and src
COPY --chown=33:33 src ./src
COPY --from=php-dependencies --chown=33:33 /app/vendor ./vendor

Here src contains a huge number of files and required resources. Because this layer changes constantly and accounts for 80% of the Docker image's size, exporting it to the cache folder makes no sense in our context. Downloading and uploading the .buildkit.cache folder takes half of the total job time.

I therefore suggest allowing an instruction to be marked with a flag such as --no-cache or --export-cache=no to prevent it from being exported. Even in the few cases where the src folder does not change between builds, running the COPY instruction is faster than downloading and uploading the cached layer.

It should also be considered whether this flag needs an opt-in from the buildctl client.

For reference, the build tool kaniko removed caching of COPY layers by default (https://github.com/GoogleContainerTools/kaniko/pull/1408#issuecomment-705713668). I think the performance impact will vary widely depending on the specific Dockerfile, and not every COPY layer is a good candidate for skipping. But there are many cases where caching these layers can increase CI time massively, as also outlined in that comment. Hence my request for some kind of --export-cache=no flag.

Patrick-Remy commented 3 years ago

I solved this by splitting the build into two commands: the first builds only the cacheable layers and exports the cache without pushing, and the second only imports the cache and pushes the full image:

Dockerfile:

FROM composer AS php-dependencies
RUN ...

FROM php AS base
RUN ...

FROM base AS production
# Copy built dependencies and src
COPY --chown=33:33 src ./src
COPY --from=php-dependencies --chown=33:33 /app/vendor ./vendor

CI script:

# `.buildkit.cache` is the cache folder extracted by GitLab CI.
# Rename cache folder and use it only for import, to prevent a growing cache
# as buildkit's export cache merges the cache (see https://github.com/moby/buildkit/issues/1850)
[ ! -d .buildkit.cache ] || mv .buildkit.cache .buildkit-import.cache

# Build up to the intermediate target (base) without pushing the image, so that
# only the 'cacheable' layers are exported. The cache is imported from the
# renamed folder to prevent a growing cache.
buildctl build \
          --frontend=dockerfile.v0 \
          --local context=$CI_PROJECT_DIR \
          --local dockerfile=$CI_PROJECT_DIR \
          --opt filename=Dockerfile \
          --opt target=base \
          --output type=image,name=$CI_REGISTRY_IMAGE:$TAG,push=false \
          --import-cache type=local,src=.buildkit-import.cache/ \
          --export-cache mode=max,type=local,dest=.buildkit.cache/

# Build the final target (production) without exporting the cache, but push the image
buildctl build \
          --frontend=dockerfile.v0 \
          --local context=$CI_PROJECT_DIR \
          --local dockerfile=$CI_PROJECT_DIR \
          --opt filename=Dockerfile \
          --opt target=production \
          --output type=image,name=$CI_REGISTRY_IMAGE/$DESTINATION:$TAG,push=true \
          --import-cache type=local,src=.buildkit.cache/

This has significantly increased the build speed, as the cache size has been reduced by 80%!
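
For anyone reusing this, the cache-rotation step at the top of the CI script could be wrapped in a small shell function. This is only a sketch: the name rotate_cache and its two optional arguments are made up for illustration, and removing a leftover import folder is an extra assumption that the one-liner above does not make (without it, mv would nest the cache inside an import folder that already exists).

```shell
#!/bin/sh
# Hypothetical helper (not part of buildctl): turn the cache folder restored
# by CI into an import-only folder, so the next export starts from scratch.
rotate_cache() {
    cache_dir="${1:-.buildkit.cache}"          # folder restored by CI
    import_dir="${2:-.buildkit-import.cache}"  # folder used only for --import-cache

    # Assumption: a leftover import folder from an earlier run is stale.
    rm -rf "$import_dir"

    # On the very first run CI has not restored a cache yet, so there is
    # nothing to rotate.
    if [ -d "$cache_dir" ]; then
        mv "$cache_dir" "$import_dir"
    fi
}
```

Calling rotate_cache with no arguments before the first buildctl command uses the same folder names as in the script above.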