aws / aws-codebuild-docker-images

Official AWS CodeBuild repository for managed Docker images http://docs.aws.amazon.com/codebuild/latest/userguide/build-env-ref.html

docker layer caching support #26

Closed deevus closed 5 years ago

deevus commented 6 years ago

How can I enable caching of docker layers between builds? Caching is one of the biggest benefits of multi-stage builds but in CodeBuild it runs every step every time.

josephvusich commented 6 years ago

CodeBuild does not currently have native support for Docker layer caching, though we are aware of the use case for it.

In the meantime, have you tried using docker save and docker load with CodeBuild's file cache functionality? You may be able to bring down your build time for complex layers in this way.
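For anyone wanting to try that combination, here is a minimal buildspec sketch of the idea (the image name and cache directory are placeholders, not anything CodeBuild-specific):

version: 0.2
phases:
  pre_build:
    commands:
      # Restore previously saved layers if a cached tarball exists
      - if [ -f /root/docker-cache/myapp.tar ]; then docker load -i /root/docker-cache/myapp.tar; fi
  build:
    commands:
      - docker build -t myapp:latest .
  post_build:
    commands:
      # Save the built image so the next build can load its layers
      - mkdir -p /root/docker-cache
      - docker save -o /root/docker-cache/myapp.tar myapp:latest
cache:
  paths:
    - /root/docker-cache/**/*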

deevus commented 6 years ago

I'll try that and see how it goes. Thanks

ewolfe commented 6 years ago

@deevus I'm curious if you had any luck with docker save and docker load -- I'm also in need of caching docker layers between builds.

deevus commented 6 years ago

@ewolfe I tried it and it doesn't seem very effective. The time it takes to load/save negates any benefits at least in my case.

Here are the scripts I have written. If you try them out perhaps you can find a way to make them work in your favour. It saves all the generated images after the build since the docker host is empty (I assume) when the build runs. Apologies for the lack of comments.

cache-load.sh

#!/bin/bash
set -e

echo 'Loading docker cache...'
mkdir -p "$IMAGE_CACHE_PATH"

# List all cached image tarballs (quote the glob so find, not the shell, expands it)
DOCKER_IMAGES_CACHE=$(mktemp)
find "$IMAGE_CACHE_PATH" -name '*.tar.gz' > "$DOCKER_IMAGES_CACHE"

# Load each tarball; remove any archive that fails to load so it cannot poison the cache
while read -r file; do
    echo "$file"
    if ! docker load -i "$file"; then
        echo "Error loading docker image $file. Removing..."
        rm "$file"
    fi
done < "$DOCKER_IMAGES_CACHE"

rm "$DOCKER_IMAGES_CACHE"

cache-save.sh

#!/bin/bash
set -e

mkdir -p "$IMAGE_CACHE_PATH"

# Image IDs currently present on the Docker host (strip the "sha256:" prefix)
DOCKER_IMAGES_NEW=$(mktemp)
docker images -q --no-trunc | awk -F':' '{print $2}' | sort > "$DOCKER_IMAGES_NEW"

# Image IDs already present in the file cache (tarball basename == image ID)
DOCKER_IMAGES_CACHE=$(mktemp)
find "$IMAGE_CACHE_PATH" -name '*.tar.gz' -printf '%f\n' | awk -F. '{print $1}' | sort > "$DOCKER_IMAGES_CACHE"

# Cached IDs no longer on the host get deleted; host IDs not yet cached get saved
DOCKER_IMAGES_DELETE=$(mktemp)
DOCKER_IMAGES_SAVE=$(mktemp)
comm -13 "$DOCKER_IMAGES_NEW" "$DOCKER_IMAGES_CACHE" > "$DOCKER_IMAGES_DELETE"
comm -23 "$DOCKER_IMAGES_NEW" "$DOCKER_IMAGES_CACHE" > "$DOCKER_IMAGES_SAVE"

if [ "$(< "$DOCKER_IMAGES_DELETE" wc -l)" -gt 0 ]; then
    echo Deleting docker images that are no longer current
    < "$DOCKER_IMAGES_DELETE" xargs -I % sh -c "echo Deleting extraneous image % && rm $IMAGE_CACHE_PATH/%.tar.gz"
    echo
fi

if [ "$(< "$DOCKER_IMAGES_SAVE" wc -l)" -gt 0 ]; then
    echo Saving missing images to docker cache
    < "$DOCKER_IMAGES_SAVE" xargs -I % sh -c "echo Saving image % && docker save % | gzip -c > '$IMAGE_CACHE_PATH/%.tar.gz'"
    echo
fi

rm "$DOCKER_IMAGES_NEW" "$DOCKER_IMAGES_CACHE" "$DOCKER_IMAGES_DELETE" "$DOCKER_IMAGES_SAVE"

I don't know if I'm missing something here but a couple of the intermediate containers still build from scratch anyway, which is what I was originally trying to avoid.

EDIT: You need to set IMAGE_CACHE_PATH in your buildspec to a path inside the directory you're caching to S3. Mine is set to /root/.docker-cache/$IMAGE_REPO_NAME

mastef commented 6 years ago

Do you run these on PRE_BUILD and POST_BUILD respectively?

deevus commented 6 years ago

Yes that's correct
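For reference, here is a minimal buildspec sketch of that wiring (the cache path, script names, and image name are assumptions; adjust to your project):

version: 0.2
phases:
  pre_build:
    commands:
      # Point the scripts at a directory that is included in the cache paths below
      - export IMAGE_CACHE_PATH=/root/.docker-cache/$IMAGE_REPO_NAME
      - bash ./cache-load.sh
  build:
    commands:
      - docker build -t $IMAGE_REPO_NAME:latest .
  post_build:
    commands:
      - export IMAGE_CACHE_PATH=/root/.docker-cache/$IMAGE_REPO_NAME
      - bash ./cache-save.sh
cache:
  paths:
    - /root/.docker-cache/**/*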

AdrieanKhisbe commented 6 years ago

I also need to cache the layers between builds, but my attempts have so far been unsuccessful (I tried caching /var/lib/docker/overlay; huge fail).


I have a question for you, @jvusich. When you say:

CodeBuild does not currently have native support for Docker layer caching, though we are aware of the use case for it.

Do you mean that this is somewhere on the codebuild roadmap? :)

awsnitin commented 6 years ago

Do you mean that this is somewhere on the codebuild roadmap? :)

As @jvusich mentioned, we are aware that this use case is something we do not support natively (without the custom workarounds mentioned in this issue). We've also heard about this use case from other customers. Our roadmaps are decided primarily based on customer requests and use cases, so effectively it's on our radar, but we cannot comment on when it will be addressed.

jabalsad commented 6 years ago

Thanks @deevus! Your handy shell script made this easier. I had to make a small change to properly cache all the layers in the build: docker save needs to be given the full list of images that were used to build the final image, which you can get by running docker history.
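Roughly, the idea looks like this (just a sketch; the image name and cache path are placeholders, see the gists below for the real scripts):

#!/bin/bash
set -e

IMAGE=my_app:latest

# docker history lists the image itself plus every intermediate layer image ID;
# passing them all to docker save keeps the whole build cache, not just the top image.
LAYERS=$(docker history -q "$IMAGE" | grep -v '<missing>')
docker save $LAYERS | gzip -c > "$IMAGE_CACHE_PATH/my_app.tar.gz"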

I put my version of your script in these gists here: cache-save.sh https://gist.github.com/jabalsad/fc72503243afa76e0fbbd1349a0e4023 cache-load.sh https://gist.github.com/jabalsad/52914db52eaa01002125da9c7f85bdc8

I also put these lines in my buildspec.yml:

env:
  variables:
    IMAGE_CACHE_ROOT: /root/.docker-cache 
cache:
  paths:
    - /root/.docker-cache/*

deevus commented 6 years ago

@jabalsad How well does it work with that change? If my original script was missing a bunch of layers I would expect a decent improvement with your changes

jabalsad commented 6 years ago

It speeds up the actual build significantly, however the docker save and docker load commands now take a while to run, negating any speed improvements (possibly even slowing down the build).

The real reason I'm looking for the caching functionality is actually so that noop changes don't create a new image in ECR unnecessarily.

monken commented 6 years ago

this worked for me:

version: 0.2
phases:
  pre_build:
    commands:
      - docker version
      - $(aws ecr get-login --no-include-email)
      - docker pull $CONTAINER_REPOSITORY_URL:$REF_NAME || true
  build:
    commands:
      - docker build --cache-from $CONTAINER_REPOSITORY_URL:$REF_NAME --tag $CONTAINER_REPOSITORY_URL:$REF_NAME --tag $CONTAINER_REPOSITORY_URL:ref-$CODEBUILD_RESOLVED_SOURCE_VERSION .
  post_build:
    commands:
      - docker push $CONTAINER_REPOSITORY_URL

oba11 commented 6 years ago

@monken This worked perfectly for me; my build time dropped from 23 mins 1 sec to 1 min 8 secs. You are a real lifesaver.

adriansecretsource commented 6 years ago

@monken After a couple of hours of trying, I found your solution just perfect. I managed to decrease build time by 60%!

healarconr commented 6 years ago

I think that the method mentioned by @monken (pull and cache-from) does not work with multi-stage builds because the pulled image does not have all the stages, but only the last one.

tvb commented 6 years ago

@monken I can't get this to work. It keeps invalidating the cache, even at the base image 😢

kiernan commented 6 years ago

@tvb double-check that your docker pull command is working. I noticed mine was failing because it lacked the required IAM permissions, yet the build continued since the pull command is in the pre-build stage.

dileep-p commented 6 years ago

@monken Worked Perfectly.. 👍

Rathgore commented 6 years ago

For anyone looking for a simple solution that works with multi-stage builds, I made a pretty simple build script that was able to meet my requirements.

The basic process is this:

  1. Attempt to pull a builder image tagged 'builder' from ECR.
  2. If the image does not exist, create it using Docker's --target option to only build the intermediate build stage from the multi-stage Dockerfile.
  3. If the image exists, pull it and rebuild it to bring in any changes. This rebuild step uses Docker's --cache-from option to cache from itself. This will pick up any changes to the build stage but will nicely result in a no-op if nothing has changed.
  4. Pull the latest tag of our target image from ECR, if it exists. This will cache the final build stage of our image. We now have Docker's cache primed with the build and final stages of our Dockerfile.
  5. Build our target Docker image using --cache-from twice to use the cache from both the build and final stages of the multi-stage build. If no code has changed, the entire build is a no-op from Docker's perspective.
  6. Push the target image to ECR.
  7. Push the newly created/updated builder image to ECR for use in subsequent builds.

Here's the basic script:

#!/usr/bin/env bash

readonly repo=${AWS_ACCOUNT_ID}.dkr.ecr.${AWS_REGION}.amazonaws.com/${REPO_NAME}

# Attempt to pull existing builder image
if docker pull ${repo}:builder; then
    # Update builder image
    docker build -t ${repo}:builder --target build --cache-from ${repo}:builder .
else
    # Create new builder image
    docker build -t ${repo}:builder --target build .
fi

# Attempt to pull latest target image
docker pull ${repo}:latest || true

# Build and push target image
docker build -t ${repo}:latest --cache-from ${repo}:builder --cache-from ${repo}:latest .
docker push ${repo}:latest

# Push builder image
docker push ${repo}:builder

The conditional logic is mainly there for clarity. The entire caching pattern can be simplified as:

docker pull ${repo}:builder || true
docker build -t ${repo}:builder --target build --cache-from ${repo}:builder .

docker pull ${repo}:latest || true
docker build -t ${repo}:latest --cache-from ${repo}:builder --cache-from ${repo}:latest .

docker push ${repo}:latest
docker push ${repo}:builder

This solution has been working well for me, and dramatically reduced our build times. It works with multiple concurrent builds and if any of the --cache-from options point to images that don't exist, Docker will just continue the build and won't use the cache for that run. The overhead of the caching system is very low and is pretty simple to understand. Thanks to @monken and others for inspiration.

judahb commented 6 years ago

@monken Just curious: is $REF_NAME pulling only that specific version/tag of the container from ECR, or are you pulling all the intermediate containers? If you're pulling all intermediates, can you describe how that works? It sounds good, but I'm not sure it will work for my use case.

monken commented 6 years ago

@judahb it's only pulling the last container image (including all layers). It's more likely that you will have matching layers with the latest image than any image that's older. So there is probably not a huge gain in pulling all previous images.

ngalaiko commented 6 years ago

I also pull HEAD~1 of the current branch so there is something cached for the first build of a new HEAD commit.

docker pull ${repo}:$(git rev-parse HEAD) || docker pull ${repo}:$(git rev-parse HEAD~1) || true

SpainTrain commented 6 years ago

2101012

dylanribb commented 6 years ago

For those using docker-compose, here's how I've solved the problem by adapting @monken's answer:

docker-compose.yml:

version:"3.4"
  services:
    app:
      image: my_app:latest
      build:
        cache_from: "${APP_REPOSITORY_URI:-my_app}:${DEPLOY_STAGE:-latest}" # Defaults are set so that if we build locally it will use the local cache
      environment:
        - APP_REPOSITORY_URI
        - DEPLOY_STAGE
# ... more docker-compose setup

buildspec.yml:

version: 0.2
env:
  variables:
    APP_IMAGE_REPO_NAME: "my_app"
phases:
  pre_build:
    commands:
      - $(aws ecr get-login --no-include-email --region $AWS_DEFAULT_REGION)
      - APP_REPOSITORY_URI=`aws ssm get-parameters --names "/$DEPLOY_STAGE/ECR/app-uri" --with-decryption --region $AWS_DEFAULT_REGION --output text --query Parameters[0].Value`
      - docker pull $APP_REPOSITORY_URI:$DEPLOY_STAGE || true
      - docker pull $WEB_REPOSITORY_URI:$DEPLOY_STAGE || true
  build:
    commands:
      - echo Build started on `date`
      - docker-compose build
      - echo Build completed on `date`
  post_build:
    commands:
      - docker tag $APP_IMAGE_REPO_NAME:latest $APP_REPOSITORY_URI:$DEPLOY_STAGE
      - docker tag $WEB_IMAGE_REPO_NAME:latest $WEB_REPOSITORY_URI:$DEPLOY_STAGE
      - echo Images tagged
      - docker push $APP_REPOSITORY_URI:$DEPLOY_STAGE
      - docker push $WEB_REPOSITORY_URI:$DEPLOY_STAGE
      - echo Images pushed to repositories

In our case DEPLOY_STAGE is defined in the CodeBuild project as an environment variable so that we can setup different project stages fairly easily but still do somewhat dynamic builds.

eino-makitalo commented 5 years ago

Yes indeed... waiting over 30 minutes at "Step 1/16 : FROM python:3.6" (3.6: Pulling from library/python).

This makes CodeBuild quite unusable...

subinataws commented 5 years ago

We have added support for local caching, including Docker layers. Documentation: https://docs.aws.amazon.com/codebuild/latest/userguide/build-caching.html#caching-local
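Note that local caching is a project-level setting rather than something in the buildspec. As a sketch, the relevant part of a CloudFormation project definition could look like this (resource name is a placeholder; the build environment also needs privileged mode for Docker builds):

MyBuildProject:
  Type: AWS::CodeBuild::Project
  Properties:
    # ... Source, Artifacts, Environment (with PrivilegedMode: true), ServiceRole ...
    Cache:
      Type: LOCAL
      Modes:
        - LOCAL_DOCKER_LAYER_CACHE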

dev-walker commented 5 years ago

Sorry, it's not clear how the new feature works.

Can anyone explain? How can I make CodeBuild use the cache even if I only run changes through the pipeline once a week?

Rathgore commented 5 years ago

My limited experience so far is that it caches for a very short period of time. If I start repeat builds within a few minutes of each other it seems to use the cache most of the time, but any longer than that and it usually doesn’t hit the cache at all.

josephvusich commented 5 years ago

@dev-walker As explained in the documentation, the build cache is kept on local storage for maximum performance. When there are long intervals with no builds running, that underlying storage may be retired. Your first few builds in the morning may need to re-warm the cache if you ran very few builds overnight.

gabrielenosso commented 5 years ago

Can someone explain how to use @monken script step by step?

Should I use it when building my image, or in the buildspec on CodeBuild? (Which would mean my image needs Docker inside?)

Sorry, I am far from being a DevOps guy..

I am using a custom Windows image, pushed to AWS ECR.

deleugpn commented 5 years ago

@josephvusich or @subinataws can we get documentation about local cache busting? Is it possible? Are there any plans to make it possible? Any recommended workaround?

I know I would love a longer cache lifetime, as I mentioned previously, but on very rare occasions I need to bust the cached Docker layers to get the build passing.

subinataws commented 5 years ago

@deleugpn - you can override the cache setting (or, for that matter, any project setting) when invoking StartBuild. So if you don't want to use the local cache for a particular build run, select "no cache". Replied to you on Slack as well.
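For example, something along these lines should work from the CLI (project name is a placeholder):

# One-off build that skips the project's configured cache
aws codebuild start-build \
    --project-name my-project \
    --cache-override type=NO_CACHE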

blorenz commented 5 years ago

I'm hosting my custom-built CodeBuild image on ECR and running off a base image hosted on ECR. The slow network transfer rate is what led me to caching. Local caching still seems to be a black box: it's great when it hits, but when it misses, it's unclear why exactly it missed. I have tried to get more insight into the PROVISIONING stage to no avail. What exactly is going on with caching in terms of expiry and what it caches? Could we have more visibility into the cache?

StarpTech commented 4 years ago

Today I tried to implement --cache-from in our pipelines, but it failed (commands were re-executed) because the local docker cache is not working. I can execute the same commands locally and benefit from the cache.

We use aws/codebuild/standard:3.0 and have enabled privileged mode plus the local docker cache. Any ideas?

dimitry commented 4 years ago

@gabrielenosso did you ever figure out how to adapt this to a Windows image?

0xMH commented 4 years ago

@StarpTech Did you ever reach a solution?

dlozano commented 4 years ago

Docker 18.09 added BuildKit's automatic pulling of cache images (blog post), and docker-compose 1.25.1 allows using BuildKit and therefore automatic pulls from the cache.

Is there any plan to update from 1.24 to 1.25.x? It seems that would help with this issue.
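Once the runtime is new enough, BuildKit is opted into through environment variables; a minimal sketch:

# Enable BuildKit for docker build, and route docker-compose builds through the docker CLI
export DOCKER_BUILDKIT=1
export COMPOSE_DOCKER_CLI_BUILD=1   # requires docker-compose 1.25.1+
docker-compose build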

themsaid commented 4 years ago

What's the point of the local cache if 15 minutes is the maximum lifespan? Serious question!

ivanmartos commented 4 years ago

Is there any recommended way to cache docker layers for longer than the lifespan of a CodeBuild host? My use case: for every build I pull the mysql image from the public Docker registry to execute some local integration tests. How can I cache this image between builds (let's say my build frequency is twice a day)?

n1ru4l commented 3 years ago

Docker registries now allow layer caching. Unfortunately, this is not supported by ECR yet. https://github.com/aws/containers-roadmap/issues/876

kylemacfarlane commented 3 years ago

None of the examples in this thread worked for me but I managed to find a solution.

The key is to enable the BuildKit inline cache.

A cut down example:

phases:
  install:
    runtime-versions:
      docker: 19
  build:
    commands:
      - echo Logging in to Amazon ECR...
      - $(aws ecr get-login --no-include-email --region $AWS_REGION)

      - echo Pulling image for cache...
      - docker pull $REPO_URI:$IMAGE_TAG || true

      - echo Building image...
      - DOCKER_BUILDKIT=1 docker build --build-arg BUILDKIT_INLINE_CACHE=1 --tag $REPO_URI:$IMAGE_TAG --cache-from $REPO_URI:$IMAGE_TAG .

The first time this runs it will still rebuild entirely but it will place an inline cache in the finished image. Then every future build will use the inline cache and be much faster.

It doesn't seem to like some parallel stages, i.e. where you build in one stage and only copy out the final binary. It makes sense that those stages wouldn't be stored in the inline cache, so you either need to store intermediary images as shown earlier in the thread or make your Dockerfile more linear.

tamsky commented 3 years ago

Given that you're using --cache-from $REPO_URI:$IMAGE_TAG, the following commands are not required (unless you want to force a complete pull for every build.)

      - echo Pulling image for cache...
      - docker pull $REPO_URI:$IMAGE_TAG || true

BuildKit now knows how to pull cache layers on demand.

More info:

kylemacfarlane commented 3 years ago

Given that you're using --cache-from $REPO_URI:$IMAGE_TAG, the following commands are not required (unless you want to force a complete pull for every build.)

      - echo Pulling image for cache...
      - docker pull $REPO_URI:$IMAGE_TAG || true

I found that removing this doesn't work. After ~20 mins, once the CodeBuild cache expires, it does a full rebuild again and there's no indication of anything getting pulled down.

I guess you could grep docker images and only pull if needed, but you run the risk of slowing down frequent builds with an increasingly stale cache.
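That check could be as simple as something like this (still subject to the staleness caveat above):

# Only pull the cache image if it is not already present locally
if ! docker image inspect "$REPO_URI:$IMAGE_TAG" > /dev/null 2>&1; then
    docker pull "$REPO_URI:$IMAGE_TAG" || true
fi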

RonaldTechnative commented 1 year ago

Based on this bug report and this source, I came up with the solution below.

version: 0.2

env:
  shell: bash
phases:
  install:
    commands:
      - mkdir -vp ~/.docker/cli-plugins
      - |
        [[ ! -e ~/.docker/cli-plugins/docker-buildx ]] \
        && curl -L -o ~/.docker/cli-plugins/docker-buildx https://github.com/docker/buildx/releases/download/v0.10.3/buildx-v0.10.3.linux-amd64
      - chmod a+rx ~/.docker/cli-plugins/docker-buildx
  build:
    commands:
      - |
          docker buildx build \
             --progress=plain \
             -f ./Dockerfile \
             --push -t image:hash \
             --cache-to type=s3,region=${AWS_REGION},bucket=${DOCKER_CACHE_BUCKET_NAME},mode=max,name="frontend" \
             --cache-from type=s3,region=${AWS_REGION},bucket=${DOCKER_CACHE_BUCKET_NAME},name="frontend" \
             .

cache:
  paths:
    - /root/.docker/cli-plugins

jared-christensen commented 3 months ago

I documented what worked for me here, very close to @RonaldTechnative's solution: https://jareddesign.medium.com/my-experience-getting-docker-images-to-cache-in-aws-codebuild-using-ecr-974c5d9428ec

Basically, I did not have to install buildx; I just had to create a builder: docker buildx create --use --name mybuilder --driver docker-container
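As a rough sketch of that pattern against ECR (the repository URI and cache tag are placeholders, and depending on the registry you may need additional cache-to options):

# One-time: create a builder backed by the docker-container driver
docker buildx create --use --name mybuilder --driver docker-container

# Build, pushing the image and a registry cache manifest to the same repository
docker buildx build \
    --push -t "$REPO_URI:latest" \
    --cache-to "type=registry,ref=$REPO_URI:buildcache,mode=max" \
    --cache-from "type=registry,ref=$REPO_URI:buildcache" \
    .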

Janosch commented 1 month ago

Since an example for the registry cache storage backend is missing, I am providing mine. It is similar to the inline storage backend used by @Rathgore's script, but it supports multi-stage builds natively, without having to manage the cache of each stage manually. Key features are:

Installing buildx was not necessary; I assume it already exists in the build environment aws/codebuild/amazonlinux2-x86_64-standard:5.0, which I use. The catch is that it only works with the default docker driver if that driver is configured to use the containerd image store. I did not manage to get this working, therefore I created a builder using the docker-container driver, which also supports the registry cache storage backend.