docker / build-push-action

GitHub Action to build and push Docker images with Buildx
https://github.com/marketplace/actions/build-and-push-docker-images
Apache License 2.0

GIT context ignores changes in build secrets and use GHA build cache from other build #1050

Closed piotrekkr closed 6 months ago

piotrekkr commented 8 months ago

Description

I'm using build secrets with a Git context and the GHA cache in CI to build and push multiple images for multiple environments. I noticed that when I build the same app version for staging (with the staging build secret) and then for production (with the prod build secret), the action ignores the prod secret and reuses the GHA cache from the staging build, so the production image is actually the staging build.

Expected behaviour

The action should not reuse the cache for steps whose build secrets differ; it should rebuild the whole step and cache the new result.

Actual behaviour

Even though the build secret contents are different for production, the action ignores this build secret and uses the existing GHA cache from the staging build. In the provided YAML I created two jobs. The first one builds the image with the secret TEST=123 and uses the GHA cache. The second one depends on the first and builds the image from the same code but with a different secret, TEST=666. In the last step of each job I run a container from the image and display the secret value that was written inside it. As you can see in the linked workflow run logs, the container from the second job displays the same secret value as the first job. This means that at build time in the second job, the secret was effectively ignored.

Here is my Dockerfile used

FROM ubuntu

WORKDIR /app

COPY . .

RUN --mount=type=secret,id=dotenv-file,dst=.env \
   cp .env .env.production

Pull request I'm using for test: https://github.com/piotrekkr/build-action-secret-issue/pull/1

Repository URL

https://github.com/piotrekkr/build-action-secret-issue

Workflow run URL

https://github.com/piotrekkr/build-action-secret-issue/actions/runs/7776815181?pr=1

YAML workflow

name: Test GIT Context

on:
  push:
    branches:
      - cache-issue

permissions:
  contents: read
  packages: write

jobs:
  build-push-1:
    name: Build and Push 1
    runs-on: ubuntu-latest
    env:
      DOCKER_TAG: ghcr.io/${{ github.repository }}:ci-${{ github.run_id }}-1
    steps:
      - name: Set up Docker Buildx
        id: buildx
        uses: docker/setup-buildx-action@v3

      - name: Create dotenv
        run: echo "TEST=123" > .env

      - name: Build and push
        uses: docker/build-push-action@v5
        with:
          push: false
          load: true
          builder: ${{ steps.buildx.outputs.name }}
          tags: ${{ env.DOCKER_TAG }}
          file: Dockerfile
          provenance: false
          cache-from: type=gha,scope=secret-issue
          cache-to: type=gha,mode=max,scope=secret-issue
          secret-files: |
            "dotenv-file=.env"

      - name: Check dotenv
        run: docker run --rm ${{ env.DOCKER_TAG }} cat .env.production

  build-push-2:
    name: Build and Push 2
    needs: [build-push-1]
    runs-on: ubuntu-latest
    env:
      DOCKER_TAG: ghcr.io/${{ github.repository }}:ci-${{ github.run_id }}-2
    steps:
      - name: Set up Docker Buildx
        id: buildx
        uses: docker/setup-buildx-action@v3

      - name: Create different dotenv
        run: echo "TEST=666" > .env

      - name: Build and push
        uses: docker/build-push-action@v5
        with:
          push: false
          load: true
          builder: ${{ steps.buildx.outputs.name }}
          tags: ${{ env.DOCKER_TAG }}
          file: Dockerfile
          provenance: false
          cache-from: type=gha,scope=secret-issue
          cache-to: type=gha,mode=max,scope=secret-issue
          secret-files: |
            "dotenv-file=.env"

      - name: Check dotenv
        run: docker run --rm ${{ env.DOCKER_TAG }} cat .env.production

Additional info

I'm not really sure whether this is an issue with this action. Maybe it is related to Docker or Buildx. If so, let me know and I'll create issues there too. Thank you.

imRentable commented 7 months ago

I can confirm this. Currently, it does not allow us to rotate credentials that we inject as build secrets without any further adjustments to the workflow being used.

piotrekkr commented 7 months ago

I also tried changing the path of the dotenv file, like this:

          secret-files: |
            "dotenv-file=.env.production"

but it was ignored as well and the cache was used.

edygyan commented 7 months ago

Yes, and even if the secret comes from GHA secrets, the action uses cached secrets from other builds.

tonistiigi commented 7 months ago

Build secret content does not participate in the cache checksum calculation. This protects against leaking the value. If you want to invalidate the cache when a secret changes, add an additional build arg (with a non-secret value) that you update at the same time as the secret value. The properties of the secret, e.g. the mount path, do participate in the cache checksum.

Also, in GHA cache you can set the scope if you want to make sure that some builds do not share the same cache with others.
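
To illustrate the scope suggestion, here is a minimal sketch of two build steps that keep a separate GHA cache per environment. The scope names `secret-issue-staging` and `secret-issue-production` are hypothetical and only the cache-related inputs are shown; the rest of each step is assumed to match the workflow from the issue description.

```yaml
      # Staging build: reads and writes only the staging cache scope
      - name: Build staging image
        uses: docker/build-push-action@v5
        with:
          cache-from: type=gha,scope=secret-issue-staging
          cache-to: type=gha,mode=max,scope=secret-issue-staging
          secret-files: |
            "dotenv-file=.env"

      # Production build: a different scope, so staging layers are never reused
      - name: Build production image
        uses: docker/build-push-action@v5
        with:
          cache-from: type=gha,scope=secret-issue-production
          cache-to: type=gha,mode=max,scope=secret-issue-production
          secret-files: |
            "dotenv-file=.env"
```

The trade-off is that builds for different environments no longer share any cached layers, and a changed secret within one environment still does not invalidate that environment's cache on its own.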

piotrekkr commented 7 months ago

@tonistiigi Thanks for this answer. I did not know about this. Do you know if this is mentioned anywhere in the documentation of the action or of Docker? It would be nice to have some mention of it.

I tried this trick with a build arg and it works. I updated the Dockerfile:

FROM ubuntu

WORKDIR /app

COPY . .

ARG BUILD_ENV

RUN --mount=type=secret,id=dotenv-file,dst=.env \
   cp .env .env.production

and changed the build jobs:

      - name: Create dotenv
        run: echo "TEST=123" > .env

      - name: Build and push
        uses: docker/build-push-action@v5
        with:
          push: false
          load: true
          builder: ${{ steps.buildx.outputs.name }}
          tags: ${{ env.DOCKER_TAG }}
          build-args: |
            BUILD_ENV=build-1
          file: Dockerfile
          provenance: false
          cache-from: type=gha,scope=secret-issue
          cache-to: type=gha,mode=max,scope=secret-issue
          secret-files: |
            "dotenv-file=.env"

But there is still the irritating problem of changing the build arg value whenever the contents of the secret file change. I could use a hash of the file's contents, but since build args remain in the build history, that brings back the original security issue by exposing a checksum of the secret file. I will probably need to encrypt this hash to make it secure or, if possible, use the file's last-modified timestamp or something similar.
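
One possible way to get a build arg value that changes with the secret, without exposing a plain checksum, is a keyed digest (HMAC) of the secret file. The sketch below is an illustration under assumptions, not something proposed in this thread: `CACHE_BUST_KEY` is a hypothetical repository secret, the step id `secret-version` is made up, and `openssl` is assumed to be available on the runner. Whether a keyed digest in the build history is acceptable depends on your threat model.

```yaml
      - name: Derive a cache-busting value from the secret file
        id: secret-version
        env:
          # Hypothetical secret used only as the HMAC key
          CACHE_BUST_KEY: ${{ secrets.CACHE_BUST_KEY }}
        run: |
          # HMAC-SHA256 of .env; the hex digest is the last field of openssl's output
          digest=$(openssl dgst -sha256 -hmac "$CACHE_BUST_KEY" < .env | awk '{print $NF}')
          echo "value=$digest" >> "$GITHUB_OUTPUT"

      - name: Build and push
        uses: docker/build-push-action@v5
        with:
          push: false
          load: true
          tags: ${{ env.DOCKER_TAG }}
          file: Dockerfile
          # BUILD_ENV changes whenever .env (or the key) changes
          build-args: |
            BUILD_ENV=${{ steps.secret-version.outputs.value }}
          cache-from: type=gha,scope=secret-issue
          cache-to: type=gha,mode=max,scope=secret-issue
          secret-files: |
            "dotenv-file=.env"
```

As in the Dockerfile above, `ARG BUILD_ENV` has to be declared before the `RUN --mount=type=secret` step for the changed value to affect that step.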

Also, based on observations in my private repo, where I first encountered this, it seems to happen only with a Git context. With a path context, the cache is not used at all when build secrets are involved. Are my observations correct?

Anyway, thanks for clearing this up.

tonistiigi commented 7 months ago

Also, based on observations in my private repo, where I first encountered this, it seems to happen only with a Git context. With a path context, the cache is not used at all when build secrets are involved. Are my observations correct?

No, the context type does not change how secrets are cached. There might be some other reason why the cache gets invalidated when you use local paths that doesn't appear with a Git checkout (e.g. a nondeterministic checkout of the .git directory).

edygyan commented 7 months ago

Build secret content does not participate in the cache checksum calculation. This protects against leaking the value. If you want to invalidate the cache when a secret changes, add an additional build arg (with a non-secret value) that you update at the same time as the secret value. The properties of the secret, e.g. the mount path, do participate in the cache checksum.

Also, in GHA cache you can set the scope if you want to make sure that some builds do not share the same cache with others.

Thanks for your reply, but I was not talking about the GHA cache in particular. The scenario was:

  1. Clone repo X to create a new repo Y.
  2. Repo X had all the GHA vars and secrets.
  3. Repo Y did not have any GHA vars or secrets.
  4. In the GitHub Actions workflow of repo Y, the ECR login action succeeded even though no secrets were defined for this repo in GHA secrets, so I am not sure where it picked up the credentials. It did fail while pushing the image, so it probably picked up some other credential that was valid but not applicable to the ECR repo I was pushing to.

So this is something which I can't understand unless there is a bug.

Regards

piotrekkr commented 7 months ago

@tonistiigi You are right again. I did some testing with dockerignore set like this:

*
!README.md

and it used the cache in both jobs. It seems some other action created or updated files that were not excluded by the dockerignore file. Thanks for clarifying this. Now I just need to deal with generating a dynamic build arg value when the secret changes.

@edygyan AFAIK the cache is stored at the repo level and scoped to branches. Repositories do not share caches with other repositories. To test what is going on, you should probably do something similar to what I did in my repo (https://github.com/piotrekkr/build-action-secret-issue/pull/1):

  1. simplify the jobs and make two of them, one depending on the other
  2. remove the cache currently in the repo to be sure that you start clean
  3. run the workflows and, at the end, run the built images and display data that should not be cached, to see whether the cache was used
  4. you can also try ignoring all files by default in dockerignore and then including only the directories and files you need for the build (to exclude possible random files generated by other actions), or you can check out the code into a subdirectory and use this subdir as the context
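
The subdirectory variant from the last step could look roughly like the sketch below. It assumes the repository is checked out with `actions/checkout` into a directory named `src` (a hypothetical name) and that the Dockerfile sits at the repository root; everything else follows the workflow from the issue description.

```yaml
      - name: Check out code into a subdirectory
        uses: actions/checkout@v4
        with:
          path: src

      - name: Create dotenv
        # Written to the workspace root, outside the build context
        run: echo "TEST=123" > .env

      - name: Build and push
        uses: docker/build-push-action@v5
        with:
          # Only the checked-out sources are part of the build context
          context: src
          file: src/Dockerfile
          push: false
          load: true
          tags: ${{ env.DOCKER_TAG }}
          cache-from: type=gha,scope=secret-issue
          cache-to: type=gha,mode=max,scope=secret-issue
          secret-files: |
            "dotenv-file=.env"
```

With this layout, files that other steps drop into the workspace root (including `.env` itself) are not copied by `COPY . .`, so they cannot change the cache checksum of that step.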