GoogleContainerTools / kaniko

Build Container Images In Kubernetes
Apache License 2.0

Issue with Docker Image build using Kaniko and push to AWS ECR in GitHub Actions Workflow on AWS EKS #3027

Open aswin-vijayan opened 8 months ago

aswin-vijayan commented 8 months ago

Expected behavior

I expected the Docker image build and push to complete successfully with Kaniko in GitHub Actions, just as it does when the EC2InstanceProfileForImageBuilderECRContainerBuilds policy is attached directly to an EC2 instance. The role and policy should let Kaniko authenticate with AWS ECR and push the built image, whether they are supplied through an EC2 instance role or through a pod identity/service account in EKS.

Actual behavior

When the role and policy are supplied through pod identity or a Kubernetes service account, building and pushing the Docker image with Kaniko in GitHub Actions fails with the following error message:

error checking push permissions -- make sure you entered the correct tag name, and that you are authenticated correctly, and try again: checking push permission for "xxxxxxxxxx.dkr.ecr.us-west-2.amazonaws.com/pet-clinic:6.0.0": Post "https://xxxxxxxxxx.dkr.ecr.us-west-2.amazonaws.com/v2/pet-clinic/blobs/uploads/": EOF
Error: Error: failed to run script step: command terminated with non-zero exit code: error executing command [sh -e /__w/_temp/c9f63bc0-d46f-11ee-a3fc-b524ab2f1fed.sh], exit code 1
Error: Process completed with exit code 1.
Error: Executing the custom container implementation failed. Please contact your self hosted runner administrator.

Workflow YAML file

The YAML file I used in GitHub Actions to build and push the image to ECR with Kaniko is given below:

name: Build and Push

on:
  push:
    branches:
      - main

jobs:
  build:
    runs-on: arc-runner-set
    container:
      image: gcr.io/kaniko-project/executor:debug
    permissions:
      contents: read
      packages: write

    steps:
      - name: Build and push container test
        run: |
          /kaniko/executor --dockerfile="/Dockerfile.test" \
            --context="git://github.com/xxxxxxxxxxxxxxxxx" \
            --destination="xxxxxxxxx.dkr.ecr.us-west-2.amazonaws.com/pet-clinic:6.0.0"

Request for Assistance

ScottCruzen0 commented 7 months ago

I've been attempting to set up the same thing. In your case, it looks like you don't have a /kaniko/.docker/config.json in place that contains { "credsStore": "ecr-login" }.

Once that's in place, you can try dumping the contents of /root/.ecr/log/ecr-login.log after running executor to see what docker-credential-ecr-login is doing.

    steps:
      - name: Build and Push Image to ECR with Kaniko
        run: |
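          # Don't abort on failure, so the credential helper log is still printed.
          set +e
          echo '{ "credsStore": "ecr-login" }' > /kaniko/.docker/config.json

          /kaniko/executor <your kaniko args here>

          # Inspect what docker-credential-ecr-login did during the push.
          cat /root/.ecr/log/ecr-login.log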
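
If a global credsStore is broader than you want, a per-registry credHelpers entry is an alternative that Docker's config format also supports. The following is only a sketch: the function name write_ecr_cred_helper_config is hypothetical, the registry host is whatever your ECR registry is, and /kaniko/.docker is assumed as the default config directory because that is the path used above.

```shell
#!/bin/sh
# Sketch: write a Docker config that uses ecr-login for ONE registry only.
# write_ecr_cred_helper_config is a hypothetical helper, not part of Kaniko.
write_ecr_cred_helper_config() {
  registry="$1"                        # e.g. <account>.dkr.ecr.<region>.amazonaws.com
  config_dir="${2:-/kaniko/.docker}"   # assumed default, matching the path above
  mkdir -p "$config_dir" || return 1
  # credHelpers maps a registry host to the credential helper suffix.
  printf '{ "credHelpers": { "%s": "ecr-login" } }\n' "$registry" \
    > "$config_dir/config.json"
}
```

In a workflow step this would be called once before /kaniko/executor, for example write_ecr_cred_helper_config "xxxxxxxxx.dkr.ecr.us-west-2.amazonaws.com".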

To verify that your pod has the right role, you could also try:

    echo -n https://your_ecr_repo | docker-credential-ecr-login get

But note that this will print an ECR token into the GH runner logs, so you probably shouldn't do this.
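
A check that avoids printing any token is to look only at whether the credential settings reached the job container. This is a rough sketch, assuming the role is delivered either through IRSA (which sets AWS_ROLE_ARN and AWS_WEB_IDENTITY_TOKEN_FILE) or through EKS Pod Identity (which sets AWS_CONTAINER_CREDENTIALS_FULL_URI). The function name check_aws_identity_env is hypothetical, and it only confirms that the settings are present, not that the role has the right permissions.

```shell
#!/bin/sh
# Sketch: report which AWS credential sources are visible in this container
# without printing any secret. check_aws_identity_env is a hypothetical name.
check_aws_identity_env() {
  found=1
  # IRSA: the role ARN plus a projected service account token file.
  if [ -n "${AWS_ROLE_ARN:-}" ] && [ -n "${AWS_WEB_IDENTITY_TOKEN_FILE:-}" ]; then
    if [ -r "$AWS_WEB_IDENTITY_TOKEN_FILE" ]; then
      echo "IRSA: role ARN set and token file is readable"
      found=0
    else
      echo "IRSA: token file is missing or unreadable: $AWS_WEB_IDENTITY_TOKEN_FILE"
    fi
  fi
  # EKS Pod Identity: credentials are fetched from a node-local endpoint.
  if [ -n "${AWS_CONTAINER_CREDENTIALS_FULL_URI:-}" ]; then
    echo "Pod Identity: container credentials URI is set"
    found=0
  fi
  [ "$found" -eq 0 ] || echo "No usable IRSA or Pod Identity credential source found"
  return "$found"
}
```

Running check_aws_identity_env as the first command of the build step shows whether the executor has anything to authenticate with before it attempts the push.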

I get the same result: error checking push permissions -- make sure you entered the correct tag name, and that you are authenticated correctly, and try again: checking push permission for "url": Post "url": EOF

I'm not sure if AWS changed something, but using executor:v1.19.2-debug has suddenly started working (and it provides a better error message). I'll go back to the latest after this build completes. In my case, part of the problem was that cross-account ECR permissions weren't set up correctly.

pgpx commented 6 months ago

I had to give my IAM role extra permissions to ecr:GetAuthorizationToken for resources: *, in addition to permission to push/pull to selected ECR repos (I didn't have to do that for Kaniko 1.9.1). I didn't need to define a /kaniko/.docker/config.json or set AWS_SDK_LOAD_CONFIG either.

"Statement": [
    {
        "Action": "ecr:GetAuthorizationToken",
        "Effect": "Allow",
        "Resource": "*",
        "Sid": "EcrGetAuthForKaniko"
    }
],
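
For context, ecr:GetAuthorizationToken does not support resource-level permissions, which is why it needs Resource "*" while the push actions can stay scoped to a repository. A sketch of how the two might sit together in one policy follows. The function name render_kaniko_ecr_policy, the EcrPushToRepo Sid, and the repository ARN argument are illustrative assumptions, and pull actions would need adding if base images or cache layers also come from ECR.

```shell
#!/bin/sh
# Sketch: print an IAM policy that pairs the account-wide token permission
# with push permissions limited to one repository.
# render_kaniko_ecr_policy and the EcrPushToRepo Sid are hypothetical names.
render_kaniko_ecr_policy() {
  repo_arn="$1"   # e.g. arn:aws:ecr:<region>:<account>:repository/<name>
  cat <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "EcrGetAuthForKaniko",
      "Effect": "Allow",
      "Action": "ecr:GetAuthorizationToken",
      "Resource": "*"
    },
    {
      "Sid": "EcrPushToRepo",
      "Effect": "Allow",
      "Action": [
        "ecr:BatchCheckLayerAvailability",
        "ecr:InitiateLayerUpload",
        "ecr:UploadLayerPart",
        "ecr:CompleteLayerUpload",
        "ecr:PutImage"
      ],
      "Resource": "$repo_arn"
    }
  ]
}
EOF
}
```

Redirecting the output to a file, for example render_kaniko_ecr_policy "$REPO_ARN" > policy.json, gives a document that can be reviewed before it is attached to the role.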

gnadaban commented 1 week ago

Just went through this ordeal. Assuming you are running your Actions runner scale sets in DinD mode, you will need to mount the service account token volume into the container job:

name: Build and Push

on:
  push:
    branches:
      - main

jobs:
  build:
    runs-on: arc-runner-set
    container:
      image: gcr.io/kaniko-project/executor:debug

      # In DinD mode the container is created by the runner Pod, so
      # the service account volume is not automatically mounted by EKS.
      volumes:
        - /var/run/secrets/eks.amazonaws.com/serviceaccount/token:/var/run/secrets/eks.amazonaws.com/serviceaccount/token

    permissions:
      contents: read
      packages: write

    steps:
      - name: Build and push container test
        run: |
          /kaniko/executor --dockerfile="/Dockerfile.test" \
            --context="git://github.com/xxxxxxxxxxxxxxxxx" \
            --destination="xxxxxxxxx.dkr.ecr.us-west-2.amazonaws.com/pet-clinic:6.0.0"