GoogleContainerTools / kaniko

Build Container Images In Kubernetes
Apache License 2.0

Kaniko can't auth to AWS using IRSA #2526

Open ReeSilva opened 1 year ago

ReeSilva commented 1 year ago

Actual behavior When running kaniko within a Gitlab job in a k8s pod Gitlab runner, even with the right service account properly annotated, kaniko is not able to authenticate to AWS ECR.

Expected behavior When kaniko runs from a Gitlab Runner pod, it should still be able to authenticate to ECR using IRSA.

To Reproduce Steps to reproduce the behavior:

  1. Deploy a Gitlab Runner on an AWS EKS Cluster with the following config:
    
    imagePullPolicy: Always
    gitlabUrl: https://gitlab.com/
    unregisterRunners: true
    concurrent: 20
    checkInterval: 30
    logLevel: warn
    rbac:
      create: true
      serviceAccountName: gitlab-runner
      clusterWideAccess: false
      serviceAccountAnnotations:
        eks.amazonaws.com/role-arn: [REDACTED]

    metrics:
      enabled: true

    runners:
      tags: [REDACTED]
      secret: [REDACTED]
      privileged: true
      outputLimit: 10240
      config: |
        [[runners]]
          environment = ["FF_GITLAB_REGISTRY_HELPER_IMAGE=1", "AWS_DEFAULT_REGION=eu-west-1"]
          [runners.kubernetes]
            image = "alpine:latest"
            cpu_request = "400m"
            memory_request = "1024Mi"
            service_cpu_request = "200m"
            service_memory_request = "256Mi"
            request_concurrency = 10
            pull_policy = "if-not-present"
            service_account = "gitlab-runner"
            service_account_overwrite_allowed = "^gitlab-runner$"
            [runners.kubernetes.node_selector]
              [REDACTED]
          [runners.cache]
            Type = "s3"
            Path = "cache"
            Shared = true
            [runners.cache.s3]
              BucketName = "bucket-cache"
              BucketLocation = "eu-west-1"

    ## Configure resource requests and limits
    ## ref: http://kubernetes.io/docs/user-guide/compute-resources/
    resources:
      limits:
        memory: 512Mi
        cpu: 400m
      requests:
        memory: 256Mi
        cpu: 200m

2. Configure a job on Gitlab CI with the following configs:

    .kaniko_publish_to_ecr:
      needs: ["set_image_tags"]
      dependencies:

Additional Information

    ENV TOMCAT_WEBAPP_HOME /usr/local/tomcat/webapps
    ARG CONTEXT=ROOT
    ARG PROJECT_NAME=

    COPY ./target/$PROJECT_NAME.war $TOMCAT_WEBAPP_HOME/$CONTEXT.war
    COPY ./target/$PROJECT_NAME $TOMCAT_WEBAPP_HOME/$CONTEXT


 - Build Context
    The `target` folder is created in a previous Gitlab CI step and passed to this job through artifacts; I checked that its files exist.
 - Kaniko Image (fully qualified with digest): `gcr.io/kaniko-project/executor:debug`:`sha256:964426c9205d644e2964869d1d311a05dc9f301594300d3732ea26b5733e94fc`

I'm trying to push the image to an ECR repository, with authentication through IRSA. The pod for the gitlab executor has the right web identity token properly mounted to the container.
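
One way to double-check the "token properly mounted" part from inside the job container is to look at the two environment variables that the EKS pod identity webhook normally injects for IRSA. The following is only a minimal sketch: the helper name `check_irsa_env` is hypothetical, and it assumes a Python interpreter is available in the container you run it in (the kaniko image itself does not ship one).

```python
import os
from typing import Callable, List, Mapping

# Environment variables normally injected into IRSA-enabled pods on EKS.
IRSA_VARS = ("AWS_ROLE_ARN", "AWS_WEB_IDENTITY_TOKEN_FILE")


def check_irsa_env(
    env: Mapping[str, str] = os.environ,
    file_exists: Callable[[str], bool] = os.path.isfile,
) -> List[str]:
    """Return a list of problems found; an empty list means the basics look fine.

    `check_irsa_env` is a hypothetical helper, not part of kaniko or the AWS SDK.
    """
    # Report every expected variable that is missing or empty.
    problems = [f"{name} is not set" for name in IRSA_VARS if not env.get(name)]
    # If a token path is configured, make sure the file is really there.
    token_file = env.get("AWS_WEB_IDENTITY_TOKEN_FILE")
    if token_file and not file_exists(token_file):
        problems.append(f"token file {token_file} does not exist")
    return problems


if __name__ == "__main__":
    for problem in check_irsa_env():
        print(problem)
```

Running it both in the `kubectl run` pod and in the Gitlab CI job pod and comparing the results may help narrow down where the two environments differ.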

Funny thing is that if I run the pod with `kubectl run` using the same service account, it works like a charm, but it fails when running from Gitlab CI.

Here is a short version of the pod description: https://gitlab.com/-/snippets/2546250

 **Triage Notes for the Maintainers**
 <!-- 🎉🎉🎉 Thank you for opening an issue !!! 🎉🎉🎉
We are doing our best to get to this. Please help us prioritize your issue by filling in the section below -->

 | **Description** | **Yes/No** |
 |----------------|---------------|
 | Please check if this a new feature you are proposing        | <ul><li>- [ ] </li></ul>|
 | Please check if the build works in docker but not in kaniko | <ul><li>- [x] </li></ul>| 
 | Please check if this error is seen when you use `--cache` flag | <ul><li>- [ ] </li></ul>|
 | Please check if your dockerfile is a multistage dockerfile | <ul><li>- [ ] </li></ul>| 

bobbywatson3 commented 1 year ago

We are also seeing failures when trying to use kaniko with IRSA. Builds work with KIAM and with AWS access keys, but not with IRSA.

mifonpe commented 1 year ago

Same for us, any updates?

cgill27 commented 1 year ago

I'm seeing the same issue, but it only started for me after upgrading the Gitlab runner to the current version, 16.4.1. If I roll back to, for example, version 15.8.3, kaniko pushes to ECR with IRSA without any issue.

balonik commented 6 months ago

I am facing a similar issue, but in my case Kaniko is not assuming the IRSA role at all; it is using the EC2 instance IAM role. I see this with Pods created by GitLab Runner, but for troubleshooting purposes I created my own Pod and the issue is the same, so in my case it is not limited to GitLab Runner job pods.

My Pod spec:

apiVersion: v1
kind: ConfigMap
metadata:
  name: kaniko
  namespace: gitlab-runner
data:
  Dockerfile: |
    FROM debian:latest
    RUN rm -rf /aws /usr/local/aws-cli
    RUN apt-get update && apt-get install -y --no-install-recommends \
      less \
      ca-certificates \
      curl \
      unzip \
      && curl -sS "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o "awscliv2.zip" \
      && unzip awscliv2.zip \
      && ./aws/install
    RUN /usr/local/bin/aws --version
    RUN AWS_PAGER="" /usr/local/bin/aws sts get-caller-identity
---
apiVersion: v1
kind: Pod
metadata:
  labels:
    run: kaniko
  name: kaniko
  namespace: gitlab-runner
spec:
  serviceAccount: gitlab-runner
  serviceAccountName: gitlab-runner
  containers:
    - name: kaniko
      image: gcr.io/kaniko-project/executor:latest
      args:
        - "--dockerfile=/workspace/Dockerfile"
        - "--no-push"
        - "--cache=false"
      env:
      - name: AWS_SDK_LOAD_CONFIG
        value: "true"
      - name: AWS_EC2_METADATA_DISABLED
        value: "true"
      volumeMounts:
        - name: dockerfile
          mountPath: /workspace
          readOnly: true
  volumes:
    - name: dockerfile
      configMap:
        name: kaniko

Running this Pod results in:

INFO[0035] Running: [/bin/sh -c AWS_PAGER="" /usr/local/bin/aws sts get-caller-identity] 
{
    "UserId": "[REDACTED]",
    "Account": "[REDACTED]",
    "Arn": "arn:aws:sts::[REDACTED]:assumed-role/dev-karpenter-eks-node-group/i-0b8337aa48bfed98a"
}

instead of expected:

{
    "UserId": "[REDACTED]",
    "Account": "[REDACTED]",
    "Arn": "arn:aws:sts::[REDACTED]:assumed-role/dev-gitlab-runner/botocore-session-1613826698"
}

Am I missing something in my spec that Kaniko needs in order to assume the IRSA role?

Kaniko version: 1.22.0
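
The difference between the two `get-caller-identity` outputs above comes down to the role name embedded in the `Arn` field. As a rough sketch (the function names are hypothetical), that role name can be extracted and compared like this, assuming the ARN has the `assumed-role/<role-name>/<session-name>` shape shown in both outputs:

```python
def assumed_role_name(arn: str) -> str:
    """Extract the role name from an STS assumed-role ARN.

    Expected shape: arn:aws:sts::<account>:assumed-role/<role-name>/<session-name>
    """
    parts = arn.split(":", 5)
    if len(parts) != 6:
        raise ValueError(f"not an ARN: {arn}")
    # The last ARN field holds "<kind>/<role-name>/<session-name>".
    kind, _, rest = parts[5].partition("/")
    if kind != "assumed-role" or not rest:
        raise ValueError(f"not an assumed-role ARN: {arn}")
    return rest.split("/", 1)[0]


def uses_expected_role(arn: str, expected_role: str) -> bool:
    """True when the caller identity belongs to the expected (IRSA) role."""
    return assumed_role_name(arn) == expected_role
```

For example, with the role names from this comment, `uses_expected_role(arn, "dev-gitlab-runner")` is false for the node-group ARN and true for the expected one.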

balonik commented 6 months ago

I figured it out: I ended up adding ARG and ENV to my Dockerfile

ARG AWS_ROLE_ARN
ARG AWS_WEB_IDENTITY_TOKEN_FILE
ENV AWS_ROLE_ARN=$AWS_ROLE_ARN
ENV AWS_WEB_IDENTITY_TOKEN_FILE=$AWS_WEB_IDENTITY_TOKEN_FILE

and adding --build-arg to the container's args

- '--build-arg="AWS_ROLE_ARN=$AWS_ROLE_ARN"'
- '--build-arg="AWS_WEB_IDENTITY_TOKEN_FILE=$AWS_WEB_IDENTITY_TOKEN_FILE"'
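
If the kaniko arguments are assembled by a wrapper script rather than written by hand, the same two `--build-arg` flags can be derived from the environment. This is a small sketch under that assumption; `irsa_build_args` is a hypothetical helper, and only the `--build-arg` flag and the two variable names come from the workaround above.

```python
import os
from typing import List, Mapping

# The two variables forwarded to the build in the workaround above.
IRSA_VARS = ("AWS_ROLE_ARN", "AWS_WEB_IDENTITY_TOKEN_FILE")


def irsa_build_args(env: Mapping[str, str] = os.environ) -> List[str]:
    """Build kaniko `--build-arg` flags for every IRSA variable that is set.

    Hypothetical helper: variables that are missing or empty are skipped,
    so the result is an empty list when IRSA is not configured.
    """
    args = []
    for name in IRSA_VARS:
        value = env.get(name)
        if value:
            args.append(f"--build-arg={name}={value}")
    return args
```

The returned list can then be appended to the rest of the executor arguments before the process is started.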

xavbourdeau commented 1 month ago

It is working for me. I'm running kaniko from an EKS pod with IRSA enabled, using the image gcr.io/kaniko-project/executor:v1.23.2-debug. The policy attached to the IRSA role is shown below.

Note: "ecr:GetAuthorizationToken" has to be granted on the "*" resource; the rest of the actions can be scoped to a specific repo.


{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "ecr:GetAuthorizationToken",
      "Resource": "*"
    },
    {
      "Action": [
        "ecr:BatchCheckLayerAvailability",
        "ecr:BatchGetImage",
        "ecr:CompleteLayerUpload",
        "ecr:DescribeImages",
        "ecr:DescribeImageScanFindings",
        "ecr:DescribeRepositories",
        "ecr:GetDownloadUrlForLayer",
        "ecr:GetLifecyclePolicy",
        "ecr:GetLifecyclePolicyPreview",
        "ecr:GetRepositoryPolicy",
        "ecr:InitiateLayerUpload",
        "ecr:ListImages",
        "ecr:ListTagsForResource",
        "ecr:PutImage",
        "ecr:UploadLayerPart"
      ],
      "Effect": "Allow",
      "Resource": "arn:aws:ecr:REGION:AWS_ACCOUNT_ID:repository/REPO_NAME"
    }
  ]
}
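
For anyone who needs this policy for several repositories, here is a minimal sketch that fills in the placeholders. The function name `ecr_push_policy` and the example values are hypothetical; the actions and the split between the `*` resource and the repository ARN follow the comment above, and the `Version`/`Statement` wrapper is the standard IAM policy document layout.

```python
import json

# Repository-scoped actions, as listed in the policy above.
REPO_ACTIONS = [
    "ecr:BatchCheckLayerAvailability",
    "ecr:BatchGetImage",
    "ecr:CompleteLayerUpload",
    "ecr:DescribeImages",
    "ecr:DescribeImageScanFindings",
    "ecr:DescribeRepositories",
    "ecr:GetDownloadUrlForLayer",
    "ecr:GetLifecyclePolicy",
    "ecr:GetLifecyclePolicyPreview",
    "ecr:GetRepositoryPolicy",
    "ecr:InitiateLayerUpload",
    "ecr:ListImages",
    "ecr:ListTagsForResource",
    "ecr:PutImage",
    "ecr:UploadLayerPart",
]


def ecr_push_policy(region: str, account_id: str, repo_name: str) -> dict:
    """Return an IAM policy document for pushing to one ECR repository."""
    repo_arn = f"arn:aws:ecr:{region}:{account_id}:repository/{repo_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            # GetAuthorizationToken is not repository-scoped, hence "*".
            {"Effect": "Allow", "Action": "ecr:GetAuthorizationToken", "Resource": "*"},
            # Everything else is limited to the single repository.
            {"Effect": "Allow", "Action": REPO_ACTIONS, "Resource": repo_arn},
        ],
    }


if __name__ == "__main__":
    # Example values only; replace them with your own region, account and repo.
    print(json.dumps(ecr_push_policy("eu-west-1", "111122223333", "my-repo"), indent=2))
```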