GoogleContainerTools / skaffold

Easy and Repeatable Kubernetes Development
https://skaffold.dev/
Apache License 2.0

skaffold build in cluster with AWS EKS and ECR #7065

Open ITD27M01 opened 2 years ago

ITD27M01 commented 2 years ago

I want to set up a workflow in which a developer's code changes are continuously deployed to a test Kubernetes cluster on AWS EKS.

Images are built inside the EKS cluster by Kaniko because we are using ARM instance flavors.

Skaffold looks like a good solution for such a workflow, but I could not get it to work in my environment.

I'm using the following skaffold.yaml:

    ---
    apiVersion: skaffold/v2beta26
    kind: Config
    metadata:
      name: releases
    build:
      artifacts:
        - image: <account id>.dkr.ecr.<region>.amazonaws.com/releases
          kaniko:
            cache:
              repo: <account id>.dkr.ecr.<region>.amazonaws.com/cache
      cluster:
        dockerConfig:
          secretName: skaffold-docker-config

skaffold build fails to run:


    > skaffold build                                                                                                                             
    Generating tags...
     - <account id>.dkr.ecr.<region>.amazonaws.com/releases -> <account id>.dkr.ecr.<region>.amazonaws.com/releases:1ef2c39-dirty
    Checking cache...
     - <account id>.dkr.ecr.<region>.amazonaws.com/releases: Not found. Building
    Starting build...
    Creating docker config secret [skaffold-docker-config]...
    checking for existing kaniko secret: Unauthorized

In the trace logs I see a 401 error from my registry:


    TRAC[0000] Checking base image <account id>.dkr.ecr.<region>.amazonaws.com/releases:arm64-v0.1.5 for ONBUILD triggers.  subtask=-1 task=DevLoop
    TRAC[0000] --> GET https://<account id>.dkr.ecr.<region>.amazonaws.com/v2/ 
    TRAC[0000] GET /v2/ HTTP/1.1                            
    TRAC[0000] Host: <account id>.dkr.ecr.<region>.amazonaws.com 
    TRAC[0000] User-Agent: Go-http-client/1.1               
    TRAC[0000] Accept-Encoding: gzip                        
    TRAC[0000]                                              
    TRAC[0000]                                              
    TRAC[0000] <-- 401 https://<account id>.dkr.ecr.<region>.amazonaws.com/v2/ (189.277611ms) 
    TRAC[0000] HTTP/1.1 401 Unauthorized                    
    TRAC[0000] Content-Length: 15                           
    TRAC[0000] Content-Type: text/plain; charset=utf-8      
    TRAC[0000] Date: Mon, 31 Jan 2022 08:00:51 GMT          
    TRAC[0000] Docker-Distribution-Api-Version: registry/2.0 
    TRAC[0000] Sizes:                                       
    TRAC[0000] Www-Authenticate: Basic realm="https://<account id>.dkr.ecr.<region>.amazonaws.com/",service="ecr.amazonaws.com" 
    TRAC[0000]                                              
    TRAC[0000] Not Authorized                               

However, I can pull images from this registry on my laptop without problems.

Other GET requests in the trace are successful:


    TRAC[0000] --> GET https://<account id>.dkr.ecr.<region>.amazonaws.com/v2/releases/manifests/arm64-v0.1.5 
    TRAC[0000] GET /v2/releases/manifests/arm64-v0.1.5 HTTP/1.1 
    TRAC[0000] Host: <account id>.dkr.ecr.<region>.amazonaws.com 
    TRAC[0000] User-Agent: go-containerregistry/v0.7.0      
    TRAC[0000] Accept: application/vnd.docker.distribution.manifest.v1+json,application/vnd.docker.distribution.manifest.v1+prettyjws,application/vnd.docker.distribution.manifest.v2+json,application/vnd.oci.image.manifest.v1+json,application/vnd.docker.distribution.manifest.list.v2+json,application/vnd.oci.image.index.v1+json 
    TRAC[0000] Authorization: <redacted>                    
    TRAC[0000] Accept-Encoding: gzip                        
    TRAC[0000]                                              
    TRAC[0000]                                              
    TRAC[0000] <-- 200 https://<account id>.dkr.ecr.<region>.amazonaws.com/v2/releases/manifests/arm64-v0.1.5 (101.445121ms) 
    TRAC[0000] HTTP/1.1 200 OK                              
    TRAC[0000] Content-Length: 1723                         
    TRAC[0000] Content-Type: application/vnd.docker.distribution.manifest.v2+json 
    TRAC[0000] Date: Mon, 31 Jan 2022 08:00:51 GMT          
    TRAC[0000] Docker-Distribution-Api-Version: registry/2.0

On the AWS EKS cluster side I'm using the instance principal for ECR auth, and the following Job for a Kaniko build works well:


    ---
    apiVersion: batch/v1
    kind: Job
    metadata:
      labels:
        controller-uid: 27ec2141-e691-4217-940d-02dc87b894cc
        job-name: releases-job
      name: releases-job
    spec:
      selector:
        matchLabels:
          controller-uid: 27ec2141-e691-4217-940d-02dc87b894cc
      template:
        metadata:
          labels:
            controller-uid: 27ec2141-e691-4217-940d-02dc87b894cc
            job-name: releases-job
        spec:
          containers:
          - args:
            - --context=$(CONTEXT)
            - --dockerfile=$(DOCKERFILE_LOCATION)
            - --destination=<account id>.dkr.ecr.<region>.amazonaws.com/$(REPO):$(TAG)
            - --cache-repo=<account id>.dkr.ecr.<region>.amazonaws.com/cache
            - --cache=true
            env:
            - name: CONTEXT
              value: git://github.com/account/releases.git
            - name: DOCKERFILE_LOCATION
              value: docker/Dockerfile
            - name: REPO
              value: releases
            - name: TAG
              value: arm64-v0.1.5
            image: gcr.io/kaniko-project/executor:latest
            name: kaniko
            volumeMounts:
            - mountPath: /kaniko/.docker/
              name: docker-config
          volumes:
          - configMap:
              defaultMode: 420
              name: releases-docker-config-c28k6bh5tm
            name: docker-config

    ---
    apiVersion: v1
    data:
      config.json: '{ "credsStore": "ecr-login" }'
    kind: ConfigMap
    metadata:
      name: releases-docker-config-c28k6bh5tm

tejal29 commented 2 years ago

Thank you for the issue. We would appreciate any help from the community on it. Hopefully someone in the community who uses AWS has better insight. Please also ask on the Skaffold Slack channel: https://app.slack.com/client/T09NY5SBT/CABQMSZA6

bribroder commented 9 months ago

This issue is probably a bit stale, but in case other EKS + ECR users want to use Skaffold + Kaniko: this now works pretty seamlessly using a service account for privileges. With the new Pod Identity Associations, no extra secrets configuration is necessary: https://docs.aws.amazon.com/eks/latest/userguide/pod-identities.html
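
For anyone following along, here is a minimal sketch of what that setup might look like in skaffold.yaml, adapted from the config at the top of this issue. It is an untested illustration under assumptions: the namespace `build` and the service account `kaniko-builder` are placeholder names that do not come from this thread, the service account is assumed to already have a Pod Identity Association to an IAM role with ECR push and pull permissions, and the Kaniko executor image is assumed to be recent enough to pick up those credentials. The `dockerConfig` secret from the original config is dropped.

```yaml
---
apiVersion: skaffold/v2beta26
kind: Config
metadata:
  name: releases
build:
  artifacts:
    - image: <account id>.dkr.ecr.<region>.amazonaws.com/releases
      kaniko:
        cache:
          repo: <account id>.dkr.ecr.<region>.amazonaws.com/cache
  cluster:
    # Placeholder namespace (assumption): where the Kaniko build pod runs.
    namespace: build
    # Placeholder service account (assumption): must be the one named in the
    # Pod Identity Association so the build pod receives the IAM role.
    serviceAccount: kaniko-builder
```

The association itself is created on the AWS side (for example with `aws eks create-pod-identity-association`), so the build pod gets its ECR credentials from the service account rather than from a mounted Docker config. See the AWS guide linked above for the role and trust policy details.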