aws / amazon-eks-pod-identity-webhook

Amazon EKS Pod Identity Webhook
Apache License 2.0

Cannot use wildcard (*) namespace in kops when using IRSA #237

Open SohamChakraborty opened 2 months ago

SohamChakraborty commented 2 months ago

What happened: We are trying to use the wildcard namespace feature in kops that was introduced in this PR: https://github.com/kubernetes/kops/pull/16113. When we set a wildcard namespace in the kops cluster manifest and then create a pod that references the service account, the pod is not mutated and pod-identity-webhook logs this error:

I0821 08:42:55.226946       1 handler.go:395] Pod was not mutated. Reason: Service account did not have the right annotations or was not found in the cache. Pod=ssm-ec2-test, ServiceAccount=ssm-ec2, Namespace=default

What you expected to happen: Pod to be mutated and contain the required policy/role.

How to reproduce it (as minimally and precisely as possible): in kops cluster manifest, we have this:

spec:
  iam:
    allowContainerRegistry: true
    legacy: false
    serviceAccountExternalPermissions:
    - name: ssm-ec2
      aws:
        policyARNs:
          - arn:aws:iam::<ACCOUNT_ID>:policy/access-ec2-with-ssm
      namespace: "*"

Then we try to deploy a workload:

apiVersion: v1
kind: ServiceAccount
metadata:
  name: ssm-ec2
  namespace: default
---
apiVersion: v1
kind: Pod
metadata:
  name: ssm-ec2-test
  namespace: default
spec:
  containers:
  - name: aws-cli
    image: amazon/aws-cli:latest
    command:
    - sleep
    - "30000"
  serviceAccountName: "ssm-ec2"

pod-identity-webhook complains with:

I0821 08:42:54.833148       1 cache.go:179] Adding SA default/ssm-ec2 to SA cache: &{RoleARN: Audience: UseRegionalSTS:false TokenExpiration:0}
I0821 08:42:54.833397       1 cache.go:179] Adding SA default/ssm-ec2 to SA cache: &{RoleARN: Audience: UseRegionalSTS:false TokenExpiration:0}
I0821 08:42:55.226659       1 cache.go:80] Fetching sa default/ssm-ec2 from cache
I0821 08:42:55.226847       1 cache.go:93] Service account default/ssm-ec2 not found in cache
I0821 08:42:55.226946       1 handler.go:395] Pod was not mutated. Reason: Service account did not have the right annotations or was not found in the cache. Pod=ssm-ec2-test, ServiceAccount=ssm-ec2, Namespace=default

Anything else we need to know?: When we change the "*" to a specific namespace (for example, default), everything works as expected.

Environment:

SohamChakraborty commented 2 months ago

Discussed this with @olemarkus in the #kops-users Slack channel. He thinks that https://github.com/aws/amazon-eks-pod-identity-webhook/blob/master/pkg/cache/cache.go#L130 needs to check for both namespace + name and "*" + name.
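
To illustrate the idea, here is a rough sketch of a lookup that tries namespace + name first and then falls back to "*" + name. This is not the webhook's actual code: the types saKey, saEntry and saCache, the put and lookup helpers, and the role ARN are all hypothetical. It also assumes that the wildcard mapping is stored under the namespace "*" and that an entry with an empty RoleARN (like the one in the logs above) counts as a miss.

```go
package main

import "fmt"

// wildcardNamespace is the namespace value used in the kops manifest above.
const wildcardNamespace = "*"

// saKey, saEntry and saCache are hypothetical stand-ins for the webhook's
// service account cache. They are NOT the real types in pkg/cache.
type saKey struct {
	namespace string
	name      string
}

type saEntry struct {
	roleARN string
}

type saCache struct {
	entries map[saKey]saEntry
}

func newSACache() *saCache {
	return &saCache{entries: map[saKey]saEntry{}}
}

// put stores (or overwrites) the entry for namespace/name.
func (c *saCache) put(namespace, name, roleARN string) {
	c.entries[saKey{namespace: namespace, name: name}] = saEntry{roleARN: roleARN}
}

// lookup checks namespace + name first, then "*" + name.
// Assumption: an entry with an empty role ARN is treated as a miss.
func (c *saCache) lookup(namespace, name string) (saEntry, bool) {
	for _, ns := range []string{namespace, wildcardNamespace} {
		e, ok := c.entries[saKey{namespace: ns, name: name}]
		if ok && e.roleARN != "" {
			return e, true
		}
	}
	return saEntry{}, false
}

func main() {
	c := newSACache()
	// Wildcard mapping; the role ARN is a made-up placeholder.
	c.put(wildcardNamespace, "ssm-ec2", "arn:aws:iam::<ACCOUNT_ID>:role/example-role")
	// The ServiceAccount object itself carries no role annotation.
	c.put("default", "ssm-ec2", "")

	e, ok := c.lookup("default", "ssm-ec2")
	fmt.Println(e.roleARN, ok)
}
```

Trying the exact namespace first would let a namespace-specific mapping override the wildcard one; whether that precedence is what the webhook should do is a design question for the maintainers.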

EDIT: I can provide the full cluster spec after redacting sensitive parts if needed.

hakman commented 2 months ago

@kmala Do you know if there's anyone that could take a look at this? Thanks!

kmala commented 2 months ago

The change looks small, since we want to support a wildcard for all namespaces, and I don't see any issue with supporting this. Let me check if anyone can work on it.

hakman commented 2 months ago

Awesome, thanks for checking!

olemarkus commented 2 months ago

I can probably do the PR as well, but it will take a few days before I can find the time.