jtblin / kube2iam

kube2iam provides different AWS IAM roles for pods running on Kubernetes
BSD 3-Clause "New" or "Revised" License

IMDSv2 and Karpenter ec2nodeclass #382

Open kyrylyuk-andriy opened 1 week ago

kyrylyuk-andriy commented 1 week ago

Hello kube2iam community. We recently migrated our workloads to EC2 instances managed by a Karpenter NodePool. In the EC2NodeClass (launch template) IMDSv2 is enabled by default, and we now see 401 response codes in the kube2iam log output. A couple of examples:

level=info msg="GET /latest/meta-data/hostname (401)"
level=info msg="GET /latest/dynamic/instance-identity/document/ (401)"

Interestingly, at the same time I also see 200 responses, for example:

level=info msg="GET /latest/meta-data/instance-id (200)

Manually editing the instance metadata options of the EC2 instance in the AWS console and disabling IMDSv2 enforcement resolves the issue, so this does appear to be related to IMDSv2.

Are there any specific recommendations on how to set up the kube2iam DaemonSet to be compatible with IMDSv2? Thank you. Our DaemonSet configuration is described below.

Name:           kube2iam
Selector:       app.kubernetes.io/instance=kube2iam,app.kubernetes.io/name=kube2iam
Node-Selector:  <none>
Labels:         app.kubernetes.io/instance=kube2iam
                app.kubernetes.io/managed-by=Helm
                app.kubernetes.io/name=kube2iam
                argocd.argoproj.io/instance=kube2iam
                helm.sh/chart=kube2iam-2.6.0
Annotations:    deprecated.daemonset.template.generation: 21
Desired Number of Nodes Scheduled: 10
Current Number of Nodes Scheduled: 10
Number of Nodes Scheduled with Up-to-date Pods: 10
Number of Nodes Scheduled with Available Pods: 10
Number of Nodes Misscheduled: 0
Pods Status:  10 Running / 0 Waiting / 0 Succeeded / 0 Failed
Pod Template:
  Labels:           app.kubernetes.io/instance=kube2iam
                    app.kubernetes.io/name=kube2iam
  Service Account:  kube2iam
  Containers:
   kube2iam:
    Image:      jtblin/kube2iam:0.11.2
    Port:       8181/TCP
    Host Port:  0/TCP
    Args:
      --host-interface=eni+
      --node=$(NODE_NAME)
      --host-ip=$(HOST_IP)
      --iptables=true
      --base-role-arn=<omitted>
      --app-port=8181
      --metrics-port=8181
    Liveness:  http-get http://:8181/healthz delay=30s timeout=1s period=5s #success=1 #failure=3
    Environment:
      HOST_IP:              (v1:status.podIP)
      NODE_NAME:            (v1:spec.nodeName)
      AWS_DEFAULT_REGION:  us-east-1
    Mounts:                <none>
  Volumes:                 <none>
  Priority Class Name:     system-node-critical
  Node-Selectors:          <none>
  Tolerations:             :NoSchedule op=Exists
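
For reference, the metadataOptions that Karpenter applies to our nodes look roughly like the sketch below (field names are from the EC2NodeClass spec; the hop-limit default of 1 is an assumption and varies by Karpenter version). If I understand IMDSv2 correctly, with httpTokens: required IMDS rejects any GET that does not carry a session token previously obtained via a PUT to /latest/api/token, which would explain token-less clients getting 401 while token-aware SDKs get 200:

  metadataOptions:
    httpEndpoint: enabled            # IMDS itself stays reachable at 169.254.169.254
    httpProtocolIPv6: disabled
    httpPutResponseHopLimit: 1       # assumption: Karpenter's default; the token PUT response cannot cross extra network hops
    httpTokens: required             # IMDSv2 enforced: GETs without an X-aws-ec2-metadata-token header get 401
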
rknightion commented 4 days ago

We experienced the same as you and ultimately had to change our EC2NodeClass back to

  metadataOptions:
    httpEndpoint: enabled
    httpProtocolIPv6: disabled
    httpPutResponseHopLimit: 5
    httpTokens: optional      # "optional" (rather than "required") is what relaxes IMDSv2 enforcement and allows IMDSv1-style requests again

to get things working again. Our security team doesn't like it, so if you find a way to actually use IMDSv2, please do shout!
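
The only middle ground we have considered (untested on our side, so treat it as an assumption rather than a verified fix) is to keep tokens required but raise the hop limit to 2, which AWS documents as the minimum for containerized workloads to complete the IMDSv2 handshake (a PUT to /latest/api/token, then the token passed as X-aws-ec2-metadata-token on each GET). Whether that handshake survives kube2iam's iptables redirect and proxy is exactly the open question here, and anything still speaking plain IMDSv1 would keep getting 401s:

  metadataOptions:
    httpEndpoint: enabled
    httpProtocolIPv6: disabled
    httpPutResponseHopLimit: 2   # 2 hops so the token PUT response can reach bridge-networked pods
    httpTokens: required         # IMDSv2 stays enforced; only token-bearing requests succeed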