aws / karpenter-provider-aws

Karpenter is a Kubernetes Node Autoscaler built for flexibility, performance, and simplicity.
https://karpenter.sh
Apache License 2.0

Upgrade to v1.0.4 breaks kubelet config using EC2NodeClass's spec.kubelet.clusterDNS #7235

Open schahal opened 3 days ago

schahal commented 3 days ago

Description

Observed Behavior:

After upgrading from karpenter-provider-aws:v1.0.3 to v1.0.4, the new Kubernetes nodes that Karpenter provisions do not have the EC2NodeClass's spec.kubelet.clusterDNS value in their /etc/kubernetes/kubelet/config.

For example, one of our EC2NodeClasses (e.g., default-ebs) looks like this:

```yaml
apiVersion: karpenter.k8s.aws/v1
kind: EC2NodeClass
metadata:
  annotations:
    karpenter.k8s.aws/ec2nodeclass-hash: ""
    karpenter.k8s.aws/ec2nodeclass-hash-version: v3
    karpenter.sh/stored-version-migrated: "true"
  finalizers:
    - karpenter.k8s.aws/termination
  generation: 18
  name: default-ebs
spec:
  amiFamily: Bottlerocket
  amiSelectorTerms:
    - alias: bottlerocket@1.24.1
  ...
  kubelet:
    clusterDNS:
      - 169.254.20.11
    evictionHard:
      imagefs.available: 30%
      imagefs.inodesFree: 15%
      ...
    evictionSoft:
      ...
```

Notice, spec.kubelet.clusterDNS: [169.254.20.11]

However, on the new node, we see:

```
$ grep -A1 "clusterDNS" /.bottlerocket/rootfs/etc/kubernetes/kubelet/config
clusterDNS:
- 172.20.0.10
```
Also confirmed the NodePool the node is part of refers to the same EC2NodeClass as above:

```
Node Class Ref:
  Group:  karpenter.k8s.aws
  Kind:   EC2NodeClass
  Name:   default-ebs
```
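
The NodePool → EC2NodeClass reference can also be checked directly with a one-liner (a sketch, assuming kubectl access; the NodePool name is a placeholder):

```sh
# Print which EC2NodeClass the NodePool's template points at
# ("my-nodepool" is a placeholder, not from this issue)
kubectl get nodepool my-nodepool \
  -o jsonpath='{.spec.template.spec.nodeClassRef.name}'
```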

Expected Behavior:

After reverting to v1.0.3, we see the correct (expected) value on a new node:

```
$ grep -A1 "clusterDNS" /.bottlerocket/rootfs/etc/kubernetes/kubelet/config
clusterDNS:
- 169.254.20.11
```

Reproduction Steps (Please include YAML):

See above:

  1. Run karpenter-provider-aws:v1.0.3
  2. Make sure to have an EC2NodeClass which includes a custom spec.kubelet.clusterDNS
  3. Confirm /etc/kubernetes/kubelet/config on node has that same value
  4. Upgrade to karpenter-provider-aws:v1.0.4
  5. Repeat steps (2)-(3) and you'll see that the config file on the new node does not have that value (see the verification sketch below)
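
A minimal verification sketch for steps (2)-(3), assuming kubectl access to the cluster and shell access to the Bottlerocket node (the access method, e.g. an admin container session, is an assumption):

```sh
# (2) confirm the EC2NodeClass declares the custom clusterDNS
kubectl get ec2nodeclass default-ebs -o jsonpath='{.spec.kubelet.clusterDNS}'

# (3) on the node itself, confirm the rendered kubelet config picked it up
grep -A1 "clusterDNS" /.bottlerocket/rootfs/etc/kubernetes/kubelet/config
```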

Versions:

schahal commented 3 days ago

A bit more clarity:

Prior to v1.0.4, Karpenter would take the EC2NodeClass's spec.kubelet.clusterDNS and overlay that value onto the node's kubelet config.

With v1.0.4+, it instead keeps the default values we pass in the node user-data:

```toml
[settings]
...
[settings.kubernetes]
...
cluster-dns-ip = '172.20.0.10'
max-pods = 29
...
```

So it's not just clusterDNS; maxPods, for example, is also kept at 29.
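
If it helps with reproduction, the user data Karpenter actually rendered for an instance can be inspected with the AWS CLI (a sketch, assuming AWS CLI access; the instance ID is a placeholder):

```sh
# Fetch and base64-decode the user data attached to the node's EC2 instance,
# then look at the kubelet-related Bottlerocket settings
aws ec2 describe-instance-attribute \
  --instance-id i-0123456789abcdef0 \
  --attribute userData \
  --query 'UserData.Value' --output text | base64 -d | grep -E 'cluster-dns-ip|max-pods'
```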

Looking at the v1.0.4 release notes, there was a maxPods-related commit (https://github.com/aws/karpenter-provider-aws/pull/7020); could that be related to the symptoms of this issue?

engedaam commented 2 days ago

Do you have the compatibility.karpenter.sh/v1beta1-kubelet-conversion annotation on any of your NodePools? That NodePool annotation takes precedence over the EC2NodeClass kubelet configuration when launching nodes.
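
A quick way to check every NodePool for that annotation (a sketch, assuming kubectl and jq are available):

```sh
# List each NodePool and the value of its v1beta1 kubelet-conversion annotation, if any
kubectl get nodepools -o json | jq -r '
  .items[]
  | "\(.metadata.name): \(.metadata.annotations["compatibility.karpenter.sh/v1beta1-kubelet-conversion"] // "<not set>")"'
```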

schahal commented 2 days ago

Indeed, looking at the NodePools, they have annotations like this (not explicitly added by us, so probably added by Karpenter itself when we migrated to v1.0):

```
compatibility.karpenter.sh/v1beta1-kubelet-conversion: {"kubeReserved":{"cpu":"90m","ephemeral-storage":"1Gi","memory":"1465Mi"}}
compatibility.karpenter.sh/v1beta1-nodeclass-reference: {"name":"default-ebs"}
```

A couple of questions (quoted in the reply below):

engedaam commented 2 days ago

> Is the suggestion here that we remove those annotations and retry the upgrade?
>
> - The values of those annotations seem like unrelated configs (?)
>   - It also seems like the docs suggest we remove them during an eventual jump to v1.1.x (here we're going to v1.0.4)

The NodePool compatibility.karpenter.sh/v1beta1-kubelet-conversion annotation will take precedence over the EC2NodeClass kubelet configuration. At v1.1, the expectation is that the annotation is removed by customers.

> The compatibility.karpenter.sh/v1beta1-kubelet-conversion NodePool annotation takes precedence over the EC2NodeClass Kubelet configuration when launching nodes. Remove the kubelet-configuration annotation (compatibility.karpenter.sh/v1beta1-kubelet-conversion) from your NodePools once you have migrated kubelet from the NodePool to the EC2NodeClass.

ref: https://karpenter.sh/docs/upgrading/v1-migration/#before-upgrading-to-v11
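
For reference, once the kubelet block lives on the EC2NodeClass, removing the annotation from a NodePool might look like this (a sketch; the NodePool name is a placeholder):

```sh
# The trailing "-" tells kubectl to remove the annotation from the object
kubectl annotate nodepool my-nodepool \
  compatibility.karpenter.sh/v1beta1-kubelet-conversion-
```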

> Wondering why it's only breaking when we jump from v1.0.3 to v1.0.4 (and not, for example, when we jumped from v1.0.2 to v1.0.3)?

We had a bug where any new NodeClaim that was launched used the EC2NodeClass kubelet configuration without considering the kubelet compatibility annotation. The fix was merged in 1.0.2: https://github.com/kubernetes-sigs/karpenter/pull/1667. Are you able to share nodes that were created on 1.0.3 with their NodePool and EC2NodeClass kubelet configurations?

schahal commented 2 days ago

> Are you able to share nodes that were created on 1.0.3 with their NodePool and EC2NodeClass kubelet configurations?

Yes, here's a slightly anonymized share of that:

$ kubectl describe nodeclaim foo-bar-9lmsn

```
Name:         foo-bar-9lmsn
Namespace:
Labels:       karpenter.sh/nodepool=foo-bar
Annotations:  compatibility.karpenter.k8s.aws/cluster-name-tagged: true
              compatibility.karpenter.k8s.aws/kubelet-drift-hash:
              karpenter.k8s.aws/ec2nodeclass-hash:
              karpenter.k8s.aws/ec2nodeclass-hash-version: v3
              karpenter.k8s.aws/tagged: true
              karpenter.sh/nodepool-hash:
              karpenter.sh/nodepool-hash-version: v3
API Version:  karpenter.sh/v1
Kind:         NodeClaim
Metadata:
  Creation Timestamp:  2024-10-17T11:11:31Z
  Finalizers:
    karpenter.sh/termination
  Generate Name:  foo-bar-
  Generation:     1
  Owner References:
    API Version:           karpenter.sh/v1
    Block Owner Deletion:  true
    Kind:                  NodePool
    Name:                  foo-bar
    UID:
  Resource Version:
  UID:
Spec:
  Expire After:  720h
  Node Class Ref:
    Group:  karpenter.k8s.aws
    Kind:   EC2NodeClass
    Name:   default-foobar
  Requirements:
    Key:       karpenter.sh/nodepool
    Operator:  In
    Values:
      foo-bar
  Resources:
    Requests:
      Cpu:     1235m
      Memory:  4523Mi
      Pods:    18
  Taints:
```
$ kubectl describe nodepool foo-bar

```
Name:         foo-bar
Namespace:
Labels:
Annotations:  compatibility.karpenter.sh/v1beta1-kubelet-conversion:
                {"kubeReserved":{"cpu":"90m","ephemeral-storage":"1Gi","memory":"1465Mi"}}
              compatibility.karpenter.sh/v1beta1-nodeclass-reference: {"name":"default-foobar"}
              karpenter.sh/nodepool-hash:
              karpenter.sh/nodepool-hash-version: v3
              karpenter.sh/stored-version-migrated: true
API Version:  karpenter.sh/v1
Kind:         NodePool
Metadata:
  Creation Timestamp:  2024-07-25T00:20:04Z
  Generation:          10
  Resource Version:
  UID:
Spec:
  Disruption:
    Budgets:
      Nodes:    5%
      Reasons:
        Drifted
      Nodes:    25%
    Consolidate After:     5m
    Consolidation Policy:  WhenEmptyOrUnderutilized
  Template:
    Metadata:
      Labels:
    Spec:
      Expire After:  720h
      Node Class Ref:
        Group:  karpenter.k8s.aws
        Kind:   EC2NodeClass
        Name:   default-foobar
      Requirements:
      Startup Taints:
      Taints:
```
$ kubectl describe ec2nodeclass default-foobar

```
Name:         default-foobar
Namespace:
Labels:
Annotations:  karpenter.k8s.aws/ec2nodeclass-hash:
              karpenter.k8s.aws/ec2nodeclass-hash-version: v3
              karpenter.sh/stored-version-migrated: true
API Version:  karpenter.k8s.aws/v1
Kind:         EC2NodeClass
Metadata:
  Creation Timestamp:  2024-07-21T22:03:32Z
  Finalizers:
    karpenter.k8s.aws/termination
  Generation:        18
  Resource Version:
  UID:
Spec:
  Ami Family:  Bottlerocket
  Ami Selector Terms:
    Alias:  bottlerocket@1.24.1
  Block Device Mappings:
    Device Name:  /dev/xvda
    Ebs:
  Instance Profile:
  Kubelet:
    Cluster DNS:
      169.254.20.11
    Eviction Hard:
      imagefs.available:   30%
      imagefs.inodesFree:  15%
      memory.available:    5%
      nodefs.available:    10%
      nodefs.inodesFree:   10%
      pid.available:       30%
    Image GC High Threshold Percent:  75
    Image GC Low Threshold Percent:   45
    Max Pods:                         110
  User Data:
    [settings.kubernetes]
    [settings.kernel.sysctl]
    [settings.bootstrap-containers.setup-runtime-storage]
```

To reiterate the behavior for complete understanding:

schahal commented 2 days ago

> We had a bug where any new NodeClaim that was launched used the EC2NodeClass kubelet configuration without considering the kubelet compatibility annotation. The fix was merged in 1.0.2: https://github.com/kubernetes-sigs/karpenter/pull/1667

I believe this is the cause (and why we only see this issue when we upgrade to karpenter-provider-aws:v1.0.4).

Looking at all commit differences between karpenter-provider-aws v1.0.3 and v1.0.4, it looks like the fix merged in https://github.com/kubernetes-sigs/karpenter/pull/1667 was only finally pulled into karpenter-provider-aws:v1.0.4:

[Screenshot (2024-10-17): commit diff between v1.0.3 and v1.0.4 showing the upstream fix being pulled in]
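
One way to double-check which core library version each provider tag pins, assuming a local clone of aws/karpenter-provider-aws and that the dependency is tracked in go.mod (a sketch, not from the thread):

```sh
# Compare the sigs.k8s.io/karpenter version pinned by each release tag
git show v1.0.3:go.mod | grep 'sigs.k8s.io/karpenter '
git show v1.0.4:go.mod | grep 'sigs.k8s.io/karpenter '
```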

If I'm reading that correctly, what does that mean? Do karpenter users who upgrade from karpenter-provider-aws:v1.0.3 to karpenter-provider-aws:v1.0.4 need to manually remove that kubelet compatibility annotation from all their NodePools (regardless of what value that annotation has)?