aws / karpenter-provider-aws

Karpenter is a Kubernetes Node Autoscaler built for flexibility, performance, and simplicity.
https://karpenter.sh
Apache License 2.0

Upgrading NodeClass via custom Helm Chart fails #6249

Closed Phylu closed 2 months ago

Phylu commented 4 months ago

We are installing the karpenter-crd helm chart. Then we have our own internally managed helm chart that makes use of the CRDs to define an EC2NodeClass and multiple NodePools. When we make a change to our configuration, e.g. adding a tag to the EC2NodeClass, I'd expect this to be a minor change that can be applied easily.

However, when I apply the updated helm chart, helm tries to delete the EC2NodeClass and then recreate it with its new configuration. This fails as there are nodes currently running with that class and therefore the karpenter finalizer (luckily) intervenes. The EC2NodeClass with the updated configuration can't be created and I get the error message object is being deleted: ec2nodeclasses.karpenter.k8s.aws "default" already exists.

How can I get helm to just patch the existing EC2NodeClass instead of trying to recreate it?

engedaam commented 4 months ago

Can you share how you are applying the updated helm chart for the EC2NodeClass? Are you using helm upgrade?

Phylu commented 4 months ago

@engedaam Sure. Here are some more details:

The helm chart is updated using terraform apply with the following code, which internally is supposed to perform a helm upgrade. It picks up the new version based on the version string in the Chart.yaml:

resource "helm_release" "node_class" {
  name      = "node-class"
  chart     = "${path.module}/charts/node-class"

  depends_on = [
    helm_release.karpenter_crd # The helm chart for the karpenter CRDs
  ]
}

Inside the Chart.yaml, the version number was changed:

-version: 0.1.0
+version: 0.1.1

-appVersion: "0.1.0"
+appVersion: "0.1.1"

In the helm chart itself, only one tag called "Name" was added in the corresponding template charts/node-class/templates/node-class.yaml:

apiVersion: karpenter.k8s.aws/v1beta1
kind: EC2NodeClass
...
spec:
  tags:
    Name: cluster-name
engedaam commented 4 months ago

From doing a search, helm does not offer a direct way to patch fields on resources. It would seem helm recommends using --post-renderer with tools like Kustomize to enable patching fields. https://helm.sh/docs/topics/advanced/#post-rendering
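
For illustration, here is a minimal sketch of such a post-renderer setup (the file names, the patch target, and the wrapper script are assumptions for this example, not taken from this issue). A small executable wrapper passed to helm upgrade via --post-renderer would write the rendered manifests from stdin to all.yaml and then run kustomize build ., with a kustomization.yaml along these lines:

# kustomization.yaml (hypothetical layout for a Helm post-renderer)
resources:
  - all.yaml          # the manifests Helm rendered, written here by the wrapper script
patches:
  - target:
      group: karpenter.k8s.aws
      kind: EC2NodeClass
      name: default
    # JSON6902 patch: add the Name tag without touching other fields
    patch: |-
      - op: add
        path: /spec/tags/Name
        value: cluster-name

Post-rendering only changes the manifests Helm applies, so whether it also avoids the delete/recreate behaviour described above would still need to be verified.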

fmuyassarov commented 4 months ago

Hi @Phylu UPDATED: Would it be problematic in your case to make the necessary customizations to the chart prior to installing it? Perhaps that way you won't run into this problem.

Regarding your original problem, modifying the metadata part (like labels and annotations) of an object usually doesn't require the object to be recreated, while changing the spec part often does. This depends on the controller's reconciliation logic and whether it decides that patching the spec should result in object recreation. In this case, it's the Karpenter controller that determines what happens when the EC2NodeClass object is updated.

My point is that regardless of whether you use Helm, kubectl, or curl, when you modify certain fields of an object, the controller watching that resource decides the outcome of those changes. For the EC2NodeClass, I assume Karpenter will recreate the object if you change the spec.

You mentioned that you added a tag via kubectl, which is the same as doing it via Helm or any other method. Did you notice if the EC2NodeClass was recreated or if it was just patched? I would expect it was recreated rather than patched in place.

If that's true, then once finalizers are set, it might take some time for the nodes to be cleaned up. It's possible that a new EC2NodeClass creation request happened too soon, while the old one was still being processed. If I am not wrong, Helm doesn't have a built-in timeout mechanism to wait for the entire chart to be created and all resources to be ready. It likely just tries to create the resources, and if it can't, it fails. But I could be wrong.

fmuyassarov commented 4 months ago

And out of curiosity, since in your case the patched helm chart only changes tags, wouldn't it be better to set them directly on the EC2NodeClass instead of creating a patched helm chart and doing a helm upgrade?

Phylu commented 4 months ago

@fmuyassarov Sorry for the delayed answer. I had some issues with my test setup and finally managed to create a minimal test case to try the update again.

Helm Chart setup

First to satisfy your curiosity:

We are creating EKS clusters with Terraform using the terraform-aws-eks module. Once the cluster is created, an internal terraform module of ours configures the cluster for our needs. E.g. it installs the karpenter and karpenter-crd helm charts using the helm_release terraform resource, as well as the terraform-aws-modules/eks/aws//modules/karpenter terraform module.

In order to finalize the setup, we need our karpenter custom resources (EC2NodeClass as well as NodePool) deployed to the cluster. We have created an internal helm chart for this, which is also installed using the helm_release terraform resource. The alternative for managing these resources would be the terraform kubectl provider, which does not provide proper drift detection on the resources. Therefore, we decided to encapsulate these custom resources in a helm chart. If there is a better way to manage these resources, I am happy to hear any ideas.

Test Case

That being said, here is the result of my continued tests:

Patching the EC2NodeClass manually

This works as expected (adding the "Name" tag):

~ kubectl patch EC2NodeClass default -p '{"spec":{"tags":{"karpenter.k8s.aws/cluster": "cluster-name", "karpenter.sh/discovery": "cluster-name", "Name": "cluster-name"}}}' --type=merge
ec2nodeclass.karpenter.k8s.aws/default patched

Change using kubectl apply

Writing out the existing config to a yaml file and then applying it for the update works as well.

$ kubectl get Ec2NodeClass default -o yaml > default.yaml

Manually adding the following line to the resulting yaml under the "tags" key: Name: eks-karpenter-aws
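
For illustration, the relevant part of the edited default.yaml then looks roughly like this (all other exported fields left untouched):

spec:
  tags:
    karpenter.k8s.aws/cluster: cluster-name
    karpenter.sh/discovery: cluster-name
    Name: eks-karpenter-aws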

$ kubectl apply -f default.yaml
ec2nodeclass.karpenter.k8s.aws/default configured

Updating via helm chart upgrade

The helm chart upgrade tries to recreate the EC2NodeClass. The NodeClass can't be deleted as it will wait for node termination:

Waiting on NodeClaim termination

Therefore producing the error:

│ Error: post-upgrade hooks failed: warning: Hook post-upgrade karpenter-crds/templates/node-class.yaml failed: 1 error occurred:
│   * object is being deleted: ec2nodeclasses.karpenter.k8s.aws "default" already exists
Phylu commented 3 months ago

@fmuyassarov: Do you think that there is a solution for this?

We have now implemented a workaround where we rotate the name of the EC2NodeClass whenever anything else in it changes, to prevent the deadlock. It seems to be working as expected as far as I can tell.
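
For reference, a minimal sketch of how such a rotation could be templated in the chart (the value names and the hashing scheme here are illustrative, not our actual chart):

apiVersion: karpenter.k8s.aws/v1beta1
kind: EC2NodeClass
metadata:
  # Suffix the name with a short hash of the node class values, so that any
  # change yields a new object name instead of a delete/recreate of the
  # existing object.
  name: default-{{ .Values.nodeClass | toYaml | sha256sum | trunc 8 }}
spec:
  tags:
    Name: {{ .Values.nodeClass.nameTag | quote }}

The NodePools' nodeClassRef then has to point at the same templated name so that they follow the rotation.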

fmuyassarov commented 3 months ago

@fmuyassarov: Do you think that there is a solution for this?

We have now implemented a workaround where we rotate the name of the EC2NodeClass whenever anything else in it changes, to prevent the deadlock. It seems to be working as expected as far as I can tell.

Hi @Phylu, sorry for the radio silence. At this point I was a bit stuck. I did a bit of searching but didn't find anything concrete. I still think this is a Helm issue and not necessarily a Karpenter one. I went through https://github.com/helm/helm/issues/5430, which is somewhat similar (though not a direct match), but again there is nothing concrete I could come up with, unfortunately.

github-actions[bot] commented 3 months ago

This issue has been inactive for 14 days. StaleBot will close this stale issue after 14 more days of inactivity.