aws / karpenter-provider-aws

Karpenter is a Kubernetes Node Autoscaler built for flexibility, performance, and simplicity.
https://karpenter.sh
Apache License 2.0

EC2 Instances configuration messed up when spawned by Karpenter, they fail to register to the cluster #6529

Open nelloserio opened 3 months ago

nelloserio commented 3 months ago

Description


Observed Behavior: Hi folks! I'm having trouble spawning correctly configured EC2 instances with Karpenter. Karpenter is correctly deployed into the EKS cluster and configured with a NodePool and an EC2NodeClass.

Their definitions are pretty simple:

Whenever there are Pods in the Pending state, Karpenter correctly spawns an EC2 instance and no errors appear in its logs.

The problem is that the spawned instances are not configured correctly, so the kubelet doesn't start. There is no CNI (Cilium) configuration under /etc/cni/net.d, and the /etc/eks folder is empty:

[Screenshots: 2024-07-17 at 12 56 47, 2024-07-17 at 12 57 03]

When I run the kubelet, it seems to try to use containerd as the container runtime, but it can't even find it:

[Screenshots: 2024-07-17 at 12 57 42, 2024-07-17 at 12 57 51]

Outcome: the EC2 instance doesn't register with the cluster and, accordingly, doesn't show up as a cluster node.

I'm using the same AMI as the Nodegroup already attached to the EKS cluster, so I expected the same configuration to be applied.

Any suggestions or guidance on how to troubleshoot this?

Expected Behavior: I expect that once Karpenter spawns an EC2 instance, it is configured and initialised correctly and therefore joins the cluster.

Reproduction Steps (Please include YAML):

  1. Create an EKS cluster with Cilium CNI and with a Nodegroup attached to it
  2. Deploy Karpenter using Helm
  3. Deploy NodePool and NodeClass as described above
  4. Create a number of pods that require one or more additional nodes to be scheduled, so that Karpenter spawns at least one more EC2 instance
  5. Check whether the EC2 instance is configured correctly and joins the cluster
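
The on-node part of step 5 could be sketched roughly as below. This is only a minimal illustration, not part of Karpenter or the EKS AMI: it checks that the two directories mentioned above (/etc/cni/net.d and /etc/eks) contain something and that the containerd and kubelet binaries are on the PATH. The helper names `dir_is_populated` and `check_node` are hypothetical, and the script assumes it is run on the spawned instance itself by a user who can read those directories.

```python
import shutil
from pathlib import Path


def dir_is_populated(path):
    """Return True if `path` is an existing directory with at least one entry."""
    p = Path(path)
    return p.is_dir() and any(p.iterdir())


def check_node(dirs=("/etc/cni/net.d", "/etc/eks"),
               binaries=("containerd", "kubelet")):
    """Build a pass/fail report for the paths and binaries named in this issue.

    The defaults are taken from the description above; they are not an
    exhaustive list of what a healthy node needs.
    """
    report = {}
    for d in dirs:
        report[f"dir:{d}"] = dir_is_populated(d)
    for b in binaries:
        # shutil.which returns None when the binary is not found on PATH
        report[f"bin:{b}"] = shutil.which(b) is not None
    return report


if __name__ == "__main__":
    for name, ok in check_node().items():
        print(f"{'OK     ' if ok else 'MISSING'} {name}")
```

A run that prints MISSING for both directories and for containerd would match the symptoms described above. It does not say why the instance was left unconfigured, only that it was.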

Versions:

engedaam commented 2 months ago

Have you checked out the Karpenter FAQs? https://karpenter.sh/docs/troubleshooting/#node-launchreadiness