aws.reservedENIs doesn't actually allocate any IPs. It gives you a way to tell Karpenter that you have configured the CNI you are using to assign IPs to the node from a different place than normal, or to reserve an ENI for some other use. If you're using the VPC CNI, then you'd use this config: https://github.com/aws/amazon-vpc-cni-k8s#aws_vpc_k8s_cni_custom_network_cfg.
Do you have a CNI that is configured to assign IPs from appropriately sized subnets?
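For reference, a minimal sketch of how that setting is typically applied, assuming the Karpenter Helm chart and the v0.28 global settings layout (the exact values key, settings.aws.reservedENIs, is an assumption; check the chart values for your version):

# Sketch: set aws.reservedENIs=1 via the Helm chart's settings
helm upgrade --install karpenter oci://public.ecr.aws/karpenter/karpenter \
  --namespace karpenter \
  --reuse-values \
  --set settings.aws.reservedENIs=1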
Hi @bwagner5,
Can someone please have a look at the information below and point me in the right direction?
All pods scheduled on the new node launched by Karpenter are running as expected.
The pods scheduled on the initial nodes launched when the cluster was created have reached the limit of 29 for the instance type according to eni-max-pods.txt
The instance type used for the initial nodes is m7g.large
Running the max-pods-calculator.sh script shows without prefix delegation enabled the maximum limit for pods on that instance type is 29.
./max-pods-calculator.sh --instance-type m7g.large --cni-version 1.12.5-eksbuild.2
29
This matches up with the number of pods entering a 'Running' state.
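For reference, the 29 comes from the standard ENI-based calculation, max pods = ENIs * (IPv4 addresses per ENI - 1) + 2; for m7g.large that is 3 ENIs with 10 IPv4 addresses each (per the EC2 ENI limits, which I am assuming here):

# max pods = ENIs * (IPv4 addresses per ENI - 1) + 2
# m7g.large: 3 ENIs, 10 IPv4 addresses per ENI
echo $(( 3 * (10 - 1) + 2 ))    # prints 29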
kubectl get nodes
NAME STATUS ROLES AGE VERSION
ip-10-1-6-216.eu-west-1.compute.internal Ready <none> 3h33m v1.26.4-eks-597964d
ip-10-1-8-251.eu-west-1.compute.internal Ready <none> 3h33m v1.26.4-eks-597964d
ip-10-1-8-37.eu-west-1.compute.internal Ready <none> 8m7s v1.26.4-eks-597964d
kubectl get pods -A -owide --field-selector spec.nodeName=ip-10-1-6-216.eu-west-1.compute.internal | wc -l
111
kubectl get pods -A -owide --field-selector spec.nodeName=ip-10-1-6-216.eu-west-1.compute.internal | grep Running | wc -l
29
kubectl get pods -A -owide --field-selector spec.nodeName=ip-10-1-8-251.eu-west-1.compute.internal | grep Running | wc -l
29
Running the same script shows with prefix delegation enabled the maximum limit for pods on that instance type is 110.
./max-pods-calculator.sh --instance-type m7g.large --cni-version 1.12.5-eksbuild.2 --cni-prefix-delegation-enabled
110
This matches the 'Allocatable' section when describing each of the initial nodes: 'pods: 110'.
Scaling the nodes to 0 and launching new nodes exhibits the same issue: when a node of that instance type reaches 29 allocated IPv4 addresses, all new pods get stuck in the 'ContainerCreating' status.
As shown below, 'ENABLE_PREFIX_DELEGATION' is enabled and 'WARM_PREFIX_TARGET' is set for the VPC CNI.
Shouldn't this configuration allow all nodes to use prefix delegation and successfully schedule pods beyond the usual limit?
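To double-check whether prefix delegation is actually taking effect on a given node, I believe the attached ENIs can be inspected for assigned /28 prefixes (the instance ID below is a placeholder):

aws ec2 describe-network-interfaces \
  --filters Name=attachment.instance-id,Values=i-0123456789abcdef0 \
  --query 'NetworkInterfaces[].Ipv4Prefixes[].Ipv4Prefix'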
I deployed a new VPC and EKS cluster, there's nothing else running in either.
VPC configuration:

| VPC CIDR | 10.1.0.0/19 |
|---|---|
| AWS Region | eu-west-1 |
| Addressable Hosts | 8178 |
| Spare Capacity | 2046 |

| Subnet Type | Subnet address | Range of addresses | Useable IPs | Hosts |
|---|---|---|---|---|
| Public | 10.1.0.0/22 | 10.1.0.0 - 10.1.3.255 | 10.1.0.1 - 10.1.3.254 | 1022 |
| Public | 10.1.4.0/22 | 10.1.4.0 - 10.1.7.255 | 10.1.4.1 - 10.1.7.254 | 1022 |
| Public | 10.1.8.0/22 | 10.1.8.0 - 10.1.11.255 | 10.1.8.1 - 10.1.11.254 | 1022 |
| Private | 10.1.12.0/22 | 10.1.12.0 - 10.1.15.255 | 10.1.12.1 - 10.1.15.254 | 1022 |
| Private | 10.1.16.0/22 | 10.1.16.0 - 10.1.19.255 | 10.1.16.1 - 10.1.19.254 | 1022 |
| Private | 10.1.20.0/22 | 10.1.20.0 - 10.1.23.255 | 10.1.20.1 - 10.1.23.254 | 1022 |
| Spare | 10.1.24.0/21 | 10.1.24.0 - 10.1.31.255 | 10.1.24.1 - 10.1.31.254 | 2046 |
Kubernetes nodes:

| Name | Status | Version | IP |
|---|---|---|---|
| ip-10-1-6-216.eu-west-1.compute.internal | Ready | v1.26.4-eks-597964d | 10.1.6.216 |
| ip-10-1-8-251.eu-west-1.compute.internal | Ready | v1.26.4-eks-597964d | 10.1.8.251 |
Subnet 10.1.0.0/22 with node 'ip-10-1-6-216.eu-west-1.compute.internal' has 998 available IPv4 addresses. Subnet 10.1.8.0/22 with node 'ip-10-1-8-251.eu-west-1.compute.internal' has 999 available IPv4 addresses.
I'm using the Amazon VPC CNI plugin for Kubernetes Amazon EKS add-on.
I have added these configuration values:
{
  "env": {
    "ENABLE_PREFIX_DELEGATION": "true",
    "WARM_PREFIX_TARGET": "1"
  }
}
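The values were applied to the managed add-on roughly like this (the cluster name is a placeholder):

aws eks update-addon --cluster-name my-cluster --addon-name vpc-cni \
  --configuration-values '{"env":{"ENABLE_PREFIX_DELEGATION":"true","WARM_PREFIX_TARGET":"1"}}'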
I can see the configuration values are applied:
kubectl describe node | grep Allocatable -A 10 | grep pods
pods: 110
pods: 110
kubectl -n kube-system describe pod aws-node-77sh5 | grep -i prefix
ENABLE_PREFIX_DELEGATION: true
WARM_PREFIX_TARGET: 1
kubectl -n kube-system describe pod aws-node-sv8qb | grep -i prefix
ENABLE_PREFIX_DELEGATION: true
WARM_PREFIX_TARGET: 1
I have one pod running in a test namespace:
kubectl get pods
NAME READY STATUS RESTARTS AGE
ahoy-hello-world-79f545d68-zp6gm 1/1 Running 0 18m
AWSNodeTemplate:
apiVersion: karpenter.k8s.aws/v1alpha1
kind: AWSNodeTemplate
metadata:
  name: default
spec:
  subnetSelector:
    karpenter.sh/discovery: ${cluster_name}
  securityGroupSelector:
    karpenter.sh/discovery: ${cluster_name}
  tags:
    karpenter.sh/discovery: ${cluster_name}
  amiFamily: Bottlerocket
Provisioner:
apiVersion: karpenter.sh/v1alpha5
kind: Provisioner
metadata:
  name: default
spec:
  requirements:
    - key: karpenter.sh/capacity-type
      operator: In
      values: ["spot"]
  limits:
    resources:
      cpu: 1000
  providerRef:
    name: default
  ttlSecondsAfterEmpty: 30
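As a sanity check, the reservedENIs value can be confirmed in Karpenter's global settings (assuming the default karpenter-global-settings ConfigMap in the karpenter namespace):

kubectl -n karpenter get configmap karpenter-global-settings -o yaml | grep -i reservedenis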
Karpenter logs are uneventful.
I scaled up a deployment to 250 pods. Karpenter adds a new node successfully and pods get scheduled on it.
New Kubernetes node:

| Name | Status | Version | IP |
|---|---|---|---|
| ip-10-1-8-85.eu-west-1.compute.internal | Ready | v1.26.4-eks-597964d | 10.1.11.1 |
The new node was deployed in subnet 10.1.8.0/22 which now has 924 IPv4 addresses available, a change of -75.
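For reference, the available-address figures come from the subnet's AvailableIpAddressCount (the subnet ID below is a placeholder):

aws ec2 describe-subnets --subnet-ids subnet-0123456789abcdef0 \
  --query 'Subnets[].AvailableIpAddressCount'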
162 pods are stuck in the 'ContainerCreating' state.
kubectl get pods | grep ContainerCreating | wc -l
162
88 pods are in the 'Running' state.
kubectl get pods | grep Running | wc -l
88
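To see where the stuck pods landed, one quick check (assuming the default kubectl get pods -owide column layout, where the node name is column 7) is to group them by node:

kubectl get pods -owide | grep ContainerCreating | awk '{print $7}' | sort | uniq -c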
Version
Karpenter Version: 0.28.0-rc.2
Kubernetes Version: v1.26
Expected Behavior
Raised under this now closed issue: Karpenter is not aware of the Custom Networking VPC CNI pod limit per node
@bwagner5 https://github.com/aws/karpenter/issues/2273#issuecomment-1551853677
Running 0.28.0-rc.2 with the option aws.reservedENIs set to 1 as described, pods should be assigned an IP address.
Actual Behavior
The pod fails to get assigned an IP address and does not start.
Steps to Reproduce the Problem
Scale a deployment beyond what current nodes can handle to ensure a new node is provisioned by Karpenter.