genseb13011 opened 1 week ago
Can you provide your NodePool and EC2NodeClass configuration? Also, have you followed our troubleshooting guide around this issue? https://karpenter.sh/docs/troubleshooting/#cni-is-unable-to-allocate-ips-to-pods
Thanks for your answer.
Please find the NodePool and EC2NodeClass configurations below.
NodePool configuration:

apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: knodes-app-spot-nodepool
spec:
  disruption:
    budgets:
    - nodes: 10%
    consolidateAfter: 2m
    consolidationPolicy: WhenEmptyOrUnderutilized
  template:
    metadata: {}
    spec:
      expireAfter: 720h0m0s
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: default
      requirements:
      - key: karpenter.k8s.aws/instance-hypervisor
        operator: In
        values:
        - nitro
      - key: kubernetes.io/arch
        operator: In
        values:
        - arm64
        - amd64
      - key: nodegroup
        operator: In
        values:
        - knodes-app
      - key: karpenter.k8s.aws/instance-category
        operator: In
        values:
        - r
        - x
      - key: karpenter.sh/capacity-type
        operator: In
        values:
        - spot
      - key: topology.kubernetes.io/zone
        operator: In
        values:
        - eu-west-1c
      - key: karpenter.k8s.aws/instance-cpu
        operator: Gt
        values:
        - "2"
      - key: karpenter.k8s.aws/instance-cpu
        operator: Lt
        values:
        - "33"
      - key: kubernetes.io/os
        operator: In
        values:
        - linux
  weight: 100
EC2NodeClass configuration:

apiVersion: karpenter.k8s.aws/v1
kind: EC2NodeClass
metadata:
  name: default
spec:
  amiSelectorTerms:
  - alias: al2@latest
  blockDeviceMappings:
  - deviceName: /dev/xvda
    ebs:
      deleteOnTermination: true
      encrypted: true
      kmsKeyID: arn:aws:kms:xxxxx:xxxxxxxxx:key/xxxxxxxxxxxxxxxxxx
      volumeSize: 200Gi
      volumeType: gp3
  instanceProfile: nodes-karpenter-NodeInstanceProfile
  kubelet:
    evictionHard:
      memory.available: 2%
      nodefs.available: 10%
      nodefs.inodesFree: 5%
    evictionSoft:
      memory.available: 3%
    evictionSoftGracePeriod:
      memory.available: 2m0s
    podsPerCore: 12
    systemReserved:
      memory: 300Mi
  metadataOptions:
    httpEndpoint: enabled
    httpProtocolIPv6: disabled
    httpPutResponseHopLimit: 2
    httpTokens: required
  securityGroupSelectorTerms:
  - id: sg-xxxxxxxxxxxxxxxxx
  subnetSelectorTerms:
  - id: subnet-xxxxxxxxxxxxxxx
  - id: subnet-xxxxxxxxxxxxxxx
  - id: subnet-xxxxxxxxxxxxxxx
N.B.: I've added "podsPerCore" to "fix" the issue temporarily.
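For reference, the approach the troubleshooting guide points at is a hard per-node pod cap via kubelet's maxPods rather than a per-core ratio. A minimal sketch of what that could look like in this EC2NodeClass; the value 234 is an assumption derived from the r5b.8xlarge ENI limits discussed later in this thread, not a verified recommendation, and it would need sizing against the smallest instance type the NodePool can pick:

apiVersion: karpenter.k8s.aws/v1
kind: EC2NodeClass
metadata:
  name: default
spec:
  kubelet:
    # Hard cap on schedulable pods. Once a node reaches this count,
    # kube-scheduler treats it as full and Karpenter provisions a new node,
    # instead of the CNI failing to assign IPs to extra pods.
    maxPods: 234   # illustrative value; size to the ENI/IP limit of the chosen instance types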
Yes, I've read the troubleshooting section, but:
Thanks again.
I confirm that we don't use the "Security Groups per Pod" feature.
Another thing to mention is that, when the last issue occurred:
so the number of pods and the number of IPs were not aligned (I don't know if this behaviour is "normal").
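One possible explanation for the mismatch (an assumption, not something confirmed from these events): the VPC CNI keeps a warm pool of IPs attached to each node beyond what running pods consume, so the number of attached IPs can legitimately exceed the pod count. The warm pool is tuned through environment variables on the aws-node DaemonSet; a minimal sketch with illustrative values:

# Illustrative warm-pool tuning for the VPC CNI (aws-node DaemonSet in
# kube-system); the values are examples, not recommendations.
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: aws-node
  namespace: kube-system
spec:
  template:
    spec:
      containers:
      - name: aws-node
        env:
        - name: WARM_IP_TARGET     # spare IPs to keep attached per node
          value: "5"
        - name: MINIMUM_IP_TARGET  # floor of IPs attached to each node
          value: "30"

In practice this is usually applied with kubectl set env on the existing DaemonSet rather than by editing the manifest directly.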
Seb.
I'm adding some information about my issue:
Even with "podsPerCore: 12" I'm still facing the issue.
The instance type is "r5b.8xlarge" (240 IPs max).
Same behaviour: only 157 pods are assigned to it, yet 232 secondary IPs + 8 "primary" ones are allocated.
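For context, here is the standard EKS max-pods arithmetic for this instance type, assuming the published ENI limits for r5b.8xlarge (8 ENIs with 30 IPv4 addresses each):

# IP capacity:  8 ENIs x 30 IPv4 addresses = 240 IPs
# Pod capacity: 8 x (30 - 1) + 2 = 234 (each ENI's primary IP is not
#               assignable to pods; +2 for host-network system pods)
# Observed:     232 secondary + 8 primary = 240, so the node's IP capacity
#               is fully allocated even though only 157 pods are scheduled.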
Description
Observed Behavior:
Pods are stuck in "ContainerCreating" status with the error below:
Warning FailedCreatePodSandBox 2m47s (x1642 over 6h) kubelet (combined from similar events): Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "71ef22cdaf65163adca4c97ed66df6a7cdcdcbe7c011d0ff62a77648cba5b46b": plugin type="aws-cni" name="aws-cni" failed (add): add cmd: failed to assign an IP address to container
Expected Behavior:
Karpenter should detect that the maximum number of allowed IPs has been reached on the node and provision a new one.
Reproduction Steps:
Versions:
Kubernetes Version (kubectl version): EKS 1.28
Other information: