Closed Tasmana-banana closed 2 years ago
Simple test deployment
apiVersion: apps/v1
kind: Deployment
metadata:
  name: inflate
spec:
  replicas: 1
  selector:
    matchLabels:
      app: inflate
  template:
    metadata:
      labels:
        app: inflate
    spec:
      # nodeSelector:
      #   karpenter.sh/capacity-type: on-demand
      #   node.kubernetes.io/instance-type: t3a.medium
      containers:
        - name: inflate
          image: public.ecr.aws/eks-distro/kubernetes/pause:3.2
          resources:
            requests:
              cpu: 1
Hello @Tasmana-banana, do you have any DaemonSets running in this cluster that are also requesting CPU resources?
Karpenter takes the resources required by DaemonSets into account when scheduling pods. If the sum of DaemonSet and pod resource requests is greater than what's available on a t3a.medium, then Karpenter will not be able to launch a node.
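To make the capacity math concrete, here is a rough back-of-the-envelope check. The numbers below are assumptions for illustration (typical defaults for a t3a.medium and the aws-node/kube-proxy DaemonSets), not values read from this cluster:

```shell
# Rough capacity check (all numbers are assumed, in millicores):
# a t3a.medium has 2 vCPU; kubelet/system reservations leave roughly
# 1930m allocatable. Subtract typical DaemonSet requests and compare
# the remainder to the pending pod's request.
allocatable=1930          # approximate allocatable CPU on a t3a.medium
daemonsets=$((10 + 100))  # aws-node (~10m) + kube-proxy (~100m), assumed
pod=1000                  # the inflate pod requests cpu: 1
remaining=$((allocatable - daemonsets))
if [ "$remaining" -ge "$pod" ]; then echo "fits"; else echo "does not fit"; fi
```

With these assumed numbers the pod fits comfortably, which is why DaemonSet overhead alone would be a surprising explanation here; checking the actual requests with `kubectl describe node` is the reliable way to confirm.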
Thanks for the answer @dewjam. No, it's a new EKS cluster.
Output of kubectl get daemonsets -A:
NAMESPACE NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE
kube-system aws-node 3 3 3 3 3 <none> 11h
kube-system kube-proxy 3 3 3 3 3 <none> 11h
kubectl get pods -A
NAMESPACE NAME READY STATUS RESTARTS AGE
default inflate-59664786cf-4ng7k 1/1 Running 0 91m
default inflate-59664786cf-hjg7k 1/1 Running 0 27m
default inflate-59664786cf-rb5dz 0/1 Pending 0 117s
karpenter karpenter-577fb865d7-wjpnp 2/2 Running 0 117s
kube-system aws-node-8fxct 1/1 Running 0 25m
kube-system aws-node-klbm7 1/1 Running 0 11h
kube-system aws-node-t6pp2 1/1 Running 0 11h
kube-system coredns-86d9946576-f7ns8 1/1 Running 0 11h
kube-system coredns-86d9946576-ts4jc 1/1 Running 0 11h
kube-system kube-proxy-jm8fr 1/1 Running 0 25m
kube-system kube-proxy-kgdr9 1/1 Running 0 11h
kube-system kube-proxy-shxnz 1/1 Running 0 11h
kube-system metrics-server-6594d67d48-s8vb7 1/1 Running 0 9h
Would you mind providing the output of the below commands as well?
kubectl get pods -A -o wide
@dewjam
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
default inflate-59664786cf-4ng7k 1/1 Running 0 16h 10.65.5.38 ip-10-65-5-129.us-west-2.compute.internal <none> <none>
default inflate-59664786cf-5fpfh 0/1 Pending 0 106s <none> <none> <none> <none>
default inflate-59664786cf-wcd4w 1/1 Running 0 107s 10.65.8.230 ip-10-65-9-49.us-west-2.compute.internal <none> <none>
karpenter karpenter-577fb865d7-wjpnp 2/2 Running 0 15h 10.65.7.154 ip-10-65-7-49.us-west-2.compute.internal <none> <none>
kube-system aws-node-klbm7 1/1 Running 0 26h 10.65.5.129 ip-10-65-5-129.us-west-2.compute.internal <none> <none>
kube-system aws-node-rwj2g 1/1 Running 0 68s 10.65.9.49 ip-10-65-9-49.us-west-2.compute.internal <none> <none>
kube-system aws-node-t6pp2 1/1 Running 0 26h 10.65.7.49 ip-10-65-7-49.us-west-2.compute.internal <none> <none>
kube-system coredns-86d9946576-f7ns8 1/1 Running 0 26h 10.65.6.8 ip-10-65-7-49.us-west-2.compute.internal <none> <none>
kube-system coredns-86d9946576-ts4jc 1/1 Running 0 26h 10.65.4.115 ip-10-65-5-129.us-west-2.compute.internal <none> <none>
kube-system kube-proxy-6zq94 1/1 Running 0 68s 10.65.9.49 ip-10-65-9-49.us-west-2.compute.internal <none> <none>
kube-system kube-proxy-kgdr9 1/1 Running 0 26h 10.65.5.129 ip-10-65-5-129.us-west-2.compute.internal <none> <none>
kube-system kube-proxy-shxnz 1/1 Running 0 26h 10.65.7.49 ip-10-65-7-49.us-west-2.compute.internal <none> <none>
kube-system metrics-server-6594d67d48-s8vb7 1/1 Running 0 24h 10.65.4.230 ip-10-65-5-129.us-west-2.compute.internal <none> <none>
Thanks @Tasmana-banana . Unfortunately, I'm not able to reproduce this problem in my test setup. Just so I fully understand the situation, I have a couple more questions.
You mentioned you upgraded Karpenter from 0.5.3 to 0.9.1 in this cluster. Did you follow the upgrade guide here?
Looking at the output above, I see three nodes. I assume at least one of them was created by a Managed Node Group. Were the other two launched by Karpenter? If so, were they launched by Karpenter 0.9.1?
Thanks for questions @dewjam
I set up a completely new cluster with two nodes in a node group and installed Karpenter 0.9.1. Here is the relevant part of the Ansible playbook for the Karpenter controller:
kubernetes.core.helm:
  name: '{{ karpenter_namespace }}'
  create_namespace: true
  release_namespace: '{{ karpenter_namespace }}'
  chart_ref: 'karpenter/karpenter'
  chart_version: '0.9.1'
  release_values:
    serviceAccount:
      create: true
      name: 'karpenter'
      annotations:
        eks.amazonaws.com/role-arn: 'arn:aws:iam::{{ account_id }}:role/EKS-Karpenter-Role'
    clusterName: '{{ cluster_name }}'
    clusterEndpoint: '{{ cluster_endpoint }}'
    defaultProvisioner: false
    aws:
      defaultInstanceProfile: '{{ instance_profile }}'
No, the other two nodes belong to my cluster's node group.
OK, I see two of the inflate pods are running. Are they both running on the Managed Nodes? Or was one of those nodes launched by Karpenter?
ip-10-65-9-49.us-west-2.compute.internal looks like it was launched most recently, so I'm assuming this was launched by Karpenter, but wanted to be sure.
You're right @dewjam! ip-10-65-9-49.us-west-2.compute.internal was launched by Karpenter, but the pods that I scaled do not get scheduled onto it.
Hey @Tasmana-banana . My apologies for the delayed response. Have you tried another instance type by chance? Given at least one node was launched, I'm doubtful there is a permissions issue at play.
Have you tried an instance type other than t3a.medium? For example, can you try a t3.medium or an m5a.large?
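For reference, allowing several instance types in a v1alpha5 Provisioner (the CRD version used by Karpenter 0.9.x) looks roughly like this. The requirement values below are examples for this thread's suggestion, not a recommendation:

```yaml
apiVersion: karpenter.sh/v1alpha5
kind: Provisioner
metadata:
  name: default
spec:
  requirements:
    # Let Karpenter pick from several instance types instead of only t3a.medium
    - key: node.kubernetes.io/instance-type
      operator: In
      values: ["t3a.medium", "t3.medium", "m5a.large"]
    - key: karpenter.sh/capacity-type
      operator: In
      values: ["on-demand"]
```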
Hello @dewjam. Sorry, me too, I've just run out of ideas already. I tried to add a small instance.
Hey @Tasmana-banana , Just to confirm, were you able to try to add a different instance type to your provisioner spec?
Hey @Tasmana-banana, just following up on this issue. Were you able to find a workaround?
Hello @dewjam. Sorry, there is a war in my country and I can't investigate anything. You can close the issue. Maybe I will write to you after a few months. Take care of yourself!
Our thoughts are with you @Tasmana-banana ! Be safe!
Feel free to re-open this whenever you're ready.
@dewjam Getting this issue
incompatible with provisioner "control-plane-egress", no instance type satisfied resources {"cpu":"600m","memory":"1152Mi","pods":"1"} and requirements karpenter.k8s.aws/instance-size NotIn [16xlarge 18xlarge 24xlarge 32xlarge 48xlarge and 5 others], kubernetes.io/arch In [amd64], project In [control-plane], intent In [egress-karpenter], nodegroup-name In [control-plane-egress], karpenter.sh/provisioner-name In [control-plane-egress], topology.kubernetes.io/zone In [us-east-1b us-east-1d us-east-1e], kubernetes.io/os In [linux], karpenter.k8s.aws/instance-family In [c5 c5d c5n c6a c6i and 16 others], karpenter.sh/capacity-type In [on-demand spot]
Any possible recommendations or solution i can try?
I have specified a subnet selector in the AWS Node Template, trying to select a private subnet by its name:
subnetSelector:
  Name: eks-egress-nat
Karpenter version: 0.21.1 EKS version: 1.23
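For context, a subnetSelector that matches on the subnet's Name tag in an AWSNodeTemplate (the CRD used by Karpenter v0.21.x) looks roughly like this. The subnet name is taken from the comment above; the template name and security group selector are assumptions to make the sketch complete:

```yaml
apiVersion: karpenter.k8s.aws/v1alpha1
kind: AWSNodeTemplate
metadata:
  name: control-plane-egress   # assumed name, matching the provisioner in the error
spec:
  subnetSelector:
    Name: eks-egress-nat       # matches the AWS "Name" tag on the subnet
  securityGroupSelector:
    karpenter.sh/discovery: my-cluster   # assumption: adjust to your cluster
```

Note that the error above says no instance type satisfied the combined resource and requirement constraints, so loosening the `instance-size`/`instance-family` restrictions may also be worth trying.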
Can you open a new issue @zakariais?
@ellistarn https://github.com/aws/karpenter/issues/3211 Created
Version
Karpenter: v0.9.1
Kubernetes: v1.20.0
I used to run Karpenter v0.5.3, but I decided it was time to upgrade. After a lot of configuration fixes, I ran into an error that I cannot overcome. Please help me solve this problem; in which direction should I look?
Ansible template for the provisioner:
I tried to scale the app to 3 pods, but got some errors:
Update: the instance was created, but this error was still triggered. Can someone suggest a more fine-grained configuration?
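The reproduction step described above can be sketched as follows, assuming the inflate Deployment from earlier in the thread has been applied (hypothetical commands against a live cluster):

```shell
# Scale the test deployment to 3 replicas and watch where the pods land.
kubectl scale deployment/inflate --replicas=3
kubectl get pods -l app=inflate -o wide
```

The `-o wide` output shows the node each pod was scheduled on, which is how the Pending pod and the Karpenter-launched node were compared earlier in this thread.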