aws / karpenter-provider-aws

Karpenter is a Kubernetes Node Autoscaler built for flexibility, performance, and simplicity.
https://karpenter.sh
Apache License 2.0
6.82k stars 960 forks

[Question] Karpenter doesn't scale out #3995

Closed hosonfung closed 1 year ago

hosonfung commented 1 year ago

Hi all,

I deployed Karpenter in my EKS cluster and tried to scale the cluster out. It succeeded once, but after that Karpenter never scaled out again. Here is my YAML file.

apiVersion: karpenter.sh/v1alpha5
kind: Provisioner
metadata:
  name: karpenterprovisioner
  namespace: karpenter
spec:
  labels:
    app: karpenterprovisioner
  requirements:
    - key: node.kubernetes.io/instance-type
      operator: In
      values: ["c5a.2xlarge"]
    - key: topology.kubernetes.io/zone
      operator: In
      values: ["ap-east-1a"]
  kubeletConfiguration:
    maxPods: 110
    imageGCHighThresholdPercent: 30
    imageGCLowThresholdPercent: 25
  providerRef:
    name: default
  ttlSecondsUntilExpired: 2592000
  ttlSecondsAfterEmpty: 30
---
apiVersion: karpenter.k8s.aws/v1alpha1
kind: AWSNodeTemplate
metadata:
  name: default
  namespace: karpenter
spec:
  subnetSelector:
    karpenter.sh/discovery: mycluster
  securityGroupSelector:
    karpenter.sh/discovery: mycluster
  tags:
    name: "karpenter-provisioned"
status:
  subnets:
    - id: subnet-xxxxx
      zone: ap-east-1a
  securityGroups:
    - id: sg-xxxxxx

Here are my Karpenter pod logs as well:

2023-06-05T03:41:05.736Z DEBUG Successfully created the logger.
2023-06-05T03:41:05.736Z DEBUG Logging level set to: debug
{"level":"info","ts":1685936465.7392108,"logger":"fallback","caller":"injection/injection.go:63","msg":"Starting informers..."}
2023-06-05T03:41:05.839Z DEBUG controller waiting for configmaps {"commit": "698f22f-dirty"}
2023-06-05T03:41:06.438Z DEBUG controller waiting for configmaps {"commit": "698f22f-dirty"}
2023-06-05T03:41:06.938Z DEBUG controller waiting for configmaps {"commit": "698f22f-dirty"}
2023-06-05T03:41:07.439Z DEBUG controller waiting for configmaps {"commit": "698f22f-dirty"}
2023-06-05T03:41:07.940Z DEBUG controller waiting for configmaps {"commit": "698f22f-dirty"}
2023-06-05T03:41:08.499Z DEBUG controller discovered region {"commit": "698f22f-dirty", "region": "ap-east-1"}
2023-06-05T03:41:08.621Z DEBUG controller discovered cluster endpoint {"commit": "698f22f-dirty", "cluster-endpoint": "https://088DB5E389C0CBC8B6D2B960277004E6.gr7.ap-east-1.eks.amazonaws.com"}
2023-06-05T03:41:08.628Z DEBUG controller discovered kube dns {"commit": "698f22f-dirty", "kube-dns-ip": "172.20.0.10"}
2023-06-05T03:41:08.629Z DEBUG controller discovered version {"commit": "698f22f-dirty", "version": "v0.27.5"}
2023/06/05 03:41:08 Registering 2 clients
2023/06/05 03:41:08 Registering 2 informer factories
2023/06/05 03:41:08 Registering 3 informers
2023/06/05 03:41:08 Registering 5 controllers
2023-06-05T03:41:08.631Z INFO controller Starting server {"commit": "698f22f-dirty", "path": "/metrics", "kind": "metrics", "addr": "[::]:8080"}
2023-06-05T03:41:08.632Z INFO controller Starting server {"commit": "698f22f-dirty", "kind": "health probe", "addr": "[::]:8081"}
I0605 03:41:08.732946 1 leaderelection.go:248] attempting to acquire leader lease karpenter/karpenter-leader-election...
2023-06-05T03:41:08.765Z INFO controller Starting informers... {"commit": "698f22f-dirty"}
I0605 03:41:25.900526 1 leaderelection.go:258] successfully acquired lease karpenter/karpenter-leader-election
2023-06-05T03:41:25.900Z INFO controller.provisioner starting controller {"commit": "698f22f-dirty"}
2023-06-05T03:41:25.901Z INFO controller.deprovisioning starting controller {"commit": "698f22f-dirty"}
2023-06-05T03:41:25.901Z INFO controller.metric_scraper starting controller {"commit": "698f22f-dirty"}
2023-06-05T03:41:25.901Z INFO controller Starting EventSource {"commit": "698f22f-dirty", "controller": "daemonset", "controllerGroup": "apps", "controllerKind": "DaemonSet", "source": "kind source: v1.DaemonSet"}
2023-06-05T03:41:25.901Z INFO controller Starting Controller {"commit": "698f22f-dirty", "controller": "daemonset", "controllerGroup": "apps", "controllerKind": "DaemonSet"}
2023-06-05T03:41:25.901Z INFO controller Starting EventSource {"commit": "698f22f-dirty", "controller": "provisioner_trigger", "controllerGroup": "", "controllerKind": "Pod", "source": "kind source: v1.Pod"}
2023-06-05T03:41:25.901Z INFO controller Starting Controller {"commit": "698f22f-dirty", "controller": "provisioner_trigger", "controllerGroup": "", "controllerKind": "Pod"}
2023-06-05T03:41:25.901Z INFO controller Starting EventSource {"commit": "698f22f-dirty", "controller": "pod_state", "controllerGroup": "", "controllerKind": "Pod", "source": "kind source: v1.Pod"}
2023-06-05T03:41:25.901Z INFO controller Starting Controller {"commit": "698f22f-dirty", "controller": "pod_state", "controllerGroup": "", "controllerKind": "Pod"}
2023-06-05T03:41:25.902Z INFO controller Starting EventSource {"commit": "698f22f-dirty", "controller": "node_state", "controllerGroup": "", "controllerKind": "Node", "source": "kind source: v1.Node"}
2023-06-05T03:41:25.902Z INFO controller Starting Controller {"commit": "698f22f-dirty", "controller": "node_state", "controllerGroup": "", "controllerKind": "Node"}
2023-06-05T03:41:25.902Z INFO controller Starting EventSource {"commit": "698f22f-dirty", "controller": "node", "controllerGroup": "", "controllerKind": "Node", "source": "kind source: v1.Node"}
2023-06-05T03:41:25.902Z INFO controller Starting EventSource {"commit": "698f22f-dirty", "controller": "node", "controllerGroup": "", "controllerKind": "Node", "source": "kind source: v1alpha5.Provisioner"}
2023-06-05T03:41:25.902Z INFO controller Starting EventSource {"commit": "698f22f-dirty", "controller": "node", "controllerGroup": "", "controllerKind": "Node", "source": "kind source: v1.Pod"}
2023-06-05T03:41:25.902Z INFO controller Starting Controller {"commit": "698f22f-dirty", "controller": "node", "controllerGroup": "", "controllerKind": "Node"}
2023-06-05T03:41:25.902Z INFO controller Starting EventSource {"commit": "698f22f-dirty", "controller": "provisioner_state", "controllerGroup": "karpenter.sh", "controllerKind": "Provisioner", "source": "kind source: v1alpha5.Provisioner"}
2023-06-05T03:41:25.902Z INFO controller Starting Controller {"commit": "698f22f-dirty", "controller": "provisioner_state", "controllerGroup": "karpenter.sh", "controllerKind": "Provisioner"}
2023-06-05T03:41:25.902Z INFO controller Starting EventSource {"commit": "698f22f-dirty", "controller": "pod_metrics", "controllerGroup": "", "controllerKind": "Pod", "source": "kind source: v1.Pod"}
2023-06-05T03:41:25.902Z INFO controller Starting Controller {"commit": "698f22f-dirty", "controller": "pod_metrics", "controllerGroup": "", "controllerKind": "Pod"}
2023-06-05T03:41:25.903Z INFO controller Starting EventSource {"commit": "698f22f-dirty", "controller": "termination", "controllerGroup": "", "controllerKind": "Node", "source": "kind source: v1.Node"}
2023-06-05T03:41:25.903Z INFO controller Starting Controller {"commit": "698f22f-dirty", "controller": "termination", "controllerGroup": "", "controllerKind": "Node"}
2023-06-05T03:41:25.903Z INFO controller Starting EventSource {"commit": "698f22f-dirty", "controller": "counter", "controllerGroup": "karpenter.sh", "controllerKind": "Provisioner", "source": "kind source: v1alpha5.Provisioner"}
2023-06-05T03:41:25.903Z INFO controller Starting EventSource {"commit": "698f22f-dirty", "controller": "counter", "controllerGroup": "karpenter.sh", "controllerKind": "Provisioner", "source": "kind source: v1.Node"}
2023-06-05T03:41:25.903Z INFO controller Starting Controller {"commit": "698f22f-dirty", "controller": "counter", "controllerGroup": "karpenter.sh", "controllerKind": "Provisioner"}
2023-06-05T03:41:25.903Z INFO controller Starting EventSource {"commit": "698f22f-dirty", "controller": "provisioner_metrics", "controllerGroup": "karpenter.sh", "controllerKind": "Provisioner", "source": "kind source: v1alpha5.Provisioner"}
2023-06-05T03:41:25.903Z INFO controller Starting Controller {"commit": "698f22f-dirty", "controller": "provisioner_metrics", "controllerGroup": "karpenter.sh", "controllerKind": "Provisioner"}
2023-06-05T03:41:25.903Z INFO controller Starting EventSource {"commit": "698f22f-dirty", "controller": "awsnodetemplate", "controllerGroup": "karpenter.k8s.aws", "controllerKind": "AWSNodeTemplate", "source": "kind source: v1alpha1.AWSNodeTemplate"}
2023-06-05T03:41:25.903Z INFO controller Starting Controller {"commit": "698f22f-dirty", "controller": "awsnodetemplate", "controllerGroup": "karpenter.k8s.aws", "controllerKind": "AWSNodeTemplate"}
2023-06-05T03:41:25.903Z INFO controller Starting EventSource {"commit": "698f22f-dirty", "controller": "consistency", "controllerGroup": "", "controllerKind": "Node", "source": "kind source: *v1.Node"}
2023-06-05T03:41:25.903Z INFO controller Starting Controller {"commit": "698f22f-dirty", "controller": "consistency", "controllerGroup": "", "controllerKind": "Node"}
2023-06-05T03:41:25.903Z INFO controller.pricing starting controller {"commit": "698f22f-dirty"}
2023-06-05T03:41:25.968Z DEBUG controller hydrated launch template cache {"commit": "698f22f-dirty", "tag-key": "karpenter.k8s.aws/cluster", "tag-value": "socam-k8s-cluster-2", "count": 0}
2023-06-05T03:41:25.989Z INFO controller.pricing updated spot pricing with instance types and offerings {"commit": "698f22f-dirty", "instance-type-count": 636, "offering-count": 573}
2023-06-05T03:41:26.002Z INFO controller Starting workers {"commit": "698f22f-dirty", "controller": "pod_state", "controllerGroup": "", "controllerKind": "Pod", "worker count": 10}
2023-06-05T03:41:26.004Z INFO controller Starting workers {"commit": "698f22f-dirty", "controller": "consistency", "controllerGroup": "", "controllerKind": "Node", "worker count": 10}
2023-06-05T03:41:26.004Z DEBUG controller.deprovisioning waiting on cluster sync {"commit": "698f22f-dirty"}
2023-06-05T03:41:26.004Z INFO controller Starting workers {"commit": "698f22f-dirty", "controller": "daemonset", "controllerGroup": "apps", "controllerKind": "DaemonSet", "worker count": 10}
2023-06-05T03:41:26.004Z INFO controller Starting workers {"commit": "698f22f-dirty", "controller": "provisioner_trigger", "controllerGroup": "", "controllerKind": "Pod", "worker count": 10}
2023-06-05T03:41:26.006Z INFO controller Starting workers {"commit": "698f22f-dirty", "controller": "node_state", "controllerGroup": "", "controllerKind": "Node", "worker count": 10}
2023-06-05T03:41:26.006Z INFO controller Starting workers {"commit": "698f22f-dirty", "controller": "provisioner_state", "controllerGroup": "karpenter.sh", "controllerKind": "Provisioner", "worker count": 10}
2023-06-05T03:41:26.006Z INFO controller Starting workers {"commit": "698f22f-dirty", "controller": "pod_metrics", "controllerGroup": "", "controllerKind": "Pod", "worker count": 1}
2023-06-05T03:41:26.024Z INFO controller Starting workers {"commit": "698f22f-dirty", "controller": "node", "controllerGroup": "", "controllerKind": "Node", "worker count": 10}
2023-06-05T03:41:26.024Z INFO controller Starting workers {"commit": "698f22f-dirty", "controller": "termination", "controllerGroup": "", "controllerKind": "Node", "worker count": 100}
2023-06-05T03:41:26.024Z INFO controller Starting workers {"commit": "698f22f-dirty", "controller": "provisioner_metrics", "controllerGroup": "karpenter.sh", "controllerKind": "Provisioner", "worker count": 1}
2023-06-05T03:41:26.024Z INFO controller Starting workers {"commit": "698f22f-dirty", "controller": "awsnodetemplate", "controllerGroup": "karpenter.k8s.aws", "controllerKind": "AWSNodeTemplate", "worker count": 10}
2023-06-05T03:41:26.024Z INFO controller Starting workers {"commit": "698f22f-dirty", "controller": "counter", "controllerGroup": "karpenter.sh", "controllerKind": "Provisioner", "worker count": 10}
2023-06-05T03:41:26.099Z DEBUG controller.awsnodetemplate discovered subnets {"commit": "698f22f-dirty", "awsnodetemplate": "default", "subnets": ["subnet-0c25e4220474a6ce4 (ap-east-1a)", "subnet-0a60d7b2b1bfae40c (ap-east-1b)"]}
2023-06-05T03:41:26.199Z DEBUG controller.awsnodetemplate discovered security groups {"commit": "698f22f-dirty", "awsnodetemplate": "default", "security-groups": ["sg-0e072d7a2675ead26"]}
2023-06-05T03:41:27.237Z DEBUG controller.deprovisioning discovered instance types {"commit": "698f22f-dirty", "count": 194}
2023-06-05T03:41:27.286Z DEBUG controller.deprovisioning discovered offerings for instance types {"commit": "698f22f-dirty", "zones": ["ap-east-1a", "ap-east-1b"], "instance-type-count": 194, "node-template": "default"}
2023-06-05T03:41:27.498Z INFO controller.pricing updated on-demand pricing {"commit": "698f22f-dirty", "instance-type-count": 194}

jicowan commented 1 year ago

Do you have to run c5a.2xlarge instances in ap-east-1a, or can you use different instance types and availability zones?

engedaam commented 1 year ago

When you attempt to scale out, are there any pending pods?

hosonfung commented 1 year ago

No, all the pods are in the Running state.

hosonfung commented 1 year ago

I can change the instance type, but I want to keep the zone.

engedaam commented 1 year ago

Karpenter only considers scaling out when there are pending pods that can't be scheduled onto the existing nodes in the cluster, whether because of resource constraints or other limitations. If all pods are Running, there is nothing for it to provision. With requirements in your Provisioner, you can select the zones in which instances are launched.
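
As a rough sketch of how this could look for the setup in this thread, the manifests below keep the zone pinned to ap-east-1a while allowing more than one instance type, and add a test workload whose CPU requests may leave pods Pending. The extra instance types (c5.2xlarge, m5.2xlarge), the Deployment name `inflate`, the pause image, and the replica count are illustrative assumptions rather than anything confirmed in this issue, so check instance availability in the zone before relying on them.

```yaml
# Sketch only: the instance types other than c5a.2xlarge are assumed examples.
apiVersion: karpenter.sh/v1alpha5
kind: Provisioner
metadata:
  name: karpenterprovisioner
spec:
  requirements:
    # Several instance types give Karpenter more options if one is unavailable.
    - key: node.kubernetes.io/instance-type
      operator: In
      values: ["c5a.2xlarge", "c5.2xlarge", "m5.2xlarge"]
    # The zone stays pinned, as requested in the thread.
    - key: topology.kubernetes.io/zone
      operator: In
      values: ["ap-east-1a"]
  providerRef:
    name: default
  ttlSecondsAfterEmpty: 30
---
# Hypothetical test workload: each replica requests one full CPU.
# Raise replicas until the existing nodes cannot fit them and pods stay Pending.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: inflate
spec:
  replicas: 5
  selector:
    matchLabels:
      app: inflate
  template:
    metadata:
      labels:
        app: inflate
    spec:
      containers:
        - name: inflate
          image: public.ecr.aws/eks-distro/kubernetes/pause:3.7
          resources:
            requests:
              cpu: "1"
```

If the replicas all fit on the current nodes, no pods go Pending and Karpenter has no reason to launch anything, which would match the behavior described above.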