Can you provide an example scenario where Karpenter deletes an empty node it just launched, even though pods are pending?
I'll try. I was just seeing a huge amount of node turnover when consolidation was switched on: newer nodes were being picked for deletion and then extra ones were being provisioned. It seems like it needs to be more configurable, so that empty(ish) nodes are left in place for a period of time before being removed.
Consolidation uses an automatically adjusted stabilization window; if there are pending pods, unready replicasets, etc., this window grows up to 5 minutes long. It also waits for a node to be fully initialized (ready, all extended resources registered, any startup taints removed, etc.) before considering it. The net effect is that it shouldn't delete a newly launched empty node, as there will be pending pods which delay any consolidation decisions.
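For illustration, here is a minimal conceptual sketch of that kind of stabilization check (not Karpenter's actual implementation; the types, names, and base window below are assumptions):

```go
package main

import (
	"fmt"
	"time"
)

// clusterState is a hypothetical snapshot of what consolidation inspects
// before it is willing to act.
type clusterState struct {
	pendingPods        int
	unreadyReplicaSets int
	lastChange         time.Time
}

// stabilizationWindow grows when the cluster is still settling (pending pods,
// unready replicasets), capped at 5 minutes as described above.
func stabilizationWindow(s clusterState) time.Duration {
	window := 30 * time.Second // assumed base window, purely illustrative
	if s.pendingPods > 0 || s.unreadyReplicaSets > 0 {
		window = 5 * time.Minute
	}
	return window
}

// canConsolidate returns true only once the cluster has been stable for the
// full window, so a freshly launched node with pods still pending is left alone.
func canConsolidate(s clusterState, now time.Time) bool {
	return now.Sub(s.lastChange) >= stabilizationWindow(s)
}

func main() {
	s := clusterState{pendingPods: 3, lastChange: time.Now().Add(-1 * time.Minute)}
	fmt.Println(canConsolidate(s, time.Now())) // false: pending pods stretch the window
}
```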
Do you have the logs for when this occurred?
Just switched it on
karpenter-7fd86b488d-b7wfs controller 2022-09-06T18:07:43.064Z DEBUG controller.consolidation Relaxing soft constraints for pod since it previously failed to schedule, removing: spec.affinity.podAntiAffinity.preferredDuringSchedulingIgnoredDuringExecution[0]={"weight":1,"podAffinityTerm":{"labelSelector":{"matchExpressions":[{"key":"component","operator":"In","values":["gpt-normalization-service-async"]}]},"topologyKey":"failure-domain.beta.kubernetes.io/zone"}} {"commit": "b157d45", "pod": "enrichment-services-np-00/enrichment-services-np-00-gpt-normalization-service-async-t6nrz"}
karpenter-7fd86b488d-b7wfs controller 2022-09-06T18:07:43.064Z INFO controller.consolidation Consolidating via Delete, terminating 1 nodes ip-10-138-104-69.eu-west-1.compute.internal/c5a.2xlarge {"commit": "b157d45"}
karpenter-7fd86b488d-b7wfs controller 2022-09-06T18:07:43.092Z INFO controller.termination Cordoned node {"commit": "b157d45", "node": "ip-10-138-104-69.eu-west-1.compute.internal"}
karpenter-7fd86b488d-b7wfs controller 2022-09-06T18:08:20.631Z INFO controller.termination Deleted node {"commit": "b157d45", "node": "ip-10-138-104-69.eu-west-1.compute.internal"}
karpenter-7fd86b488d-b7wfs controller 2022-09-06T18:12:51.492Z DEBUG controller.consolidation Relaxing soft constraints for pod since it previously failed to schedule, removing: spec.affinity.podAntiAffinity.preferredDuringSchedulingIgnoredDuringExecution[0]={"weight":1,"podAffinityTerm":{"labelSelector":{"matchExpressions":[{"key":"component","operator":"In","values":["gpt-normalization-service-async"]}]},"topologyKey":"failure-domain.beta.kubernetes.io/zone"}} {"commit": "b157d45", "pod": "enrichment-services-np-00/enrichment-services-np-00-gpt-normalization-service-async-p672c"}
karpenter-7fd86b488d-b7wfs controller 2022-09-06T18:12:51.497Z DEBUG controller.consolidation Relaxing soft constraints for pod since it previously failed to schedule, removing: spec.affinity.podAntiAffinity.preferredDuringSchedulingIgnoredDuringExecution[0]={"weight":1,"podAffinityTerm":{"labelSelector":{"matchExpressions":[{"key":"component","operator":"In","values":["vtw-fetcher-service-sync"]}]},"topologyKey":"failure-domain.beta.kubernetes.io/zone"}} {"commit": "b157d45", "pod": "enrichment-services-np-00/enrichment-services-np-00-vtw-fetcher-service-sync-77cfb58hkbkr"}
karpenter-7fd86b488d-b7wfs controller 2022-09-06T18:12:51.500Z DEBUG controller.consolidation Relaxing soft constraints for pod since it previously failed to schedule, removing: spec.affinity.podAntiAffinity.preferredDuringSchedulingIgnoredDuringExecution[0]={"weight":1,"podAffinityTerm":{"labelSelector":{"matchExpressions":[{"key":"component","operator":"In","values":["vtw-fetcher-service-sync"]}]},"topologyKey":"failure-domain.beta.kubernetes.io/zone"}} {"commit": "b157d45", "pod": "enrichment-services-np-01/enrichment-services-np-01-vtw-fetcher-service-sync-5d4986c557tt"}
karpenter-7fd86b488d-b7wfs controller 2022-09-06T18:12:51.501Z INFO controller.consolidation Consolidating via Delete, terminating 1 nodes ip-10-138-100-234.eu-west-1.compute.internal/m3.2xlarge {"commit": "b157d45"}
karpenter-7fd86b488d-b7wfs controller 2022-09-06T18:12:51.533Z INFO controller.termination Cordoned node {"commit": "b157d45", "node": "ip-10-138-100-234.eu-west-1.compute.internal"}
karpenter-7fd86b488d-b7wfs controller 2022-09-06T18:13:33.795Z INFO controller.termination Deleted node {"commit": "b157d45", "node": "ip-10-138-100-234.eu-west-1.compute.internal"}
karpenter-7fd86b488d-b7wfs controller 2022-09-06T18:17:59.825Z DEBUG controller.consolidation Relaxing soft constraints for pod since it previously failed to schedule, removing: spec.affinity.podAntiAffinity.preferredDuringSchedulingIgnoredDuringExecution[0]={"weight":1,"podAffinityTerm":{"labelSelector":{"matchExpressions":[{"key":"component","operator":"In","values":["gpt-normalization-service-sync"]}]},"topologyKey":"failure-domain.beta.kubernetes.io/zone"}} {"commit": "b157d45", "pod": "enrichment-services-np-00/enrichment-services-np-00-gpt-normalization-service-sync-54hpbf"}
karpenter-7fd86b488d-b7wfs controller 2022-09-06T18:17:59.830Z DEBUG controller.consolidation Relaxing soft constraints for pod since it previously failed to schedule, removing: spec.affinity.podAntiAffinity.preferredDuringSchedulingIgnoredDuringExecution[0]={"weight":1,"podAffinityTerm":{"labelSelector":{"matchExpressions":[{"key":"component","operator":"In","values":["rx-chem-client-service-sync"]}]},"topologyKey":"failure-domain.beta.kubernetes.io/zone"}} {"commit": "b157d45", "pod": "enrichment-services-np-99/enrichment-services-np-99-rx-chem-client-service-sync-6c97dbkks"}
karpenter-7fd86b488d-b7wfs controller 2022-09-06T18:17:59.833Z DEBUG controller.consolidation Relaxing soft constraints for pod since it previously failed to schedule, removing: spec.affinity.podAntiAffinity.preferredDuringSchedulingIgnoredDuringExecution[0]={"weight":1,"podAffinityTerm":{"labelSelector":{"matchExpressions":[{"key":"component","operator":"In","values":["entity-filter-service-sync"]}]},"topologyKey":"failure-domain.beta.kubernetes.io/zone"}} {"commit": "b157d45", "pod": "enrichment-services-np-01/enrichment-services-np-01-entity-filter-service-sync-5fb8fdqlz2"}
karpenter-7fd86b488d-b7wfs controller 2022-09-06T18:17:59.837Z DEBUG controller.consolidation Relaxing soft constraints for pod since it previously failed to schedule, removing: spec.affinity.podAntiAffinity.preferredDuringSchedulingIgnoredDuringExecution[0]={"weight":1,"podAffinityTerm":{"labelSelector":{"matchExpressions":[{"key":"component","operator":"In","values":["rx-reg-client-service-sync"]}]},"topologyKey":"failure-domain.beta.kubernetes.io/zone"}} {"commit": "b157d45", "pod": "enrichment-services-np-01/enrichment-services-np-01-rx-reg-client-service-sync-795ddkmv4g"}
karpenter-7fd86b488d-b7wfs controller 2022-09-06T18:17:59.840Z DEBUG controller.consolidation Relaxing soft constraints for pod since it previously failed to schedule, removing: spec.affinity.podAntiAffinity.preferredDuringSchedulingIgnoredDuringExecution[0]={"weight":1,"podAffinityTerm":{"labelSelector":{"matchExpressions":[{"key":"component","operator":"In","values":["termite-service-async"]}]},"topologyKey":"failure-domain.beta.kubernetes.io/zone"}} {"commit": "b157d45", "pod": "enrichment-services-np-50/enrichment-services-np-50-termite-service-async-55fdc685c9crnlw"}
karpenter-7fd86b488d-b7wfs controller 2022-09-06T18:17:59.841Z INFO controller.consolidation Consolidating via Delete, terminating 1 nodes ip-10-138-106-46.eu-west-1.compute.internal/c5a.2xlarge {"commit": "b157d45"}
karpenter-7fd86b488d-b7wfs controller 2022-09-06T18:17:59.866Z INFO controller.termination Cordoned node {"commit": "b157d45", "node": "ip-10-138-106-46.eu-west-1.compute.internal"}
karpenter-7fd86b488d-b7wfs controller 2022-09-06T18:18:42.128Z INFO controller.termination Deleted node {"commit": "b157d45", "node": "ip-10-138-106-46.eu-west-1.compute.internal"}
karpenter-7fd86b488d-b7wfs controller 2022-09-06T18:23:08.679Z DEBUG controller.consolidation Relaxing soft constraints for pod since it previously failed to schedule, removing: spec.affinity.podAntiAffinity.preferredDuringSchedulingIgnoredDuringExecution[0]={"weight":1,"podAffinityTerm":{"labelSelector":{"matchExpressions":[{"key":"component","operator":"In","values":["gpt-normalization-service-sync"]}]},"topologyKey":"failure-domain.beta.kubernetes.io/zone"}} {"commit": "b157d45", "pod": "enrichment-services-np-99/enrichment-services-np-99-gpt-normalization-service-sync-7zq8m5"}
karpenter-7fd86b488d-b7wfs controller 2022-09-06T18:23:08.683Z DEBUG controller.consolidation Relaxing soft constraints for pod since it previously failed to schedule, removing: spec.affinity.podAntiAffinity.preferredDuringSchedulingIgnoredDuringExecution[0]={"weight":1,"podAffinityTerm":{"labelSelector":{"matchExpressions":[{"key":"component","operator":"In","values":["field-extraction-service-sync"]}]},"topologyKey":"failure-domain.beta.kubernetes.io/zone"}} {"commit": "b157d45", "pod": "enrichment-services-np-99/enrichment-services-np-99-field-extraction-service-sync-58nsbc6"}
karpenter-7fd86b488d-b7wfs controller 2022-09-06T18:23:08.687Z DEBUG controller.consolidation Relaxing soft constraints for pod since it previously failed to schedule, removing: spec.affinity.podAntiAffinity.preferredDuringSchedulingIgnoredDuringExecution[0]={"weight":1,"podAffinityTerm":{"labelSelector":{"matchExpressions":[{"key":"component","operator":"In","values":["ftd-feeder-service-async"]}]},"topologyKey":"failure-domain.beta.kubernetes.io/zone"}} {"commit": "b157d45", "pod": "enrichment-services-np-00/enrichment-services-np-00-ftd-feeder-service-async-5758b66qmx25"}
karpenter-7fd86b488d-b7wfs controller 2022-09-06T18:23:08.690Z DEBUG controller.consolidation Relaxing soft constraints for pod since it previously failed to schedule, removing: spec.affinity.podAntiAffinity.preferredDuringSchedulingIgnoredDuringExecution[0]={"weight":1,"podAffinityTerm":{"labelSelector":{"matchExpressions":[{"key":"component","operator":"In","values":["caas-document-index-service-sync"]}]},"topologyKey":"failure-domain.beta.kubernetes.io/zone"}} {"commit": "b157d45", "pod": "enrichment-services-np-50/enrichment-services-np-50-caas-document-index-service-sync2vhjn"}
karpenter-7fd86b488d-b7wfs controller 2022-09-06T18:23:08.693Z DEBUG controller.consolidation Relaxing soft constraints for pod since it previously failed to schedule, removing: spec.affinity.podAntiAffinity.preferredDuringSchedulingIgnoredDuringExecution[0]={"weight":1,"podAffinityTerm":{"labelSelector":{"matchExpressions":[{"key":"component","operator":"In","values":["classification-filter-service-sync"]}]},"topologyKey":"failure-domain.beta.kubernetes.io/zone"}} {"commit": "b157d45", "pod": "enrichment-services-np-00/enrichment-services-np-00-classification-filter-service-sy5br4r"}
karpenter-7fd86b488d-b7wfs controller 2022-09-06T18:23:08.696Z DEBUG controller.consolidation Relaxing soft constraints for pod since it previously failed to schedule, removing: spec.affinity.podAntiAffinity.preferredDuringSchedulingIgnoredDuringExecution[0]={"weight":1,"podAffinityTerm":{"labelSelector":{"matchExpressions":[{"key":"component","operator":"In","values":["termite-service-sync"]}]},"topologyKey":"failure-domain.beta.kubernetes.io/zone"}} {"commit": "b157d45", "pod": "enrichment-services-np-00/enrichment-services-np-00-termite-service-sync-684dcb49bc-99c4h"}
karpenter-7fd86b488d-b7wfs controller 2022-09-06T18:23:08.698Z INFO controller.consolidation Consolidating via Delete, terminating 1 nodes ip-10-138-103-62.eu-west-1.compute.internal/c5a.2xlarge {"commit": "b157d45"}
karpenter-7fd86b488d-b7wfs controller 2022-09-06T18:23:08.721Z INFO controller.termination Cordoned node {"commit": "b157d45", "node": "ip-10-138-103-62.eu-west-1.compute.internal"}
karpenter-7fd86b488d-b7wfs controller 2022-09-06T18:23:49.380Z INFO controller.termination Deleted node {"commit": "b157d45", "node": "ip-10-138-103-62.eu-west-1.compute.internal"}
karpenter-7fd86b488d-b7wfs controller 2022-09-06T18:28:17.537Z DEBUG controller.consolidation Relaxing soft constraints for pod since it previously failed to schedule, removing: spec.affinity.podAntiAffinity.preferredDuringSchedulingIgnoredDuringExecution[0]={"weight":1,"podAffinityTerm":{"labelSelector":{"matchExpressions":[{"key":"component","operator":"In","values":["classification-filter-service-async"]}]},"topologyKey":"failure-domain.beta.kubernetes.io/zone"}} {"commit": "b157d45", "pod": "enrichment-services-np-00/enrichment-services-np-00-classification-filter-service-ast2tg5"}
karpenter-7fd86b488d-b7wfs controller 2022-09-06T18:28:17.541Z DEBUG controller.consolidation Relaxing soft constraints for pod since it previously failed to schedule, removing: spec.affinity.podAntiAffinity.preferredDuringSchedulingIgnoredDuringExecution[0]={"weight":1,"podAffinityTerm":{"labelSelector":{"matchExpressions":[{"key":"component","operator":"In","values":["entity-filter-service-sync"]}]},"topologyKey":"failure-domain.beta.kubernetes.io/zone"}} {"commit": "b157d45", "pod": "enrichment-services-np-01/enrichment-services-np-01-entity-filter-service-sync-5fb8f9b4zr"}
karpenter-7fd86b488d-b7wfs controller 2022-09-06T18:28:17.544Z DEBUG controller.consolidation Relaxing soft constraints for pod since it previously failed to schedule, removing: spec.affinity.podAntiAffinity.preferredDuringSchedulingIgnoredDuringExecution[0]={"weight":1,"podAffinityTerm":{"labelSelector":{"matchExpressions":[{"key":"component","operator":"In","values":["datalake-write-service-sync"]}]},"topologyKey":"failure-domain.beta.kubernetes.io/zone"}} {"commit": "b157d45", "pod": "enrichment-services-np-00/enrichment-services-np-00-datalake-write-service-sync-7458m4rt9"}
karpenter-7fd86b488d-b7wfs controller 2022-09-06T18:28:17.546Z DEBUG controller.consolidation Relaxing soft constraints for pod since it previously failed to schedule, removing: spec.affinity.podAntiAffinity.preferredDuringSchedulingIgnoredDuringExecution[0]={"weight":1,"podAffinityTerm":{"labelSelector":{"matchExpressions":[{"key":"component","operator":"In","values":["datalake-write-service-async"]}]},"topologyKey":"failure-domain.beta.kubernetes.io/zone"}} {"commit": "b157d45", "pod": "enrichment-services-np-00/enrichment-services-np-00-datalake-write-service-async-847cxdkj"}
karpenter-7fd86b488d-b7wfs controller 2022-09-06T18:28:17.549Z DEBUG controller.consolidation Relaxing soft constraints for pod since it previously failed to schedule, removing: spec.affinity.podAntiAffinity.preferredDuringSchedulingIgnoredDuringExecution[0]={"weight":1,"podAffinityTerm":{"labelSelector":{"matchExpressions":[{"key":"component","operator":"In","values":["datalake-read-service-sync"]}]},"topologyKey":"failure-domain.beta.kubernetes.io/zone"}} {"commit": "b157d45", "pod": "enrichment-services-np-00/enrichment-services-np-00-datalake-read-service-sync-cf9f6rph9c"}
karpenter-7fd86b488d-b7wfs controller 2022-09-06T18:28:17.552Z DEBUG controller.consolidation Relaxing soft constraints for pod since it previously failed to schedule, removing: spec.affinity.podAntiAffinity.preferredDuringSchedulingIgnoredDuringExecution[0]={"weight":1,"podAffinityTerm":{"labelSelector":{"matchExpressions":[{"key":"component","operator":"In","values":["datalake-read-service-async"]}]},"topologyKey":"failure-domain.beta.kubernetes.io/zone"}} {"commit": "b157d45", "pod": "enrichment-services-np-00/enrichment-services-np-00-datalake-read-service-async-66f5k8bhr"}
karpenter-7fd86b488d-b7wfs controller 2022-09-06T18:28:17.553Z INFO controller.consolidation Consolidating via Delete, terminating 1 nodes ip-10-138-104-127.eu-west-1.compute.internal/c5a.2xlarge {"commit": "b157d45"}
karpenter-7fd86b488d-b7wfs controller 2022-09-06T18:28:17.587Z INFO controller.termination Cordoned node {"commit": "b157d45", "node": "ip-10-138-104-127.eu-west-1.compute.internal"}
karpenter-7fd86b488d-b7wfs controller 2022-09-06T18:28:19.715Z DEBUG controller.provisioning 14 out of 509 instance types were excluded because they would breach provisioner limits {"commit": "b157d45"}
karpenter-7fd86b488d-b7wfs controller 2022-09-06T18:28:19.716Z DEBUG controller.provisioning 14 out of 509 instance types were excluded because they would breach provisioner limits {"commit": "b157d45"}
karpenter-7fd86b488d-b7wfs controller 2022-09-06T18:28:19.716Z DEBUG controller.provisioning 200 out of 542 instance types were excluded because they would breach provisioner limits {"commit": "b157d45"}
karpenter-7fd86b488d-b7wfs controller 2022-09-06T18:28:19.716Z DEBUG controller.provisioning 188 out of 509 instance types were excluded because they would breach provisioner limits {"commit": "b157d45"}
karpenter-7fd86b488d-b7wfs controller 2022-09-06T18:28:19.717Z DEBUG controller.provisioning 29 out of 509 instance types were excluded because they would breach provisioner limits {"commit": "b157d45"}
karpenter-7fd86b488d-b7wfs controller 2022-09-06T18:28:19.717Z DEBUG controller.provisioning 14 out of 509 instance types were excluded because they would breach provisioner limits {"commit": "b157d45"}
karpenter-7fd86b488d-b7wfs controller 2022-09-06T18:28:19.718Z DEBUG controller.provisioning 7 out of 542 instance types were excluded because they would breach provisioner limits {"commit": "b157d45"}
karpenter-7fd86b488d-b7wfs controller 2022-09-06T18:28:19.718Z DEBUG controller.provisioning 55 out of 509 instance types were excluded because they would breach provisioner limits {"commit": "b157d45"}
karpenter-7fd86b488d-b7wfs controller 2022-09-06T18:28:19.719Z DEBUG controller.provisioning Relaxing soft constraints for pod since it previously failed to schedule, removing: spec.affinity.podAntiAffinity.preferredDuringSchedulingIgnoredDuringExecution[0]={"weight":1,"podAffinityTerm":{"labelSelector":{"matchExpressions":[{"key":"component","operator":"In","values":["entity-filter-service-sync"]}]},"topologyKey":"failure-domain.beta.kubernetes.io/zone"}} {"commit": "b157d45", "pod": "enrichment-services-np-01/enrichment-services-np-01-entity-filter-service-sync-5fb8f9t6vd"}
karpenter-7fd86b488d-b7wfs controller 2022-09-06T18:28:19.720Z DEBUG controller.provisioning 14 out of 509 instance types were excluded because they would breach provisioner limits {"commit": "b157d45"}
karpenter-7fd86b488d-b7wfs controller 2022-09-06T18:28:19.720Z DEBUG controller.provisioning 14 out of 509 instance types were excluded because they would breach provisioner limits {"commit": "b157d45"}
karpenter-7fd86b488d-b7wfs controller 2022-09-06T18:28:19.721Z DEBUG controller.provisioning 200 out of 542 instance types were excluded because they would breach provisioner limits {"commit": "b157d45"}
karpenter-7fd86b488d-b7wfs controller 2022-09-06T18:28:19.721Z DEBUG controller.provisioning 188 out of 509 instance types were excluded because they would breach provisioner limits {"commit": "b157d45"}
karpenter-7fd86b488d-b7wfs controller 2022-09-06T18:28:19.721Z DEBUG controller.provisioning 29 out of 509 instance types were excluded because they would breach provisioner limits {"commit": "b157d45"}
karpenter-7fd86b488d-b7wfs controller 2022-09-06T18:28:19.722Z DEBUG controller.provisioning 14 out of 509 instance types were excluded because they would breach provisioner limits {"commit": "b157d45"}
karpenter-7fd86b488d-b7wfs controller 2022-09-06T18:28:19.722Z DEBUG controller.provisioning 7 out of 542 instance types were excluded because they would breach provisioner limits {"commit": "b157d45"}
karpenter-7fd86b488d-b7wfs controller 2022-09-06T18:28:19.728Z INFO controller.provisioning Found 1 provisionable pod(s) {"commit": "b157d45"}
karpenter-7fd86b488d-b7wfs controller 2022-09-06T18:28:19.728Z INFO controller.provisioning Computed 1 new node(s) will fit 1 pod(s) {"commit": "b157d45"}
karpenter-7fd86b488d-b7wfs controller 2022-09-06T18:28:19.731Z INFO controller.provisioning Launching node with 1 pods requesting {"cpu":"991m","memory":"3876970752","pods":"10"} from types inf1.2xlarge, c3.2xlarge, r3.2xlarge, c5a.2xlarge, t3a.2xlarge and 91 other(s) {"commit": "b157d45", "provisioner": "enrichment-service"}
karpenter-7fd86b488d-b7wfs controller 2022-09-06T18:28:19.999Z DEBUG controller.provisioning.cloudprovider Created launch template, Karpenter-nonprod-shared1-1433741913800812631 {"commit": "b157d45", "provisioner": "enrichment-service"}
karpenter-7fd86b488d-b7wfs controller 2022-09-06T18:28:22.014Z INFO controller.provisioning.cloudprovider Launched instance: i-0acdf6e3d9e03c063, hostname: ip-10-138-110-171.eu-west-1.compute.internal, type: c5a.2xlarge, zone: eu-west-1c, capacityType: spot {"commit": "b157d45", "provisioner": "enrichment-service"}
These aren't new nodes, although I suspect it may become more apparent over time. You can see, however, that it has cordoned one node:
karpenter-7fd86b488d-b7wfs controller 2022-09-06T18:28:17.553Z INFO controller.consolidation Consolidating via Delete, terminating 1 nodes ip-10-138-104-127.eu-west-1.compute.internal/c5a.2xlarge {"commit": "b157d45"}
karpenter-7fd86b488d-b7wfs controller 2022-09-06T18:28:17.587Z INFO controller.termination Cordoned node {"commit": "b157d45", "node": "ip-10-138-104-127.eu-west-1.compute.internal"}
Then, 2 seconds later, it has had to spin up another one exactly the same:
karpenter-7fd86b488d-b7wfs controller 2022-09-06T18:28:19.728Z INFO controller.provisioning Found 1 provisionable pod(s) {"commit": "b157d45"}
karpenter-7fd86b488d-b7wfs controller 2022-09-06T18:28:19.728Z INFO controller.provisioning Computed 1 new node(s) will fit 1 pod(s) {"commit": "b157d45"}
karpenter-7fd86b488d-b7wfs controller 2022-09-06T18:28:19.731Z INFO controller.provisioning Launching node with 1 pods requesting {"cpu":"991m","memory":"3876970752","pods":"10"} from types inf1.2xlarge, c3.2xlarge, r3.2xlarge, c5a.2xlarge, t3a.2xlarge and 91 other(s) {"commit": "b157d45", "provisioner": "enrichment-service"}
karpenter-7fd86b488d-b7wfs controller 2022-09-06T18:28:19.999Z DEBUG controller.provisioning.cloudprovider Created launch template, Karpenter-nonprod-shared1-1433741913800812631 {"commit": "b157d45", "provisioner": "enrichment-service"}
karpenter-7fd86b488d-b7wfs controller 2022-09-06T18:28:22.014Z INFO controller.provisioning.cloudprovider Launched instance: i-0acdf6e3d9e03c063, hostname: ip-10-138-110-171.eu-west-1.compute.internal, type: c5a.2xlarge, zone: eu-west-1c, capacityType: spot {"commit": "b157d45", "provisioner": "enrichment-service"}
The other one is then deleted a short time later:
karpenter-7fd86b488d-b7wfs controller 2022-09-06T18:28:59.879Z INFO controller.termination Deleted node {"commit": "b157d45", "node": "ip-10-138-104-127.eu-west-1.compute.internal"}
I just think it is possibly moving a bit too quickly, resulting in churn of nodes/pods, and needs to be more configurable.
Were new pods being created at this time, or were they solely from evictions? If it was just from an eviction, I would expect it to repeat, since it replaced the node with an identical node. If it does, can you run kubectl describe pod pod-name for the pod that was evicted to determine why kube-scheduler thought it wouldn't schedule?
The only scenarios I can see causing this are: 1) a new pod was created, or 2) our scheduling logic has a mistake and thinks the pod should fit on another node when it can't.
I don't see option 2 as being likely, since if it occurred we wouldn't launch the second node. Karpenter would continue to think the pod would schedule on an existing node.
I have the same kind of behavior:
Karpenter: v0.16.0
Kubernetes: v1.21.14
Log extract:
Date,Service,Message
"2022-09-07T16:00:25.485Z","""karpenter""","Deleted node"
"2022-09-07T16:00:25.290Z","""karpenter""","Cordoned node"
"2022-09-07T16:00:25.258Z","""karpenter""","Consolidating via Delete (empty node), terminating 1 nodes ip-10-36-113-215.eu-west-3.compute.internal/m5a.xlarge"
"2022-09-07T15:59:11.346Z","""karpenter""","Launched instance: i-050a61e60159e2a55, hostname: ip-10-36-113-215.eu-west-3.compute.internal, type: m5a.xlarge, zone: eu-west-3a, capacityType: on-demand"
"2022-09-07T15:59:09.206Z","""karpenter""","Launching node with 1 pods requesting {""cpu"":""1390m"",""memory"":""2098Mi"",""pods"":""7""} from types m5a.xlarge, m5.xlarge, m6i.xlarge, m5ad.xlarge, m5d.xlarge and 31 other(s)"
"2022-09-07T15:59:09.205Z","""karpenter""","Computed 1 new node(s) will fit 1 pod(s)"
"2022-09-07T15:59:09.205Z","""karpenter""","Found 1 provisionable pod(s)"
It seems like the " 1 provisionable pod " finally fit in actual node and the new node is empty. On the next consolidation cycle.
There are no eviction at the same time
Is this reproducible? I think the most likely scenario for your log @jBouyoud is that another pod was terminated during the node launch which made space available so the new node was unnecessary. It's not possible to tell that from the log, but Karpenter only considers pods provisionable that kube-scheduler has marked as unschedulable. It must have given up on scheduling the pod before Karpenter would have created the node.
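For what it's worth, here is a rough sketch of that "provisionable" filter (a conceptual illustration, not Karpenter's actual code; the helper name is made up). It only accepts pods that are unbound and that kube-scheduler has already marked Unschedulable:

```go
package main

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
)

// isProvisionable sketches the check described above: the pod must not be bound
// to a node, and kube-scheduler must have marked it Unschedulable.
func isProvisionable(pod *corev1.Pod) bool {
	if pod.Spec.NodeName != "" {
		return false // already bound to a node
	}
	for _, cond := range pod.Status.Conditions {
		if cond.Type == corev1.PodScheduled &&
			cond.Status == corev1.ConditionFalse &&
			cond.Reason == corev1.PodReasonUnschedulable {
			return true
		}
	}
	return false
}

func main() {
	pod := &corev1.Pod{}
	pod.Status.Conditions = []corev1.PodCondition{{
		Type:   corev1.PodScheduled,
		Status: corev1.ConditionFalse,
		Reason: corev1.PodReasonUnschedulable,
	}}
	fmt.Println(isProvisionable(pod)) // true
}
```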
@tzneal Yes, it happens several times a day during our working hours, while people are deploying (meaning multiple deployment rollouts) on the cluster. This behavior is directly linked to deployment activity.
Your scenario seems legit, but I think the pod has been scheduled to another node (where another pod has been terminated).
Would increasing the batchIdleDuration help reduce this noise?
Does Karpenter take into account nodes that are currently bootstrapping before scheduling a new one?
When this behavior happens, pods are unschedulable due to a not-ready node:
30s (x4 over 40s) default-scheduler 0/30 nodes are available: 1 Insufficient memory, 1 node(s) had taint {node.kubernetes.io/not-ready: }, that the pod didn't tolerate, 1 node(s) were unschedulable, 23 Insufficient cpu, 5 node(s) had taint {...}, that the pod didn't tolerate.
@jBouyoud When replacing a node it will. In the log you provided, it was just deleting a node as it thought the pods could fit elsewhere.
How are the deployments being performed? Is the deployment being deleted and re-created, or just an image changed? I'm trying to reproduce this locally.
Never deleted and recreated. Mainly (99.5% of the time) just an image version change.
Another, more complete example for a deployment rollout:
Warning FailedScheduling 82s (x2 over 84s) default-scheduler 0/29 nodes are available: 1 node(s) were unschedulable, 2 Insufficient memory, 22 Insufficient cpu, 5 node(s) had taint {xxxxxxxx}, that the pod didn't tolerate.
Normal Scheduled 22s default-scheduler Successfully assigned xxxx/XXXXXX-b64466dfd-f99h2 to ip-10-36-200-186.eu-west-3.compute.internal
Normal Pulled 21s kubelet Container image "hashicorp/vault:1.11.2@sha256:a60891bfb7b7a669d21544e0ad1b178e09a78174d4995e79fb11faf9a741e2ca" already present on machine
Normal Created 21s kubelet Created container vault-agent-init
Normal Nominate 21s karpenter Pod should schedule on ip-10-36-250-77.eu-west-3.compute.internal
Normal Started 21s kubelet Started container vault-agent-init
And at the same time in the Karpenter controller:
{"level":"INFO","time":"2022-09-08T13:09:27.622Z","logger":"controller.provisioning","message":"Found 1 provisionable pod(s)","commit":"b157d45"}
{"level":"INFO","time":"2022-09-08T13:09:27.622Z","logger":"controller.provisioning","message":"Computed 1 new node(s) will fit 1 pod(s)","commit":"b157d45"}
{"level":"INFO","time":"2022-09-08T13:09:27.624Z","logger":"controller.provisioning","message":"Launching node with 1 pods requesting {\"cpu\":\"1390m\",\"memory\":\"2098Mi\",\"pods\":\"7\"} from types t3a.xlarge, c5a.xlarge, t3.xlarge, c5.xlarge, c6i.xlarge and 100 other(s)","commit":"b157d45","provisioner":"default"}
{"level":"INFO","time":"2022-09-08T13:09:29.643Z","logger":"controller.provisioning.cloudprovider","message":"Launched instance: i-08e80998cf3c3e208, hostname: ip-10-36-250-77.eu-west-3.compute.internal, type: t3a.xlarge, zone: eu-west-3c, capacityType: on-demand","commit":"b157d45","provisioner":"default"}
I hope this helps. Feel free to ask if you need more information.
@jBouyoud What is the update strategy for one of these deployments? I'm still working on reproducing this locally.
Almost all of our workloads use this (the replica count can be anywhere from 1 to 5):
progressDeadlineSeconds: 600
terminationGracePeriodSeconds: 60
replicas: 2
strategy:
  rollingUpdate:
    maxSurge: 25%
    maxUnavailable: 25%
  type: RollingUpdate
Our apps are mainly composed of 3 deployments, so when a version changes we have at least 3 deployment rollouts going on at the same time.
I don't think this is very important for Karpenter (and for this case), but we also have some initContainers (with different resource specs from the main container).
I'm still looking into this, but I can reproduce the extra node that you see get deleted. It's because of the surge in your update strategy: in your case, if you have 5 pods and update the deployment image, it will surge by 25%, or +2 pods; if they don't all fit in your cluster, the new small node will be launched. Eventually the update finishes rolling through and the new node isn't needed.
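For reference, here is the surge arithmetic as a small sketch (Kubernetes rounds a percentage maxSurge up, which is why 5 replicas at 25% surge by 2 pods, while the replicas: 2 example above surges by 1):

```go
package main

import (
	"fmt"
	"math"
)

// surgePods computes how many extra pods a percentage maxSurge allows;
// Kubernetes rounds percentage surge values up.
func surgePods(replicas int, maxSurgePercent int) int {
	return int(math.Ceil(float64(replicas) * float64(maxSurgePercent) / 100.0))
}

func main() {
	fmt.Println(surgePods(2, 25)) // 1 extra pod during the rollout
	fmt.Println(surgePods(5, 25)) // 2 extra pods during the rollout
}
```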
👌 Seems legit. Thanks
A way of tuning the aggressiveness of the consolidation process would probably help adjust for this case (but that's not the topic here).
Thanks for the explanation.
I'm going to close this then, but feel free to re-open or create a new issue if you see something that isn't explained by surge.
I think in my case it might be because it does this:
We end up with extra nodes because the topology is relaxed. Perhaps the topology should be relaxed before adding a new node?
@andrewhibbert Sorry, I'm not following. I added some information inline regarding what Karpenter does when scheduling below:
This is the bit I mean - https://github.com/aws/karpenter/blob/main/pkg/controllers/provisioning/scheduling/scheduler.go#L123, which removes a pod anti-affinity. Anyway, it is likely done correctly.
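For clarity, the "Relaxing soft constraints" log lines above amount to dropping a pod's preferred (soft) anti-affinity terms so scheduling can still be simulated. Here is a minimal sketch of that idea (an illustration, not the linked scheduler.go code itself):

```go
package main

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
)

// relaxSoftAntiAffinity removes preferred (soft) pod anti-affinity terms from a
// pod spec, which is what the DEBUG log lines above describe; required (hard)
// anti-affinity is left untouched.
func relaxSoftAntiAffinity(pod *corev1.Pod) bool {
	aff := pod.Spec.Affinity
	if aff == nil || aff.PodAntiAffinity == nil ||
		len(aff.PodAntiAffinity.PreferredDuringSchedulingIgnoredDuringExecution) == 0 {
		return false
	}
	aff.PodAntiAffinity.PreferredDuringSchedulingIgnoredDuringExecution = nil
	return true
}

func main() {
	pod := &corev1.Pod{
		Spec: corev1.PodSpec{
			Affinity: &corev1.Affinity{
				PodAntiAffinity: &corev1.PodAntiAffinity{
					PreferredDuringSchedulingIgnoredDuringExecution: []corev1.WeightedPodAffinityTerm{
						{Weight: 1, PodAffinityTerm: corev1.PodAffinityTerm{
							TopologyKey: "failure-domain.beta.kubernetes.io/zone",
						}},
					},
				},
			},
		},
	}
	fmt.Println(relaxSoftAntiAffinity(pod)) // true: the soft anti-affinity term was dropped
}
```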
Version
Karpenter: v0.0.0
Kubernetes: v1.0.0
Expected Behavior
Karpenter should allow a configurable amount of time for a node to scale up and have running pods before scaling it down.
Actual Behavior
Karpenter calculates a disruption score and orders nodes by the time they came up, looking at older ones first. But this will still pick newer empty nodes first.
Steps to Reproduce the Problem
Resource Specs and Logs