kubernetes / autoscaler

Autoscaling components for Kubernetes
Apache License 2.0

Priority Expander chooses additional node group in second cycle #5014

Closed liorfranko closed 1 year ago

liorfranko commented 2 years ago

Which component are you using?: cluster-autoscaler with the Helm Chart

What version of the component are you using?:

Component version: v1.20.3

What k8s version are you using (kubectl version)?:

kubectl version Output
$ kubectl version
Client Version: version.Info{Major:"1", Minor:"20+", GitVersion:"v1.20.4-eks-6b7464", GitCommit:"6b746440c04cb81db4426842b4ae65c3f7035e53", GitTreeState:"clean", BuildDate:"2021-03-19T19:35:50Z", GoVersion:"go1.15.8", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"19+", GitVersion:"v1.19.16-eks-de875a99", GitCommit:"de875a995f7da9bbc6c47660b3cccf622e8ae60f", GitTreeState:"clean", BuildDate:"2022-05-11T23:16:06Z", GoVersion:"go1.15.15", Compiler:"gc", Platform:"linux/amd64"}

What environment is this in?: AWS EKS using Managed Node Groups

What did you expect to happen?:

With this configuration for priority expander:

apiVersion: v1
kind: ConfigMap
metadata:
  name: cluster-autoscaler-priority-expander
  namespace: kube-system
data:
  priorities: |-
    20:
      - .*Ec2Spot.*
    10:
      - .*OnDemand.*
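
For context, the selection logic this ConfigMap drives can be sketched roughly as follows. This is a simplified illustration, not the actual Go implementation; the function name `pick_group` is hypothetical, and the group names are taken from the logs below:

```python
import re

# Priorities as parsed from the ConfigMap: numeric priority -> list of regexes.
priorities = {
    20: [r".*Ec2Spot.*"],
    10: [r".*OnDemand.*"],
}

def pick_group(candidate_groups, priorities):
    """Return the candidate groups matching the highest priority tier
    that matches anything; groups matching no tier are never used."""
    for prio in sorted(priorities, reverse=True):
        patterns = priorities[prio]
        matches = [g for g in candidate_groups
                   if any(re.match(p, g) for p in patterns)]
        if matches:
            return matches
    return []

groups = [
    "general-dev-devops-apps-4vcpu-16gb-Ec2Spot-MultiAZ20220707191748241500000001",
    "general-dev-devops-apps-4vcpu-16gb-OnDemand-MultiAZ20220710050104303500000003",
]
# With both groups as candidates, the Ec2Spot group (priority 20) wins.
print(pick_group(groups, priorities))
```

Given this, the OnDemand group should only ever be chosen when the Ec2Spot group is not among the candidates the core scale-up logic offers to the expander.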

And these flags:

    Command:
      ./cluster-autoscaler
      --cloud-provider=aws
      --namespace=kube-system
      --node-group-auto-discovery=asg:tag=k8s.io/cluster-autoscaler/enabled,k8s.io/cluster-autoscaler/general-dev
      --balance-similar-node-groups=false
      --expander=priority
      --leader-elect=true
      --logtostderr=true
      --max-empty-bulk-delete=20
      --max-node-provision-time=15m0s
      --scale-down-delay-after-add=5m
      --scale-down-unneeded-time=5m
      --skip-nodes-with-local-storage=false
      --skip-nodes-with-system-pods=false
      --stderrthreshold=info
      --v=4

What happened instead?:

The first cycle works as expected:

I0710 09:05:17.305300       1 static_autoscaler.go:229] Starting main loop
I0710 09:05:17.310216       1 filter_out_schedulable.go:65] Filtering out schedulables
I0710 09:05:17.310236       1 filter_out_schedulable.go:132] Filtered out 0 pods using hints
I0710 09:05:17.311532       1 filter_out_schedulable.go:170] 9 pods were kept as unschedulable based on caching
I0710 09:05:17.311546       1 filter_out_schedulable.go:171] 0 pods marked as unschedulable can be scheduled.
I0710 09:05:17.311560       1 filter_out_schedulable.go:82] No schedulable pods
I0710 09:05:17.311576       1 klogx.go:86] Pod devops-apps-01/sleep-87f4c7f8c-jszzc is unschedulable
I0710 09:05:17.311580       1 klogx.go:86] Pod devops-apps-01/sleep-87f4c7f8c-jn94n is unschedulable
I0710 09:05:17.311584       1 klogx.go:86] Pod devops-apps-01/sleep-87f4c7f8c-m9b8w is unschedulable
I0710 09:05:17.311588       1 klogx.go:86] Pod devops-apps-01/sleep-87f4c7f8c-ffxnd is unschedulable
I0710 09:05:17.311591       1 klogx.go:86] Pod devops-apps-01/sleep-87f4c7f8c-kl7db is unschedulable
I0710 09:05:17.311595       1 klogx.go:86] Pod devops-apps-01/sleep-87f4c7f8c-sklrh is unschedulable
I0710 09:05:17.311600       1 klogx.go:86] Pod devops-apps-01/sleep-87f4c7f8c-2j54k is unschedulable
I0710 09:05:17.311605       1 klogx.go:86] Pod devops-apps-01/sleep-87f4c7f8c-szwn7 is unschedulable
I0710 09:05:17.311609       1 klogx.go:86] Pod devops-apps-01/sleep-87f4c7f8c-9fqpm is unschedulable
I0710 09:05:17.311614       1 klogx.go:86] Pod devops-apps-01/sleep-87f4c7f8c-vvrgj is unschedulable
I0710 09:05:17.311724       1 scale_up.go:364] Upcoming 0 nodes
...
I0710 09:05:17.316476       1 priority.go:118] Successfully loaded priority configuration from configmap.
W0710 09:05:17.316508       1 priority.go:91] Priority expander: node group general-dev-devops-apps-4vcpu-16gb-OnDemand-MultiAZ20220710050104303500000003 not found in priority expander configuration. The group won't be used.
I0710 09:05:17.316512       1 priority.go:166] priority expander: general-dev-devops-apps-4vcpu-16gb-Ec2Spot-MultiAZ20220707191748241500000001 chosen as the highest available
I0710 09:05:17.316523       1 scale_up.go:456] Best option to resize: general-dev-devops-apps-4vcpu-16gb-Ec2Spot-MultiAZ20220707191748241500000001
I0710 09:05:17.316533       1 scale_up.go:460] Estimated 10 nodes needed in general-dev-devops-apps-4vcpu-16gb-Ec2Spot-MultiAZ20220707191748241500000001
I0710 09:05:17.316710       1 scale_up.go:574] Final scale-up plan: [{general-dev-devops-apps-4vcpu-16gb-Ec2Spot-MultiAZ20220707191748241500000001 3->13 (max: 20)}]
I0710 09:05:17.316727       1 scale_up.go:663] Scale-up: setting group general-dev-devops-apps-4vcpu-16gb-Ec2Spot-MultiAZ20220707191748241500000001 size to 13
I0710 09:05:17.316737       1 auto_scaling_groups.go:219] Setting asg general-dev-devops-apps-4vcpu-16gb-Ec2Spot-MultiAZ20220707191748241500000001 size to 13
I0710 09:05:17.317061       1 event_sink_logging_wrapper.go:48] Event(v1.ObjectReference{Kind:"ConfigMap", Namespace:"kube-system", Name:"cluster-autoscaler-priority-expander", UID:"4287ad42-fd31-4163-9924-632d4c86098b", APIVersion:"v1", ResourceVersion:"495893311", FieldPath:""}): type: 'Warning' reason: 'PriorityConfigMapNotMatchedGroup' Priority expander: node group general-dev-devops-apps-4vcpu-16gb-OnDemand-MultiAZ20220710050104303500000003 not found in priority expander configuration. The group won't be used.

But on the next cycle, 10 seconds later, the expander is triggered again and now the OnDemand instances are chosen:

I0710 09:05:27.579680       1 static_autoscaler.go:229] Starting main loop
I0710 09:05:27.585476       1 filter_out_schedulable.go:65] Filtering out schedulables
I0710 09:05:27.585497       1 filter_out_schedulable.go:132] Filtered out 0 pods using hints
I0710 09:05:27.585679       1 filter_out_schedulable.go:157] Pod devops-apps-01.sleep-87f4c7f8c-kl7db marked as unschedulable can be scheduled on node template-node-for-general-dev-devops-apps-4vcpu-16gb-Ec2Spot-MultiAZ20220707191748241500000001-2039826811414219716-1. Ignoring in scale up.
I0710 09:05:27.587179       1 filter_out_schedulable.go:170] 8 pods were kept as unschedulable based on caching
I0710 09:05:27.587202       1 filter_out_schedulable.go:171] 1 pods marked as unschedulable can be scheduled.
I0710 09:05:27.587214       1 filter_out_schedulable.go:79] Schedulable pods present
I0710 09:05:27.587231       1 klogx.go:86] Pod devops-apps-01/sleep-87f4c7f8c-jszzc is unschedulable
I0710 09:05:27.587239       1 klogx.go:86] Pod devops-apps-01/sleep-87f4c7f8c-jn94n is unschedulable
I0710 09:05:27.587244       1 klogx.go:86] Pod devops-apps-01/sleep-87f4c7f8c-m9b8w is unschedulable
I0710 09:05:27.587250       1 klogx.go:86] Pod devops-apps-01/sleep-87f4c7f8c-ffxnd is unschedulable
I0710 09:05:27.587259       1 klogx.go:86] Pod devops-apps-01/sleep-87f4c7f8c-vvrgj is unschedulable
I0710 09:05:27.587263       1 klogx.go:86] Pod devops-apps-01/sleep-87f4c7f8c-sklrh is unschedulable
I0710 09:05:27.587267       1 klogx.go:86] Pod devops-apps-01/sleep-87f4c7f8c-2j54k is unschedulable
I0710 09:05:27.587272       1 klogx.go:86] Pod devops-apps-01/sleep-87f4c7f8c-szwn7 is unschedulable
I0710 09:05:27.587277       1 klogx.go:86] Pod devops-apps-01/sleep-87f4c7f8c-9fqpm is unschedulable
I0710 09:05:27.587401       1 scale_up.go:364] Upcoming 10 nodes
...
I0710 09:05:27.592210       1 priority.go:118] Successfully loaded priority configuration from configmap.
I0710 09:05:27.592235       1 priority.go:166] priority expander: general-dev-devops-apps-4vcpu-16gb-OnDemand-MultiAZ20220710050104303500000003 chosen as the highest available
I0710 09:05:27.592246       1 scale_up.go:456] Best option to resize: general-dev-devops-apps-4vcpu-16gb-OnDemand-MultiAZ20220710050104303500000003
I0710 09:05:27.592252       1 scale_up.go:460] Estimated 9 nodes needed in general-dev-devops-apps-4vcpu-16gb-OnDemand-MultiAZ20220710050104303500000003
I0710 09:05:27.592412       1 scale_up.go:574] Final scale-up plan: [{general-dev-devops-apps-4vcpu-16gb-OnDemand-MultiAZ20220710050104303500000003 1->10 (max: 120)}]
I0710 09:05:27.592429       1 scale_up.go:663] Scale-up: setting group general-dev-devops-apps-4vcpu-16gb-OnDemand-MultiAZ20220710050104303500000003 size to 10
I0710 09:05:27.592453       1 auto_scaling_groups.go:219] Setting asg general-dev-devops-apps-4vcpu-16gb-OnDemand-MultiAZ20220710050104303500000003 size to 10
I0710 09:05:27.592490       1 event_sink_logging_wrapper.go:48] Event(v1.ObjectReference{Kind:"ConfigMap", Namespace:"kube-system", Name:"cluster-autoscaler-status", UID:"bfbf5ac2-055d-499d-a02c-c60edee1a058", APIVersion:"v1", ResourceVersion:"495947584", FieldPath:""}): type: 'Normal' reason: 'ScaledUpGroup' Scale-up: setting group general-dev-devops-apps-4vcpu-16gb-OnDemand-MultiAZ20220710050104303500000003 size to 10
I0710 09:05:27.891152       1 eventing_scale_up_processor.go:47] Skipping event processing for unschedulable pods since there is a ScaleUp attempt this loop

I tried switching the order of the priorities like this:

data:
  priorities: |-
    10:
      - .*OnDemand.*
    20:
      - .*Ec2Spot.*

And got the same results.
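
That outcome is expected: the priorities are the numeric keys of a map, so reordering them in the YAML should not change which group is selected. A minimal sketch of why key order is irrelevant once the keys are sorted (illustrative only):

```python
# The two ConfigMap variants parse to dicts that differ only in
# insertion order; sorting the numeric keys makes selection
# order-independent, so Ec2Spot (20) is tried first in both cases.
original = {20: [".*Ec2Spot.*"], 10: [".*OnDemand.*"]}
reordered = {10: [".*OnDemand.*"], 20: [".*Ec2Spot.*"]}

assert sorted(original, reverse=True) == [20, 10]
assert sorted(reordered, reverse=True) == [20, 10]
```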

I also tried both disabling and enabling balance-similar-node-groups and got the same results.

How to reproduce it (as minimally and precisely as possible):

Anything else we need to know?:

liorfranko commented 2 years ago

Update: it's not related to the priority expander. Even when I remove the --expander=priority flag and run the same test, I see the same behaviour:

I0710 13:51:38.598228       1 static_autoscaler.go:229] Starting main loop
I0710 13:51:38.602935       1 filter_out_schedulable.go:65] Filtering out schedulables
I0710 13:51:38.602956       1 filter_out_schedulable.go:132] Filtered out 0 pods using hints
I0710 13:51:38.604265       1 filter_out_schedulable.go:170] 9 pods were kept as unschedulable based on caching
I0710 13:51:38.604279       1 filter_out_schedulable.go:171] 0 pods marked as unschedulable can be scheduled.
I0710 13:51:38.604291       1 filter_out_schedulable.go:82] No schedulable pods
I0710 13:51:38.604308       1 klogx.go:86] Pod devops-apps-01/sleep-87f4c7f8c-nzwll is unschedulable
I0710 13:51:38.604312       1 klogx.go:86] Pod devops-apps-01/sleep-87f4c7f8c-l2sx8 is unschedulable
I0710 13:51:38.604315       1 klogx.go:86] Pod devops-apps-01/sleep-87f4c7f8c-w7zk2 is unschedulable
I0710 13:51:38.604319       1 klogx.go:86] Pod devops-apps-01/sleep-87f4c7f8c-mfcdg is unschedulable
I0710 13:51:38.604322       1 klogx.go:86] Pod devops-apps-01/sleep-87f4c7f8c-n8976 is unschedulable
I0710 13:51:38.604325       1 klogx.go:86] Pod devops-apps-01/sleep-87f4c7f8c-jjcrw is unschedulable
I0710 13:51:38.604328       1 klogx.go:86] Pod devops-apps-01/sleep-87f4c7f8c-p7zqj is unschedulable
I0710 13:51:38.604331       1 klogx.go:86] Pod devops-apps-01/sleep-87f4c7f8c-k69d8 is unschedulable
I0710 13:51:38.604334       1 klogx.go:86] Pod devops-apps-01/sleep-87f4c7f8c-qm9nv is unschedulable
I0710 13:51:38.604337       1 klogx.go:86] Pod devops-apps-01/sleep-87f4c7f8c-24s9n is unschedulable
I0710 13:51:38.604439       1 scale_up.go:364] Upcoming 0 nodes
...
I0710 13:51:38.609496       1 scale_up.go:456] Best option to resize: general-dev-devops-apps-4vcpu-16gb-Ec2Spot-MultiAZ20220707191748241500000001
I0710 13:51:38.609501       1 scale_up.go:460] Estimated 10 nodes needed in general-dev-devops-apps-4vcpu-16gb-Ec2Spot-MultiAZ20220707191748241500000001
I0710 13:51:38.609705       1 scale_up.go:574] Final scale-up plan: [{general-dev-devops-apps-4vcpu-16gb-Ec2Spot-MultiAZ20220707191748241500000001 3->13 (max: 20)}]
I0710 13:51:38.609730       1 scale_up.go:663] Scale-up: setting group general-dev-devops-apps-4vcpu-16gb-Ec2Spot-MultiAZ20220707191748241500000001 size to 13
I0710 13:51:38.609752       1 auto_scaling_groups.go:219] Setting asg general-dev-devops-apps-4vcpu-16gb-Ec2Spot-MultiAZ20220707191748241500000001 size to 13
...
I0710 13:51:48.763615       1 static_autoscaler.go:229] Starting main loop
I0710 13:51:48.770761       1 filter_out_schedulable.go:65] Filtering out schedulables
I0710 13:51:48.770807       1 filter_out_schedulable.go:132] Filtered out 0 pods using hints
I0710 13:51:48.771022       1 filter_out_schedulable.go:157] Pod devops-apps-01.sleep-87f4c7f8c-n8976 marked as unschedulable can be scheduled on node template-node-for-general-dev-devops-apps-4vcpu-16gb-Ec2Spot-MultiAZ20220707191748241500000001-7247793932385268851-1. Ignoring in scale up.
I0710 13:51:48.772494       1 filter_out_schedulable.go:170] 8 pods were kept as unschedulable based on caching
I0710 13:51:48.772514       1 filter_out_schedulable.go:171] 1 pods marked as unschedulable can be scheduled.
I0710 13:51:48.772525       1 filter_out_schedulable.go:79] Schedulable pods present
I0710 13:51:48.772541       1 klogx.go:86] Pod devops-apps-01/sleep-87f4c7f8c-jjcrw is unschedulable
I0710 13:51:48.772548       1 klogx.go:86] Pod devops-apps-01/sleep-87f4c7f8c-nzwll is unschedulable
I0710 13:51:48.772551       1 klogx.go:86] Pod devops-apps-01/sleep-87f4c7f8c-l2sx8 is unschedulable
I0710 13:51:48.772554       1 klogx.go:86] Pod devops-apps-01/sleep-87f4c7f8c-w7zk2 is unschedulable
I0710 13:51:48.772557       1 klogx.go:86] Pod devops-apps-01/sleep-87f4c7f8c-mfcdg is unschedulable
I0710 13:51:48.772561       1 klogx.go:86] Pod devops-apps-01/sleep-87f4c7f8c-24s9n is unschedulable
I0710 13:51:48.772564       1 klogx.go:86] Pod devops-apps-01/sleep-87f4c7f8c-p7zqj is unschedulable
I0710 13:51:48.772568       1 klogx.go:86] Pod devops-apps-01/sleep-87f4c7f8c-k69d8 is unschedulable
I0710 13:51:48.772571       1 klogx.go:86] Pod devops-apps-01/sleep-87f4c7f8c-qm9nv is unschedulable
I0710 13:51:48.772690       1 scale_up.go:364] Upcoming 10 nodes
...
I0710 13:51:48.776395       1 scale_up.go:456] Best option to resize: general-dev-devops-apps-4vcpu-16gb-OnDemand-MultiAZ20220710050104303500000003
I0710 13:51:48.776407       1 scale_up.go:460] Estimated 9 nodes needed in general-dev-devops-apps-4vcpu-16gb-OnDemand-MultiAZ20220710050104303500000003
I0710 13:51:48.776575       1 scale_up.go:574] Final scale-up plan: [{general-dev-devops-apps-4vcpu-16gb-OnDemand-MultiAZ20220710050104303500000003 1->10 (max: 120)}]
I0710 13:51:48.776591       1 scale_up.go:663] Scale-up: setting group general-dev-devops-apps-4vcpu-16gb-OnDemand-MultiAZ20220710050104303500000003 size to 10
I0710 13:51:48.776607       1 auto_scaling_groups.go:219] Setting asg general-dev-devops-apps-4vcpu-16gb-OnDemand-MultiAZ20220710050104303500000003 size to 10
liorfranko commented 2 years ago

It looks like it's related to https://github.com/kubernetes/autoscaler/issues/4082. Is it possible to add this commit to the next 1.19 version? https://github.com/kubernetes/autoscaler/pull/3883

liorfranko commented 2 years ago

I've created a PR to add #3883 to 1.19.x: https://github.com/kubernetes/autoscaler/pull/5015. I've tested it and it works.

k8s-triage-robot commented 2 years ago

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

- After 90d of inactivity, `lifecycle/stale` is applied
- After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied
- After 30d of inactivity since `lifecycle/rotten` was applied, the issue is closed

You can:

- Mark this issue or PR as fresh with `/remove-lifecycle stale`
- Mark this issue or PR as rotten with `/lifecycle rotten`
- Close this issue or PR with `/close`
- Offer to help out with [Issue Triage](https://www.kubernetes.dev/docs/guide/issue-triage/)

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot commented 1 year ago

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

- After 90d of inactivity, `lifecycle/stale` is applied
- After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied
- After 30d of inactivity since `lifecycle/rotten` was applied, the issue is closed

You can:

- Mark this issue or PR as fresh with `/remove-lifecycle rotten`
- Close this issue or PR with `/close`
- Offer to help out with [Issue Triage](https://www.kubernetes.dev/docs/guide/issue-triage/)

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

k8s-triage-robot commented 1 year ago

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues according to the following rules:

- After 90d of inactivity, `lifecycle/stale` is applied
- After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied
- After 30d of inactivity since `lifecycle/rotten` was applied, the issue is closed

You can:

- Reopen this issue with `/reopen`
- Mark this issue as fresh with `/remove-lifecycle rotten`
- Offer to help out with [Issue Triage](https://www.kubernetes.dev/docs/guide/issue-triage/)

Please send feedback to sig-contributor-experience at kubernetes/community.

/close not-planned

k8s-ci-robot commented 1 year ago

@k8s-triage-robot: Closing this issue, marking it as "Not Planned".

In response to [this](https://github.com/kubernetes/autoscaler/issues/5014#issuecomment-1353011409):

> The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.
>
> This bot triages issues according to the following rules:
>
> - After 90d of inactivity, `lifecycle/stale` is applied
> - After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied
> - After 30d of inactivity since `lifecycle/rotten` was applied, the issue is closed
>
> You can:
>
> - Reopen this issue with `/reopen`
> - Mark this issue as fresh with `/remove-lifecycle rotten`
> - Offer to help out with [Issue Triage][1]
>
> Please send feedback to sig-contributor-experience at [kubernetes/community](https://github.com/kubernetes/community).
>
> /close not-planned
>
> [1]: https://www.kubernetes.dev/docs/guide/issue-triage/

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository.