gardener / autoscaler

Customised fork of cluster-autoscaler to support machine-controller-manager
Apache License 2.0

Allow deletion of a node (True deletion API) #227

Closed mpw96 closed 10 months ago

mpw96 commented 1 year ago

Which component are you using? cluster-autoscaler

Is your feature request designed to solve a problem? If so, describe the problem this feature should solve: During a maintenance window, I need to move workload away from a worker group quickly, with minimal service disruption for the customer, while also keeping the resource overhead under control. I want to taint all nodes of the worker group and drain a few of them. Once those are drained, I would like to drain the next chunk, and so on.
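
A minimal sketch of the chunked procedure described above, written in Python. It only builds the kubectl invocations and does not run them. The taint maintenance=true:NoSchedule, the node names, and the function names are hypothetical; the drain flags are the commonly used ones and may need adjusting for a given cluster.

```python
from typing import Iterator, List, Sequence

Command = List[str]


def chunked(nodes: Sequence[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive chunks of at most `size` node names."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(nodes), size):
        yield list(nodes[start:start + size])


def plan_batches(
    nodes: Sequence[str],
    chunk_size: int,
    taint: str = "maintenance=true:NoSchedule",  # hypothetical taint
) -> List[List[Command]]:
    """Return batches of kubectl commands.

    Batch 0 taints every node so that no new pods are scheduled there.
    Each later batch drains one chunk; the caller is expected to wait
    for a batch to finish before starting the next one.
    """
    batches: List[List[Command]] = [
        [["kubectl", "taint", "nodes", node, taint] for node in nodes]
    ]
    for chunk in chunked(nodes, chunk_size):
        batches.append(
            [
                ["kubectl", "drain", node,
                 "--ignore-daemonsets", "--delete-emptydir-data"]
                for node in chunk
            ]
        )
    return batches


if __name__ == "__main__":
    # Print the plan for five hypothetical nodes, drained two at a time.
    for index, batch in enumerate(plan_batches(["n1", "n2", "n3", "n4", "n5"], 2)):
        print(f"batch {index}:")
        for command in batch:
            print("  " + " ".join(command))
```

Each batch could be passed to subprocess.run one command at a time. The sketch stops at draining, because deleting the drained node without triggering a replacement is exactly what this issue asks for.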

Describe the solution you'd like: After draining a node, I would like to be able to delete it, or to have "someone" delete it. The most natural solution for me would be to allow kubectl delete node some-node without a replacement node being created some time later. At the moment this does not work (see the FAQs of cluster autoscaler and machine-controller-manager).

Describe any alternative solutions you've considered: Another solution would be to set an annotation on the node resource that tells Gardener that I want this node deleted.

Additional context: I have already discussed this with @dguendisch, @rishabh-11, and others.

rishabh-11 commented 10 months ago

/close Not required at the moment by the stakeholders

vlerenc commented 5 months ago

The reason is that you can work around the initial problem with settings such as the following:

    clusterAutoscaler:
      scaleDownDelayAfterAdd: 0s
      scaleDownDelayAfterDelete: 0s
      scaleDownDelayAfterFailure: 3m0s
      scaleDownUnneededTime: 30s
      scaleDownUtilizationThreshold: 0.5
      scanInterval: 10s
      expander: priority
      maxNodeProvisionTime: 6m0s
      maxGracefulTerminationSeconds: 600
      newPodScaleUpDelay: 0s
      maxEmptyBulkDelete: 10
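
To get a feel for what these values mean in practice, here is a rough Python sketch that parses the duration strings and estimates how long a drained node might sit before the autoscaler picks it for removal. The model is an assumption: it treats the delay as scaleDownUnneededTime plus one scanInterval, and it ignores pod eviction and the time needed to terminate the machine. The function names are hypothetical.

```python
import re
from typing import Mapping

# Seconds per unit for the duration strings used above (e.g. "3m0s").
_UNIT_SECONDS = {"h": 3600, "m": 60, "s": 1}


def parse_duration(text: str) -> int:
    """Convert a duration such as '30s' or '6m0s' to whole seconds."""
    parts = re.findall(r"(\d+)([hms])", text)
    # Reject input that contains anything besides number+unit pairs.
    if not parts or "".join(num + unit for num, unit in parts) != text:
        raise ValueError(f"unsupported duration: {text!r}")
    return sum(int(num) * _UNIT_SECONDS[unit] for num, unit in parts)


def rough_removal_delay(settings: Mapping[str, str]) -> int:
    """Assumed model: unneeded time plus one scan interval, in seconds."""
    return (
        parse_duration(settings["scaleDownUnneededTime"])
        + parse_duration(settings["scanInterval"])
    )


if __name__ == "__main__":
    settings = {"scaleDownUnneededTime": "30s", "scanInterval": "10s"}
    print(rough_removal_delay(settings), "seconds (rough estimate)")
```

Under this assumed model, the values above put the delay well under a minute, which is why short scale-down settings can stand in for an explicit node deletion during a maintenance window.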