Closed by mpw96 10 months ago
/close Not required at the moment by the stakeholders.
The reason is that you can work around the initial problem with a configuration along these lines (e.g.):
clusterAutoscaler:
  scaleDownDelayAfterAdd: 0s
  scaleDownDelayAfterDelete: 0s
  scaleDownDelayAfterFailure: 3m0s
  scaleDownUnneededTime: 30s
  scaleDownUtilizationThreshold: 0.5
  scanInterval: 10s
  expander: priority
  maxNodeProvisionTime: 6m0s
  maxGracefulTerminationSeconds: 600
  newPodScaleUpDelay: 0s
  maxEmptyBulkDelete: 10
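For context, a minimal sketch of where such settings live in a Gardener Shoot manifest, assuming the Shoot API of a recent Gardener version; the shoot name, namespace, and values below are illustrative placeholders, not a recommendation:

```yaml
# Sketch of a Shoot fragment; metadata names and values are placeholders.
apiVersion: core.gardener.cloud/v1beta1
kind: Shoot
metadata:
  name: my-shoot               # hypothetical
  namespace: garden-my-project # hypothetical
spec:
  kubernetes:
    clusterAutoscaler:
      scaleDownUnneededTime: 30s
      scaleDownUtilizationThreshold: 0.5
      scanInterval: 10s
      expander: priority
```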
Which component are you using?: cluster-autoscaler
Is your feature request designed to solve a problem? If so describe the problem this feature should solve.: During a maintenance window I need to quickly move the workload away from a worker group with minimal service disruption for the customer, while also controlling the resource overhead. I want to taint (all nodes of) a worker group and drain a couple of them; once draining is finished, I'd like to drain the next chunk, and so on.
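A rough sketch of that manual taint-and-drain flow, assuming the nodes of the worker group carry the usual Gardener pool label worker.gardener.cloud/pool (the pool name, node names, and taint key are placeholders):

```sh
# Keep new pods off the worker group (pool name and taint key are placeholders).
kubectl taint nodes -l worker.gardener.cloud/pool=worker-a maintenance=true:NoSchedule

# Drain the first chunk of nodes, then repeat with the next chunk.
kubectl drain node-1 node-2 --ignore-daemonsets --delete-emptydir-data
```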
Describe the solution you'd like.: I'd like to be able (after draining a node) to delete it, or have "someone" delete it for me. The most natural solution for me would be to simply allow
kubectl delete node some-node
without a replacement node being created after some time. At the moment this does not work (see the FAQs of the cluster autoscaler and the machine-controller-manager).
Describe any alternative solutions you've considered.: Another solution would be something like setting an annotation on the node resource that tells Gardener that I want this node deleted (see the hypothetical sketch below).
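Purely to illustrate the annotation-based alternative: a hypothetical example in which the annotation key is invented for this issue and does not exist today:

```sh
# Hypothetical annotation, NOT an existing Gardener/MCM API:
# mark a drained node so that Gardener removes it without creating a replacement.
kubectl annotate node some-node node.gardener.cloud/delete-without-replacement="true"
```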
Additional context.: I already had a discussion about this with @dguendisch, @rishabh-11, and others.