gardener / machine-controller-manager

Declarative way of managing machines for Kubernetes cluster
Apache License 2.0

Autoscaler interfering in meltdown scenario solution #741

Open himanshu-kun opened 2 years ago

himanshu-kun commented 2 years ago

How to categorize this issue?

/area performance /kind bug /priority 2

What happened: The autoscaler's fixNodeGroupSize logic interferes with the meltdown logic. During a meltdown we remove only maxReplacement machines per machinedeployment, but fixNodeGroupSize causes the other Unknown machines to be removed as well.

What you expected to happen: Even when the autoscaler decides on DecreaseTargetSize, it should not be able to remove Unknown machines, because the node object is still present for them.

How to reproduce it (as minimally and precisely as possible):

Anything else we need to know?: This happens because of the way the machineSet controller prioritizes machines for deletion based on their status: https://github.com/gardener/machine-controller-manager/blob/d7e3c5dffeb33abe2c30b435075fb050301da4fa/pkg/controller/controller_utils.go#L769-L776

As a solution, we need to look into any other implications of prioritizing Pending machines over Unknown machines for deletion.
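To make the ordering concrete, here is a minimal sketch of a deletion sort that checks the priority annotation first and the machine phase second, as described in this issue. The `Machine` type, the `phaseRank` values, and `SortForDeletion` are hypothetical names for illustration, not the actual MCM code; the only things taken from this thread are that the priority annotation wins over the phase and that Unknown currently sorts ahead of Pending. That a lower priority value means earlier deletion is an assumption.

```go
package main

import (
	"fmt"
	"sort"
)

// Machine is a hypothetical, trimmed-down stand-in for an MCM machine object.
type Machine struct {
	Name     string
	Phase    string // e.g. "Unknown", "Pending", "Running"
	Priority int    // value of the priority annotation; assumed: lower is deleted first
}

// phaseRank is an assumed ranking: a lower rank is deleted first.
// Unknown ranks before Pending, which is the behaviour this issue describes.
var phaseRank = map[string]int{
	"Failed":  0,
	"Unknown": 1,
	"Pending": 2,
	"Running": 3,
}

// SortForDeletion orders machines so that index 0 is the first deletion candidate.
// The priority annotation is compared first; the phase only breaks ties.
func SortForDeletion(ms []Machine) {
	sort.SliceStable(ms, func(i, j int) bool {
		if ms[i].Priority != ms[j].Priority {
			return ms[i].Priority < ms[j].Priority
		}
		return phaseRank[ms[i].Phase] < phaseRank[ms[j].Phase]
	})
}

func main() {
	// Equal priorities: the phase decides, so the Unknown machine goes first.
	ms := []Machine{
		{Name: "m-running", Phase: "Running", Priority: 3},
		{Name: "m-pending", Phase: "Pending", Priority: 3},
		{Name: "m-unknown", Phase: "Unknown", Priority: 3},
	}
	SortForDeletion(ms)
	fmt.Println(ms[0].Name) // m-unknown

	// A lower priority annotation on the Pending machine overrides the phase.
	ms[1].Priority = 1 // ms[1] is m-pending after the first sort
	SortForDeletion(ms)
	fmt.Println(ms[0].Name) // m-pending
}
```

In this sketch, swapping the ranks of "Unknown" and "Pending" in `phaseRank` is the change under discussion: with equal priorities, the Pending machine would then be picked first.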

Environment:

himanshu-kun commented 2 years ago

cc @unmarshall

himanshu-kun commented 1 year ago

fixNodeGroupSize only understands Registered and Non-Registered nodes; it does nothing even if a node joins and stays NotReady for a long time. Here the intention of fixNodeGroupSize was to remove the Pending machine, but because of our preference in the machineSet controller, the Unknown machine is removed instead. Currently fixNodeGroupSize only acts when the RemoveLongUnregistered logic cannot remove the long-unregistered nodes because the node group is already at its minimum size. Also, if we are not at the minimum node group size, the RemoveLongUnregistered logic wouldn't delete the Unknown machine, because it uses the priority annotation to pinpoint the machine it wants to delete, and in the machineSet preference the priority annotation is preferred over the machine phase.

himanshu-kun commented 1 year ago

Prioritizing Pending machine removal over Unknown would make more sense because:

himanshu-kun commented 1 year ago

Also, if we are not at the minimum node group size, the RemoveLongUnregistered logic wouldn't delete the Unknown machine, because it uses the priority annotation to pinpoint the machine it wants to delete, and in the machineSet preference the priority annotation is preferred over the machine phase.

But the RemoveLongUnregistered/RemoveOldUnregistered logic will remove the Pending machine once the autoscaler's maxNodeProvisionTimeout runs out, because it considers the machine long unregistered. This starts a loop: the machine deployment size is reduced, the meltdown logic again turns maxReplacement machines into Pending, and so on. The loop only stops when the node group's minimum size is reached, because at that point the RemoveLongUnregistered logic stops removing machines.
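The loop described above can be illustrated with a toy model. This is only a sketch under simplifying assumptions: every machine in the deployment is Unknown, each round the meltdown logic turns at most maxReplacement machines into Pending, and the autoscaler then removes those Pending machines after the timeout as long as the size stays at or above the minimum. `SimulateLoop` and its parameters are hypothetical names, not part of MCM or the autoscaler.

```go
package main

import "fmt"

// SimulateLoop is a toy model of the shrink loop. It returns the machine
// deployment size at the start and after each round until the minimum is reached.
func SimulateLoop(replicas, minSize, maxReplacement int) []int {
	sizes := []int{replicas}
	if maxReplacement <= 0 {
		return sizes // nothing is ever replaced, so nothing becomes Pending
	}
	for replicas > minSize {
		// Meltdown logic: at most maxReplacement machines become Pending.
		pending := maxReplacement
		// Autoscaler: Pending machines are treated as long unregistered after
		// the provision timeout and removed, but never below the minimum size.
		if removable := replicas - minSize; pending > removable {
			pending = removable
		}
		replicas -= pending
		sizes = append(sizes, replicas)
	}
	return sizes
}

func main() {
	// 5 machines, minimum size 2, one replacement per round.
	fmt.Println(SimulateLoop(5, 2, 1)) // [5 4 3 2]
}
```

In this model the deployment always shrinks to the minimum size, one batch of maxReplacement machines at a time, which is the behaviour the comment above describes.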

The ideal solution is to make the autoscaler aware that the meltdown logic is in play because of an outage in the zone, so that it does not interfere.