[EKS] [bug]: auto-scaling group ends up in a bad state after `kubectl delete node`

Aleksei-Poliakov commented 2 years ago

Community Note

Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
If you are interested in working on this issue or have submitted a pull request, please leave a comment

Tell us about your request: Currently, whenever the `kubectl delete node` command is run in the cluster, the node is removed from k8s, but the EC2 instance behind the node is not terminated. As a result, the AWS auto-scaling group behind the k8s node group does not create a new EC2 instance, which also breaks things like the cluster auto-scaler.

An example would look like this:

In the initial setup you have an ASG with 5 EC2 instances (desired size is 5), all onboarded as nodes in the k8s cluster.
The `kubectl delete node` command runs in the cluster, removing a single node.
The ASG still has "desired size = 5", yet opening the "nodes" tab you can see only 4 nodes.
Since a node was removed, the auto-scaling controller may decide to ask for an additional node to be created (e.g. to handle scale-up).
Yet this request would not be handled by the ASG, because according to it there are already 5 instances available.
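
The scenario above can be modelled roughly as follows. This is only a toy sketch: `ToyAsg` and its fields are made-up names, and the real ASG reconciliation logic is more involved. It just illustrates that the group counts EC2 instances, not Kubernetes nodes.

```python
from dataclasses import dataclass


@dataclass
class ToyAsg:
    """Simplified stand-in for an Auto Scaling group (not the real AWS logic)."""
    desired: int
    running_instances: int

    def instances_to_launch(self) -> int:
        # The group compares desired capacity with its own EC2 instance count;
        # it has no view of Kubernetes Node objects.
        return max(0, self.desired - self.running_instances)


asg = ToyAsg(desired=5, running_instances=5)
registered_nodes = 5

# `kubectl delete node` removes the Node object only; the instance keeps running.
registered_nodes -= 1

print(asg.instances_to_launch())  # 0: the ASG sees nothing to replace
print(registered_nodes)           # 4: the cluster is one node short
```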

The only way I know of to resolve the situation is to MANUALLY find the EC2 instance that is no longer mapped to a node in the k8s cluster and terminate it; the ASG then picks this up and resumes handling auto-scaler requests.
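
A minimal sketch of how the "find the unmapped instance" step might be automated, assuming nodes carry a `spec.providerID` of the usual AWS form `aws:///<zone>/<instance-id>`. The function names and the sample IDs are made up for illustration; fetching the real inputs is left out.

```python
from typing import Iterable, List, Optional


def instance_id_from_provider_id(provider_id: str) -> Optional[str]:
    """Extract 'i-...' from a providerID such as 'aws:///us-east-1a/i-0abc'."""
    last = provider_id.rsplit("/", 1)[-1]
    return last if last.startswith("i-") else None


def find_orphaned_instances(asg_instance_ids: Iterable[str],
                            node_provider_ids: Iterable[str]) -> List[str]:
    """Return ASG instance IDs that have no matching Kubernetes node."""
    backed = {instance_id_from_provider_id(p) for p in node_provider_ids}
    return sorted(i for i in asg_instance_ids if i not in backed)


# Made-up sample data; in practice the inputs could come from
# `aws autoscaling describe-auto-scaling-groups` and `kubectl get nodes -o json`.
asg_ids = ["i-0aaa", "i-0bbb", "i-0ccc"]
provider_ids = ["aws:///us-east-1a/i-0aaa", "aws:///us-east-1b/i-0ccc"]
orphans = find_orphaned_instances(asg_ids, provider_ids)
print(orphans)  # ['i-0bbb']
```

An instance that is still booting and has not joined the cluster yet would also show up in this list, so it is probably worth double-checking each candidate before terminating it.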

Which service(s) is this request for? EKS, ASG

Tell us about the problem you're trying to solve. What are you trying to do, and why is it hard? There is no particular need to use `kubectl delete node`, but having this behavior in the system is very dangerous. I ended up in this situation because I wanted to get rid of nodes that seemed to be poisoned (pods running on them were performing worse than pods of the same service running on all other nodes in the cluster). It turned out the issue was totally unrelated, but by running `kubectl delete node` I messed up the cluster and put it into a bad state that took a fair amount of effort to get to the bottom of.

Are you currently working around this issue? Yes, manually terminating the EC2 instance is a viable workaround.

Additional context: more details are available in:

This StackOverflow thread, where a different person stumbled on this problem before: https://stackoverflow.com/questions/57554812/my-nodes-got-deleted-in-eks-how-can-i-recover
This support request, which contains steps from a support engineer who reproduced the problem: https://support.console.aws.amazon.com/support/home?region=us-east-1#/case/?displayId=10511080081

nalshamaajc commented 1 year ago

Should we expect Managed Node Groups to be able to figure out that difference and cycle (terminate and create a new one) the deleted nodes?

Aleksei-Poliakov commented 1 year ago

I believe yes, the deleted node should be terminated in this scenario. To be clear - there are ways already in the ecosystem to safely remove an EC2 instance from the cluster by cordoning the node and then detaching it from the ASG; so if the user explicitly asked to delete a node - it seems totally reasonable that the EC2 instance behind it is also deleted, and most importantly the node group itself remains "healthy" (e.g. does not prevent scaling up).
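
As a rough sketch of the cordon-then-detach path mentioned above: the helper below only builds the command lines, it does not run them. The function name, node name, instance ID and ASG name are all hypothetical placeholders. `--no-should-decrement-desired-capacity` is chosen here on the assumption that a replacement instance is wanted.

```python
import shlex
from typing import List


def safe_removal_commands(node_name: str, instance_id: str,
                          asg_name: str) -> List[List[str]]:
    """Build (but do not execute) the commands for a cordon-then-detach removal."""
    return [
        # Stop new pods from being scheduled onto the node.
        ["kubectl", "cordon", node_name],
        # Detach the instance; keeping desired capacity unchanged lets the
        # ASG launch a replacement.
        ["aws", "autoscaling", "detach-instances",
         "--instance-ids", instance_id,
         "--auto-scaling-group-name", asg_name,
         "--no-should-decrement-desired-capacity"],
    ]


# Placeholder values, for illustration only.
commands = safe_removal_commands("ip-10-0-1-23.ec2.internal",
                                 "i-0123456789abcdef0",
                                 "my-node-group-asg")
for cmd in commands:
    print(shlex.join(cmd))
```

Each list can be passed to `subprocess.run` as-is once the values have been checked.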

nalshamaajc commented 1 year ago

Yes, there should; my question was more about AWS adding this feature, which I think could be optionally enabled.

wonko commented 12 months ago

Hitting the same issue here.

For completeness, a `kubectl delete node xxx` on either GCP or Azure will actually terminate the backing VM as well, allowing for complete node management from within Kubernetes.

davidr-bt commented 1 month ago

In our case, the culprit was a setting of ASG Desired = 1, where we had the described unwelcome behavior. It appears we did not have this behavior with Desired = 0.

aws / containers-roadmap

[EKS] [bug]: auto-scaling group ends up in a bad state after `kubectl delete node` #1811
