nclaeys closed this 2 years ago
@nclaeys thanks for contributing. I appreciate this. I just want to understand what exactly this change helps you with. Tainting nodes with TaintEffectNoSchedule will have no effect on pods that are already running on that node. How can this taint help you identify pod termination reasons?
Hi Maksim,
I want to thank you for providing the node-termination-handler; it looks very good and, from initial testing, does a great job. It was one piece we were still missing in Azure compared to our AWS offering (https://github.com/aws/aws-node-termination-handler).
Well, our platform basically monitors batch jobs. We have Kubernetes operators to manage them. If a node gets preempted and all the pods get drained, they receive a SIGTERM and some of them fail because of it. We want to show users the exact reason why their jobs failed. We do this in our operator by mapping the different failure states of our pods/containers to a more human-friendly error (translating mounting errors, IAM errors, pending pods, ...). As part of this we look up information about the node, and if it has been tainted due to preemption, we know that is why the job was killed.
It is important to give users this information, since it allows them to decide whether their workloads can run on spot nodes (cheaper), accepting that they may sometimes fail due to preemption. If that is too big an issue, they can run their jobs on on-demand nodes only.
So actually the most important thing is that the node is tainted with a specific key, not the effect of the taint. It helps to distinguish node drains due to, for example, a Kubernetes version upgrade from the preemption of spot nodes.
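As a rough illustration, our operator's lookup works roughly like the sketch below (assuming client-go; the taint key here is a placeholder, not necessarily the one the handler uses):

```go
package main

import (
	"context"
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
)

// preemptionTaintKey is a hypothetical key name chosen for this sketch.
const preemptionTaintKey = "example.com/spot-preemption"

// nodeWasPreempted fetches the node a failed pod ran on and checks
// whether it carries the preemption taint.
func nodeWasPreempted(ctx context.Context, client kubernetes.Interface, nodeName string) (bool, error) {
	node, err := client.CoreV1().Nodes().Get(ctx, nodeName, metav1.GetOptions{})
	if err != nil {
		return false, err
	}
	for _, taint := range node.Spec.Taints {
		if taint.Key == preemptionTaintKey {
			return true, nil
		}
	}
	return false, nil
}

func main() {
	config, err := rest.InClusterConfig()
	if err != nil {
		panic(err)
	}
	client := kubernetes.NewForConfigOrDie(config)

	// Node name would normally come from the failed pod's spec.nodeName.
	preempted, err := nodeWasPreempted(context.Background(), client, "aks-nodepool1-12345678-vmss000000")
	if err != nil {
		panic(err)
	}
	fmt.Println("node was preempted:", preempted)
}
```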
I hope this gives you some more insight. If you have any more questions, just shoot.
Thanks for the details. Sounds great. In your case this information could be stored as a node annotation, but a node taint will also be a great option.
Niels, what about adding new configuration flags -taint-node=false and -taint-effect=NoSchedule that would handle adding the taint and its effect before draining the node?
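As a minimal sketch of what those flags could look like (using the standard flag package; the handler's actual configuration code may be structured differently):

```go
package main

import (
	"flag"
	"fmt"
)

// Sketch of the two proposed flags with the defaults discussed here.
var (
	taintNode = flag.Bool("taint-node", false,
		"taint the node before draining it")
	taintEffect = flag.String("taint-effect", "NoSchedule",
		"effect of the taint: NoSchedule or NoExecute")
)

func main() {
	flag.Parse()
	fmt.Printf("taint-node=%t, taint-effect=%s\n", *taintNode, *taintEffect)
}
```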
It makes sense to add the taint-node parameter indeed. I could add taint-effect as well, but to me this looks less useful; is there any specific reason why you want people to be able to specify it? Right after the taint you cordon the node and drain all the pods anyway, so the job will be killed regardless.
It would be great to have a customizable taint-effect: it would allow some users to also terminate DaemonSet pods with the NoExecute taint effect. The default will be NoSchedule.
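To make the difference concrete, here is a small sketch of building such a taint with a configurable effect (the key name is hypothetical):

```go
package main

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
)

// makeTaint builds the taint to apply before draining. The key is a
// placeholder for this sketch, not necessarily what the handler uses.
func makeTaint(effect corev1.TaintEffect) corev1.Taint {
	return corev1.Taint{
		Key:    "example.com/spot-preemption", // hypothetical key
		Effect: effect,
	}
}

func main() {
	// NoSchedule only blocks new pods from being scheduled on the node;
	// NoExecute also evicts pods that are already running, including
	// DaemonSet pods that do not tolerate the taint.
	fmt.Println(makeTaint(corev1.TaintEffectNoSchedule))
	fmt.Println(makeTaint(corev1.TaintEffectNoExecute))
}
```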
Alright fine, I will update the PR to add both. Thanks for the feedback!
@nclaeys great job! Thanks for contributing. Your changes are released in https://github.com/maksim-paskal/aks-node-termination-handler/releases/tag/v1.0.0
This makes it possible to detect that pods were killed due to spot termination, for example. Without tainting the node, you have no way of knowing why a job/pod was killed. We use this information to enrich the error messages shown to the user.