Open dpiddock opened 2 days ago
This issue is currently awaiting triage.
If Karpenter contributors determine this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.
The triage/accepted label can be added by org members by writing /triage accepted in a comment.
Description
Observed Behavior: Karpenter restores the startupTaints on a node if another controller removes them too quickly at node startup. This leaves the node unusable. The node also never reaches a ready state, so Karpenter refuses to remove it: Cannot disrupt Node: state node isn't initialized
From AWS CloudWatch Logs Insights:
Expected Behavior: Karpenter updates the existing taints on a node to remove karpenter.sh/unregistered=NoExecute without restoring startup taints that other controllers have already removed.
Reproduction Steps (Please include YAML): This is an unpredictable race condition that is nearly impossible to reproduce on demand. It might be related to this code: https://github.com/rschalo/karpenter/blob/a652a4aa95dbe92159bb273a3b64ff8837d92660/pkg/controllers/nodeclaim/lifecycle/registration.go#L87
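To make the expected behavior above concrete, here is a minimal Go sketch of the idea: start from the taints currently on the node and drop only the unregistered taint, rather than rebuilding the taint list from an earlier snapshot. The Taint struct, the removeTaintByKey helper, and the example.com/agent-not-ready and example.com/other keys are all hypothetical stand-ins for illustration. This is not Karpenter's actual code, and it does not claim to identify the cause of the race.

```go
package main

import "fmt"

// Taint is a simplified, hypothetical stand-in for the Kubernetes
// core/v1 Taint type, so this sketch has no external dependencies.
type Taint struct {
	Key    string
	Value  string
	Effect string
}

// Key of the taint that the issue expects Karpenter to remove.
const unregisteredTaintKey = "karpenter.sh/unregistered"

// removeTaintByKey (hypothetical helper) returns a copy of the node's
// current taints with every taint matching key removed. All other
// taints are preserved, and nothing that is absent from the current
// list is added back.
func removeTaintByKey(current []Taint, key string) []Taint {
	out := make([]Taint, 0, len(current))
	for _, t := range current {
		if t.Key != key {
			out = append(out, t)
		}
	}
	return out
}

func main() {
	// Assumed live node state: another controller has already removed
	// its startup taint (say, example.com/agent-not-ready), so only the
	// unregistered taint and an unrelated taint remain.
	live := []Taint{
		{Key: unregisteredTaintKey, Effect: "NoExecute"},
		{Key: "example.com/other", Effect: "NoSchedule"},
	}

	// Only the unregistered taint is dropped; the already-removed
	// startup taint does not reappear.
	fmt.Println(removeTaintByKey(live, unregisteredTaintKey))
}
```

In a real controller, the result would presumably still need to be written back in a way that detects a concurrent update (for example, a patch guarded by the object's resource version), since filtering a stale copy of the node would reintroduce the same problem.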
Versions:
Chart Version: 1.0.6
Kubernetes Version (kubectl version):
Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
If you are interested in working on this issue or have submitted a pull request, please leave a comment