Karpenter's current documentation on disruption candidacy is relatively sparse; only the Control Flow section of the disruption docs covers it in any detail.
It would be good to add a dedicated section to the disruption docs for candidacy. Minimally, I think it should cover the following topics:
- What does being a candidate mean for the different disruption types?
  - Consolidation is an interesting case, since nodes don't need to be underutilized to be considered candidates. This may surprise users, and it is important to document because candidacy is surfaced as a metric.
- What conditions must be met for a node to be considered a candidate?
  - Document the general set of conditions as well as disruption-type-specific conditions (e.g. terminationGracePeriod overriding the do-not-disrupt and PDB checks for expiration and drift).
Note: the docs are still located in the AWS provider repo, but I'm opening this here since it has to do with an upstream concept.