njtran opened this issue 1 year ago (Open)
I really like the idea of this issue. It will solve the spread-out behavior of the default scheduler when new pods are continuously added while there is a lot of idle capacity. With the described behavior, Karpenter would taint some of the nodes with PreferNoSchedule, causing the scheduler to bin-pack the new pods onto the remaining nodes instead of distributing them across all underutilized nodes. I hope there will be policies or configurations in place that allow Karpenter to identify nodes as disruption candidates even though they are still running some small jobs which cannot be evicted.
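To make this concrete, here is a minimal Go sketch (using client-go) of adding a soft PreferNoSchedule taint to a node; the taint key and node name are made up for illustration and are not Karpenter's actual taint or implementation:

```go
package main

import (
	"context"
	"log"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
)

func main() {
	// Hypothetical sketch: add a soft PreferNoSchedule taint to a consolidation
	// candidate so the default scheduler prefers packing new pods onto other
	// nodes. The taint key and node name are placeholders, not Karpenter's real
	// taint.
	cfg, err := rest.InClusterConfig()
	if err != nil {
		log.Fatal(err)
	}
	client := kubernetes.NewForConfigOrDie(cfg)

	node, err := client.CoreV1().Nodes().Get(context.TODO(), "ip-10-0-0-1.ec2.internal", metav1.GetOptions{})
	if err != nil {
		log.Fatal(err)
	}
	node.Spec.Taints = append(node.Spec.Taints, corev1.Taint{
		Key:    "example.com/consolidation-candidate", // placeholder key
		Effect: corev1.TaintEffectPreferNoSchedule,
	})
	if _, err := client.CoreV1().Nodes().Update(context.TODO(), node, metav1.UpdateOptions{}); err != nil {
		log.Fatal(err)
	}
}
```

Because PreferNoSchedule is only a preference, the scheduler can still fall back to the tainted node when nothing else fits, which is what makes it reasonable for nodes that are merely candidates rather than actively being drained.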
The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

- Mark this issue as fresh with /remove-lifecycle stale
- Close this issue with /close

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale
/remove-lifecycle stale
Please be sure to handle the use case where a Pod adds a "do-not-evict" annotation while it is already running on a Node. There will be an unavoidable race condition, of course, but it is important to realize that just because the Node is tainted, it does not mean that annotated Pods will not appear on it.

It would be good for my use case if there were a way for a Pod to be notified that Karpenter is considering consolidating its Node (i.e., when the NoSchedule taint is added), so it can immediately decide to either quit or annotate itself. That would give the Pod a head start in the race and avoid most, if not all, real-world mishaps.

One way to do this would be via another annotation, such as ok-to-disrupt or prefer-to-disrupt, that tells Karpenter to send the Pod some signal other than SIGTERM, which the Pod can respond to (and by default would ignore), when Karpenter considers the Node a likely consolidation target. This would have to happen after the Node is tainted, so that when the Pod quits and is immediately replaced with a new Pod by the Deployment, the new Pod does not get scheduled onto the same Node. We would also want a configurable delay between the taint and notification in step 3 and the actual termination in step 4, to be sure the Pod has enough time to respond and block termination.
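As a rough illustration of the pod side of this idea (none of this exists in Karpenter today), an opted-in workload could listen for an agreed-upon signal and exit early so its replacement lands on an untainted node; the choice of SIGUSR1 and the delivery mechanism are assumptions here:

```go
package main

import (
	"fmt"
	"os"
	"os/signal"
	"syscall"
	"time"
)

func main() {
	// Hypothetical pod-side handler for the notification idea above. It assumes
	// some agent would deliver SIGUSR1 to the process when Karpenter taints the
	// node as a likely consolidation target; neither the signal choice nor the
	// delivery mechanism exists in Karpenter today.
	notify := make(chan os.Signal, 1)
	signal.Notify(notify, syscall.SIGUSR1)

	for {
		select {
		case <-notify:
			// Decide quickly: exit so the replacement pod is scheduled onto a
			// non-tainted node, or (alternatively) annotate this pod via the
			// Kubernetes API before draining starts.
			fmt.Println("node flagged for consolidation; exiting early")
			os.Exit(0)
		case <-time.After(30 * time.Second):
			fmt.Println("still working")
		}
	}
}
```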
The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

- Mark this issue as fresh with /remove-lifecycle stale
- Close this issue with /close

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

- Mark this issue as fresh with /remove-lifecycle rotten
- Close this issue with /close

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten
/remove-lifecycle rotten
Description
What problem are you trying to solve? Karpenter has driven disruption of nodes through annotations and processes maintained in memory.
Karpenter should instead drive disruption through its own taint mechanism(s) while it discovers and executes disruption actions.
This issue proposes that each node owned by Karpenter will be in one of four states, corresponding to the taint mechanisms tracked in the related issues below (a rough sketch follows the list):
Related Issues:
- [ ] PreferNoSchedule/NoSchedule when a node is marked as Drifted: https://github.com/aws/karpenter-core/issues/623
- [ ] PreferNoSchedule/NoSchedule when a node is marked as Expired: https://github.com/aws/karpenter-core/issues/622
- [x] Add a NoSchedule taint when Karpenter begins executing a disruption action: https://github.com/aws/karpenter-core/pull/508
- [ ] NoExecute when a node is terminating to clean up daemonsets: https://github.com/aws/karpenter-core/issues/621
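For illustration only, here is a minimal Go sketch of how the conditions above might map onto taint effects; the keys are placeholders rather than Karpenter's real taint keys, and the exact effects are decided by the linked issues and PR, not by this sketch:

```go
package main

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
)

func main() {
	// Illustration only: one possible mapping of the conditions above to taint
	// effects. The keys are placeholders, and Drifted/Expired could end up using
	// NoSchedule rather than PreferNoSchedule depending on how the linked issues
	// are resolved.
	taints := map[string]corev1.Taint{
		"drifted":     {Key: "example.com/drifted", Effect: corev1.TaintEffectPreferNoSchedule},
		"expired":     {Key: "example.com/expired", Effect: corev1.TaintEffectPreferNoSchedule},
		"disrupting":  {Key: "example.com/disrupting", Effect: corev1.TaintEffectNoSchedule},
		"terminating": {Key: "example.com/terminating", Effect: corev1.TaintEffectNoExecute},
	}
	for state, t := range taints {
		fmt.Printf("%-12s %s:%s\n", state, t.Key, t.Effect)
	}
}
```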
Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request.
Please do not leave "+1" or "me too" comments; they generate extra noise for issue followers and do not help prioritize the request.
If you are interested in working on this issue or have submitted a pull request, please leave a comment.