kubernetes-sigs / karpenter

Karpenter is a Kubernetes Node Autoscaler built for flexibility, performance, and simplicity.
Apache License 2.0

Borrow Mechanism Between NodePools in Karpenter #1703

Open woehrl01 opened 1 week ago

woehrl01 commented 1 week ago

Summary: Introduce a "borrow" mechanism between NodePools in Karpenter, inspired by Kueue's Cohort borrowing functionality. This feature would allow a NodePool to borrow CPU cores (or other resources) from another NodePool, optimizing the utilization of reserved compute instances and improving overall flexibility.

Background: Karpenter significantly improves workload efficiency by automatically provisioning nodes that meet pod requirements and deprovisioning nodes when they are no longer needed. However, in environments with multiple NodePools (such as those containing both on-demand and reserved instances), it would be beneficial to allow NodePools to share underutilized resources. This would enable more efficient use of reserved instances, ensuring that the resources already allocated are fully leveraged before new nodes are provisioned.

Proposed Solution: Implement a feature similar to Kueue’s "borrow" mechanism, where NodePools can borrow unused CPU cores or other resources from other NodePools. To map the concepts, a Kueue "ClusterQueue" corresponds to a NodePool in Karpenter.

  1. Cohort of NodePools: Allow grouping of NodePools into a cohort. NodePools in the same cohort should be able to borrow resources (e.g., CPU cores) from each other.

  2. Borrowing Semantics: When a NodePool runs out of its allocated resources, it should be able to borrow unused resources from another NodePool in the same cohort:

    • Karpenter should attempt to provision workloads within the assigned quota of a NodePool first.
    • If resources are exhausted, it should try to borrow from unused quota in other NodePools within the cohort.
    • A NodePool can only borrow resources it is configured to use, and borrowing should be limited by predefined thresholds (similar to Kueue’s borrowingLimit).

  3. Resource Prioritization: Workloads that fit within a NodePool's nominal quota should take priority over workloads that need to borrow, so that borrowing remains a secondary measure. If multiple workloads require borrowing, prioritize them by workload priority and then by creation timestamp, similar to Kueue's approach.
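
The three points above could be sketched roughly as follows. This is only an illustration in Python, not Karpenter or Kueue code: the `NodePool` and `Workload` classes, the `nominal_cpu` and `borrowing_limit` fields, and the `try_borrow` and `schedule` functions are all hypothetical names, CPU cores are the only resource modeled, and the cohort is simply the set of pools passed in.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass
class NodePool:
    # Hypothetical quota model; these fields are not part of Karpenter's API.
    name: str
    nominal_cpu: int       # cores usable without borrowing
    borrowing_limit: int   # max cores borrowable (cf. Kueue's borrowingLimit)
    used_cpu: int = 0      # cores in use, including borrowed ones


@dataclass
class Workload:
    name: str
    pool: str      # name of the NodePool the workload targets
    cpu: int
    priority: int  # larger means more important
    created: int   # creation timestamp (smaller is older)


def try_borrow(pool: NodePool, cohort: Iterable[NodePool], cpu: int) -> bool:
    """Admit `cpu` cores beyond the pool's nominal quota if the cohort allows it."""
    cohort = list(cohort)
    borrowed_after = pool.used_cpu + cpu - pool.nominal_cpu
    if borrowed_after > pool.borrowing_limit:
        return False  # would exceed this pool's borrowing threshold
    total_used = sum(p.used_cpu for p in cohort)
    total_nominal = sum(p.nominal_cpu for p in cohort)
    if total_used + cpu > total_nominal:
        return False  # no unused quota left anywhere in the cohort
    pool.used_cpu += cpu
    return True


def schedule(pending: List[Workload], pools: Dict[str, NodePool]) -> List[str]:
    """Return the names of admitted workloads, in admission order."""
    # Higher priority first; ties go to the older workload.
    ordered = sorted(pending, key=lambda w: (-w.priority, w.created))
    admitted, needs_borrowing = [], []
    # Pass 1: workloads that fit within their pool's nominal quota.
    for w in ordered:
        pool = pools[w.pool]
        if pool.used_cpu + w.cpu <= pool.nominal_cpu:
            pool.used_cpu += w.cpu
            admitted.append(w.name)
        else:
            needs_borrowing.append(w)
    # Pass 2: borrowing is a secondary measure for everything else.
    for w in needs_borrowing:
        if try_borrow(pools[w.pool], pools.values(), w.cpu):
            admitted.append(w.name)
    return admitted


pools = {
    "reserved": NodePool("reserved", nominal_cpu=16, borrowing_limit=0, used_cpu=4),
    "on-demand": NodePool("on-demand", nominal_cpu=8, borrowing_limit=8, used_cpu=8),
}
pending = [
    Workload("a", "on-demand", cpu=4, priority=1, created=1),
    Workload("b", "on-demand", cpu=8, priority=5, created=2),
    Workload("c", "reserved", cpu=2, priority=0, created=3),
]
admitted = schedule(pending, pools)
print(admitted)  # ['c', 'b']
```

In this run, workload c is admitted first because it fits within the reserved pool's nominal quota. Workload b then borrows 8 cores, which uses up the on-demand pool's borrowing limit, so workload a stays pending.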

Reference: For details on Kueue’s borrowing semantics, refer to the Kueue documentation.

Use Case: This feature would be especially useful for environments with a mix of on-demand and reserved instances. For example, a NodePool running on reserved instances could lend unused CPU cores to an on-demand NodePool, reducing the need to provision additional on-demand nodes when reserved capacity is available. This could lower costs and improve utilization of capacity that has already been paid for.
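
To make the use case concrete, here is a small back-of-the-envelope sketch in Python. The function name `on_demand_cores_to_provision` and all of the numbers are made up for illustration; they are not measurements, and nothing here reflects existing Karpenter behavior.

```python
def on_demand_cores_to_provision(demand: int, reserved_unused: int,
                                 borrowing_limit: int = 0) -> int:
    """Cores that still have to come from newly provisioned on-demand nodes.

    demand:           cores requested by pending pods in the on-demand NodePool
    reserved_unused:  unused cores in the reserved NodePool of the same cohort
    borrowing_limit:  hypothetical cap on borrowing (0 means borrowing is off)
    """
    borrowed = min(demand, reserved_unused, borrowing_limit)
    return demand - borrowed


# Without borrowing, all 12 requested cores need new on-demand capacity.
print(on_demand_cores_to_provision(12, reserved_unused=10))                     # 12
# With a borrowing limit of 8, only 4 cores need new on-demand capacity.
print(on_demand_cores_to_provision(12, reserved_unused=10, borrowing_limit=8))  # 4
```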

k8s-ci-robot commented 1 week ago

This issue is currently awaiting triage.

If Karpenter contributors determine this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes-sigs/prow](https://github.com/kubernetes-sigs/prow/issues/new?title=Prow%20issue:) repository.