kubernetes-sigs / karpenter

Karpenter is a Kubernetes Node Autoscaler built for flexibility, performance, and simplicity.
Apache License 2.0

Set System Reserved as percentage #1595

Open rrmarq opened 2 weeks ago

rrmarq commented 2 weeks ago

Description

What problem are you trying to solve? Dynamic overhead calculation for OS resource reservation.

How important is this feature to you? We had problems with instances running out of capacity until the kubelet was killed by the OS, leaving behind a "zombie node". To avoid this, we started reserving resources for the OS by calculating a percentage based on the instance size and setting the value in the userData. However, setting the value in the userData can make Karpenter calculate instance resources incorrectly, which leads to bad instance selections. An alternative is to set this value in nodepool.spec.template.spec.kubelet, but that option only accepts static values. Since we have nodepools with different instance sizes, we would like to set the overhead as a percentage without having to do it via userData.
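
To illustrate the request, here is a minimal sketch of the arithmetic we have in mind. It is not Karpenter code or an existing API: the function names `reserved_mib` and `allocatable_mib`, the 5% figure, and the optional floor are all hypothetical and only show how a percentage would scale with instance size where a static value does not.

```python
def reserved_mib(capacity_mib: int, percent: int, floor_mib: int = 0) -> int:
    """Memory to reserve for the OS, as a percentage of instance capacity.

    Hypothetical helper: `percent` is a whole number between 0 and 100,
    and `floor_mib` is an assumed lower bound for very small instances.
    """
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    # Integer arithmetic keeps the result in whole MiB (rounded down).
    return max(floor_mib, capacity_mib * percent // 100)


def allocatable_mib(capacity_mib: int, percent: int, floor_mib: int = 0) -> int:
    """Capacity left for pods after the percentage-based reservation."""
    return capacity_mib - reserved_mib(capacity_mib, percent, floor_mib)


if __name__ == "__main__":
    # Compare a percentage-based reservation across two instance sizes.
    for capacity in (8192, 65536):
        print(capacity, reserved_mib(capacity, 5), allocatable_mib(capacity, 5))
```

With a static 1024 MiB reservation, an 8 GiB node gives up 12.5% of its memory while a 64 GiB node gives up about 1.6%. A percentage keeps that ratio constant across nodepools with different instance sizes.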

k8s-ci-robot commented 2 weeks ago

This issue is currently awaiting triage.

If Karpenter contributors determine this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes-sigs/prow](https://github.com/kubernetes-sigs/prow/issues/new?title=Prow%20issue:) repository.

bnu0 commented 1 week ago

We'd love to be able to make these a percentage. We use reserved memory to ensure sufficient burst capacity across our fleet, and achieve higher density by keeping requests accurate. The static nature of these settings means we cannot allocate instances fully dynamically, as the values do not make sense for significantly larger or smaller nodes.