aws / karpenter-provider-aws

Karpenter is a Kubernetes Node Autoscaler built for flexibility, performance, and simplicity.
https://karpenter.sh
Apache License 2.0

Scale node storage based on pod ephemeral-storage requests #2394

Open dewjam opened 2 years ago

dewjam commented 2 years ago

Tell us about your request What do you want us to build? Enable Karpenter to dynamically size the block device attached to a node at provision time. The size of the block device would be based on the sum of the ephemeral-storage requests of the pods being bin-packed onto the node, plus some overhead.
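For context, the ephemeral-storage requests in question are the standard Kubernetes pod resource requests; the pod below is a hypothetical example:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: example-app        # hypothetical name
spec:
  containers:
    - name: app
      image: public.ecr.aws/docker/library/busybox:1.36
      resources:
        requests:
          # Karpenter would sum these requests across all pods
          # bin-packed onto the node, add overhead, and size the
          # node's block device accordingly.
          ephemeral-storage: 10Gi
```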

Tell us about the problem you're trying to solve. What are you trying to do, and why is it hard? Currently, a node's storage capacity can be defined in the Karpenter Provisioner through Block Device Mappings. This works well, but it forces customers to define a single static value for all instances launched through a given provisioner. Customers would like the ability to scale node storage dynamically, based on the pod workload or by instance type.

Are you currently working around this issue? This can be worked around by defining Block Device Mappings in the Karpenter Provisioner. These values are static for a given provisioner, however, and cannot be scaled up or down dynamically.
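For reference, a minimal sketch of that static workaround, assuming the later v1beta1 EC2NodeClass API (the Provisioner-era AWSNodeTemplate exposed an equivalent blockDeviceMappings field); the name, size, and volume type are placeholders:

```yaml
apiVersion: karpenter.k8s.aws/v1beta1
kind: EC2NodeClass
metadata:
  name: static-storage        # hypothetical name
spec:
  amiFamily: AL2
  # role, subnetSelectorTerms, and securityGroupSelectorTerms omitted for brevity
  blockDeviceMappings:
    - deviceName: /dev/xvda
      ebs:
        # Every node launched from this class gets the same root volume,
        # regardless of the ephemeral-storage requests of the pods
        # packed onto it.
        volumeSize: 100Gi
        volumeType: gp3
        deleteOnTermination: true
```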

Related issues:

#2512
#2298
#1995
#1467
#3077
#3111


cx-IliyaY commented 1 year ago

Is there any progress on this? The only answer is this link, but it's dead now (404): Block Device Mappings

tzneal commented 1 year ago

Those docs are now here, but there hasn't been any update.

jonathan-innis commented 1 year ago

Is there any progress on this?

No current, active progress. But it's considered "v1" scope, which means we're planning to work on it as part of the v1 release of Karpenter. It's definitely on our list of priorities, but the maintainer team has been somewhat time-constrained lately, working on other feature work and stability improvements.

jagadeesh-kancherla-tfs commented 7 months ago

+1

pragmaticivan commented 4 months ago

This would reduce some alerting for NodePools sharing the same StorageClass with static storage values.

Instance sizes from 2xl to 16xl may need substantially different amounts of storage due to multiple factors, including Docker image pulls and volumes.

Any chance this would get a bump in priority?

Smana commented 3 months ago

Are there any updates on this issue? We currently need to constrain provisioning to NVMe instance types and prepare the RAID0 array ourselves (the RAID0 setup is sketched below):

```yaml
# Provisioner/NodePool requirement: restrict to instance types with
# more than 100 GiB of local NVMe instance storage.
- key: karpenter.k8s.aws/instance-local-nvme
  operator: Gt
  values: ["100"]
```

Otherwise, some pods run out of ephemeral disk space, which leads to pods being evicted with node DiskPressure errors.
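For the RAID0 part, newer Karpenter versions expose an instanceStorePolicy field on the EC2NodeClass that assembles local instance-store disks into a single RAID0 array; a minimal sketch, with hypothetical names and selector tags:

```yaml
apiVersion: karpenter.k8s.aws/v1beta1
kind: EC2NodeClass
metadata:
  name: local-nvme                            # hypothetical name
spec:
  amiFamily: AL2
  role: KarpenterNodeRole-my-cluster          # hypothetical role
  subnetSelectorTerms:
    - tags:
        karpenter.sh/discovery: my-cluster    # hypothetical tag
  securityGroupSelectorTerms:
    - tags:
        karpenter.sh/discovery: my-cluster
  # Combine the node's local NVMe instance-store disks into one RAID0
  # array that backs kubelet's ephemeral storage.
  instanceStorePolicy: RAID0
```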

zakariais commented 2 months ago

Is there any progress on this?

MedAzizTousli commented 1 week ago

Any progress on this?