aws / karpenter-provider-aws

Karpenter is a Kubernetes Node Autoscaler built for flexibility, performance, and simplicity.
https://karpenter.sh
Apache License 2.0

Ability to prefer generation over price #6721

Open InsomniaCoder opened 2 months ago

InsomniaCoder commented 2 months ago

Description

What problem are you trying to solve?

Hi, in our recent migration to Karpenter, we observed that for our generic x86 nodepool, Karpenter prefers m5(a) types over the newer m6 generation.

I understand that this is correct from the logic that it will select a cheaper type.

However, our users have noticed a performance degradation since we moved to Karpenter. The nodes were mostly m6i.4xlarge before, and given their autoscaling setup, the workload now ends up creating more pods (based on the documentation, m6 is 15% more efficient than m5).

So, in this case, what I envision is the ability to prefer price/performance, or to prefer a newer generation, rather than pure base price.

jigisha620 commented 2 months ago

Hi @InsomniaCoder, if they specifically want m6 instances, they can configure their nodepool requirements for instance-family:

- key: "karpenter.k8s.aws/instance-family"
  operator: In
  values: ["m6"]

Does this not work for them?

stevehipwell commented 2 months ago

You could also use the Gt operator to set a baseline.

- key: karpenter.k8s.aws/instance-generation
  operator: Gt
  values: ["5"]

InsomniaCoder commented 2 months ago

Hi @stevehipwell, @jigisha620,

Thanks for the answer; that's actually what I have done as a workaround: basically, creating another node pool that has the same configuration, with a higher weight, and specifying

- key: karpenter.k8s.aws/instance-generation
  operator: Gt
  values: ["5"]

But this means that in every node pool I have (ARM, x86, GPU, etc.) I will need to do the same if I want to prefer the 6th generation over the 5th for cost-performance reasons.
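For anyone following along, the two-pool workaround described above might look roughly like the sketch below. It assumes the `karpenter.sh/v1` NodePool API (older releases use `karpenter.sh/v1beta1` with a slightly different `nodeClassRef` shape). The pool names, the weights, and the `default` EC2NodeClass are hypothetical placeholders, not taken from this thread.

```yaml
# Sketch only. Names, weights, and the EC2NodeClass are hypothetical.
# Preferred pool: same settings as the base pool, but restricted to
# instance generations newer than 5 and given a higher weight.
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: x86-gen6-preferred        # hypothetical name
spec:
  weight: 100                     # higher weight is considered first
  template:
    spec:
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: default             # hypothetical EC2NodeClass
      requirements:
        - key: kubernetes.io/arch
          operator: In
          values: ["amd64"]
        - key: karpenter.k8s.aws/instance-generation
          operator: Gt
          values: ["5"]
---
# Fallback pool: identical except for the lower weight and the missing
# generation requirement, so older generations remain available.
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: x86-fallback              # hypothetical name
spec:
  weight: 10
  template:
    spec:
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: default             # hypothetical EC2NodeClass
      requirements:
        - key: kubernetes.io/arch
          operator: In
          values: ["amd64"]
```

The same pair would have to be repeated for each architecture or workload class, which is the duplication this comment is pointing at.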

Applying this requirement directly in every base nodepool with

- key: karpenter.k8s.aws/instance-generation
  operator: Gt
  values: ["5"]

would be limiting: if the 6th generation runs out, we could have difficulty scheduling pods.

I'm not sure if I explained the issue well, but IMO, if the 6th generation is preferred (in almost all cases AWS recommends the newer generation, right?), using only pure cost per unit will make Karpenter select the 5th generation in this case.

stevehipwell commented 2 months ago

@InsomniaCoder AFAIK this is currently the required pattern in Karpenter when you want to specifically choose an implementation as preferred over the default cost-based ordering. We have to do this for AMD64 instances so they're preferred over ARM64 when no specific architecture is provided, and so on-demand instances are preferred over spot (this is similar but not exactly the same).

InsomniaCoder commented 2 months ago

Totally understand the reason behind this. What I want to bring to this discussion is mostly to see whether anyone else has reported this price/performance issue.

This was triggered because we were running our fleet with a node group fixed to m6i.4xlarge. After we migrated to Karpenter, our users reported significant performance degradation, resulting in their autoscaler being spiky. When we checked, we saw that the fleet consisted mostly of m5i.4xlarge.

Machine for machine, m5 is cheaper, that's true. However, with less performance, users could end up scaling their pods more, ending up with more machines than with m6.
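As a rough break-even check on this point: assume the 15% figure quoted earlier translates directly into 15% more work per node (an assumption; real workloads will vary), and write p_5 and p_6 for the hourly prices of comparable m5 and m6 instances (symbols introduced here for illustration). Then m5 is cheaper per unit of work only when:

```latex
\frac{p_5}{1} < \frac{p_6}{1.15}
\quad\Longleftrightarrow\quad
\frac{p_5}{p_6} < \frac{1}{1.15} \approx 0.87
```

So under that assumption, m5 has to be more than about 13% cheaper than m6 to come out ahead. With a smaller price gap, selecting purely on base price ends up costing more for the same amount of work.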

We would be fine with this "workaround" of having two otherwise identical nodepools, with the 6th-generation-only one as the main priority, but I'm wondering if the idea is reasonable and if it's worth looking at a longer-term solution for this.