kubernetes-sigs / karpenter

Karpenter is a Kubernetes Node Autoscaler built for flexibility, performance, and simplicity.

Limitation of node number per provisioner #732

Open sergkondr opened 1 year ago

sergkondr commented 1 year ago

Tell us about your request

It would be nice to have the ability to limit the number of nodes created by a given provisioner. For example:

  limits:
    nodes: "6"

Tell us about the problem you're trying to solve. What are you trying to do, and why is it hard?

Let's say our application spawns pods for some tasks, and each pod requires a separate node. Currently, this is limited by the maximum size of the node group. All these nodes are Spot instances, and sometimes no instances of a given family are available in the region, so we use several instance families: m4, m5, m5a, r5, r5a, etc. These instances can have different amounts of CPU and memory.

It would be nice to be able to limit nodes by their count, not by their resources. Granted, our app has a maximum number of pods, so this is a somewhat synthetic example.

Are you currently working around this issue?

We use limits.resources.cpu with an approximate number of CPUs, but it is neither accurate nor very transparent.
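
For illustration, a minimal sketch of that workaround as a v1alpha5 Provisioner (the instance families come from above; the provisioner name and the exact limit values are illustrative assumptions):

  # Illustrative only: cap the provisioner by aggregate resources, since a node
  # count cannot be expressed today. The limits approximate "6 nodes" of
  # 8-vCPU/32Gi instances, but are inexact across mixed families.
  apiVersion: karpenter.sh/v1alpha5
  kind: Provisioner
  metadata:
    name: batch-spot
  spec:
    requirements:
    - key: karpenter.sh/capacity-type
      operator: In
      values: ["spot"]
    - key: karpenter.k8s.aws/instance-family
      operator: In
      values: ["m4", "m5", "m5a", "r5", "r5a"]
    limits:
      resources:
        cpu: "48"
        memory: 192Gi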

Additional Context

No response

Attachments

No response

ellistarn commented 1 year ago

Can you explain the use case of why you care about the number of nodes? It's a tricky metric, since nodes are vastly different sizes. You could potentially pay way more with a small number of massive nodes.

sergkondr commented 1 year ago

> Can you explain the use case of why you care about the number of nodes? It's a tricky metric, since nodes are vastly different sizes. You could potentially pay way more with a small number of massive nodes.

I've been thinking about it for a while, and it looks like you are right. The only reason to think in terms of nodes is the habit of thinking about servers in data centers or on-prem environments.

But anyway, I think it is a nice-to-have feature, maybe someone will implement it in the future.

cest-pas-faux commented 1 year ago

I have a use case for this: if I buy a specific number of reserved instances and want a provisioner restricted to the related instance type, writing the number of nodes directly would be easier to read and write than deriving it from the CPU amount multiplied by the node count.

That's more of a nice-to-have than a really fundamental feature.

ellistarn commented 1 year ago

I'm curious -- is there a reason you don't use an EKS managed node group for your RI? If you're already paying for the instances, is there a reason to not have them online and ready to go?

cest-pas-faux commented 1 year ago

Sorry for the phrasing; we went with Savings Plans instead a few weeks back, but at the time we were considering RIs, and I thought this would be an easy way to write limits.nodes: X directly in the provisioner.

We are a big company with multiple accounts, so if an RI is not used in one account, another account benefits from the cost reduction; having all of them up is not that important.

runningman84 commented 1 year ago

One use case for limiting the number of nodes could be licensing: maybe you only paid for a maximum number of nodes running some specific agent, for example for monitoring.

sidewinder12s commented 1 year ago

Also, if you have large-scale or complicated IP addressing, like custom networking with secondary subnets, you may want to limit the node count per host/primary AZ to ensure you always have IP addresses available.

gazal-k commented 1 year ago

Just my 2 cents: considering Karpenter can be used to provision a wide range of instance types with various sizes and resource ratios, I think specifying a number of nodes could be somewhat counter-intuitive. We of course used to specify min and max node counts with node groups / ASGs, where it made sense. To leverage RIs or capacity reservations, would adding comments to indicate the number of nodes help?

  requirements:
  - key: karpenter.sh/capacity-type
    operator: In
    values:
    - on-demand
  - key: node.kubernetes.io/instance-type
    operator: In
    values:
    - m6i.2xlarge

  limits:
    resources:
      cpu: "80" # 10 instances * 8 vCPU
      memory: 320Gi # 10 instances * 32Gi

sidewinder12s commented 1 year ago

Combining restricted requirements + a node count limit might be easy enough to manage.

At scale, using comments to denote instance sizes/classes/resource math breaks down quickly, goes out of date immediately, and is generally a pain to maintain. Also, many of the reasons for wanting to restrict by node count have nothing to do with resourcing and everything to do with physical node count and/or IP/networking limits, which are disconnected from CPU or memory sizing.

gazal-k commented 1 year ago

Ah, yes. Apologies, I hadn't noticed your earlier comment about IP addressing. I concede that it's not a problem that's addressed by existing resource requirements.

github-actions[bot] commented 1 year ago

Labeled for closure due to inactivity in 10 days.

cest-pas-faux commented 1 year ago

Up

jonathan-innis commented 1 year ago

> you may want to limit node count per host/primary AZ to ensure you always have IP addresses available

@sidewinder12s How would you achieve this if each instance type can have a different number of ENIs and a different number of IPs would be allocated for each?

sidewinder12s commented 1 year ago

At least in our case, our issues were in a large batch environment where pod density per node was not too bad, so we could generally allocate X IPs per node (vs. lots of small pods, where we might hit the per-ENI IP assignment limits).

We're also using custom networking settings with the aws-vpc-cni to control IP usage, though this increases API calls against the AWS EC2 APIs.
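
For reference, a rough sketch of the aws-vpc-cni custom networking being referred to (the subnet and security group IDs are placeholders, not values from this thread):

  # Custom networking: pods draw their IPs from a secondary subnet defined per AZ.
  # Requires, on the aws-node DaemonSet:
  #   AWS_VPC_K8S_CNI_CUSTOM_NETWORK_CFG=true
  #   ENI_CONFIG_LABEL_DEF=topology.kubernetes.io/zone
  apiVersion: crd.k8s.amazonaws.com/v1alpha1
  kind: ENIConfig
  metadata:
    name: us-east-1a                    # one ENIConfig per availability zone
  spec:
    subnet: subnet-0abc1234def567890    # placeholder secondary (pod) subnet
    securityGroups:
    - sg-0abc1234def567890              # placeholder security group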

zmpeg commented 1 year ago

From a licensing standpoint, this would be an extremely useful feature. For example: an org purchases 3 licenses; it would be nice if Karpenter could scale the cluster by replacing a smaller node with a larger one when workloads are added, keeping to the 3-license limit. In our case the cost of the licenses heavily outweighs the cost of the nodes.

yr8sdk commented 9 months ago

A max-nodes limit could benefit total cluster resource utilization. Ideally, I would like to schedule most of my pods on the smallest number of nodes (depending on HA requirements, of course) to have simple control over the pods' memory limits, and since there is no CPU limit anyway, more pods could use the unutilized CPU.
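
(For context, packing pods onto fewer nodes is roughly what consolidation already targets in the v1alpha5 Provisioner API; a minimal sketch, with the provisioner name as an illustrative assumption. The node-count cap requested in this issue would be a separate limit.)

  apiVersion: karpenter.sh/v1alpha5
  kind: Provisioner
  metadata:
    name: default
  spec:
    consolidation:
      enabled: true   # let Karpenter replace/remove nodes to pack pods more densely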

k8s-triage-robot commented 6 months ago

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

- Mark this issue as fresh with /remove-lifecycle stale
- Close this issue with /close
- Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

cest-pas-faux commented 6 months ago

/remove-lifecycle stale

cest-pas-faux commented 6 months ago

Better explanation here: https://github.com/kubernetes-sigs/karpenter/issues/745

myaser commented 6 months ago

We have a similar need (but as a global limit), which is to limit the maximum number of nodes per cluster. In our setup we always assign only a single IP to each node, and we have a limited pool of IPs, so the most straightforward way to do this is a global limit on the number of nodes, similar to the --max-nodes-total flag on cluster-autoscaler.

This was also explained here: https://github.com/aws/karpenter-provider-aws/issues/4462
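
For comparison, that cluster-autoscaler flag is passed on its deployment; a sketch with an illustrative image tag and value:

  # cluster-autoscaler container spec (illustrative values)
  containers:
  - name: cluster-autoscaler
    image: registry.k8s.io/autoscaling/cluster-autoscaler:v1.28.2
    command:
    - ./cluster-autoscaler
    - --cloud-provider=aws
    - --max-nodes-total=50   # hard cap on total nodes across the whole cluster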

jukie commented 6 months ago

I've started on adding a global limit with #1151 but still need to test.

Bryce-Soghigian commented 6 months ago

/lifecycle frozen

Bryce-Soghigian commented 6 months ago

/assign @jukie

jukie commented 6 months ago

RFC here: https://github.com/kubernetes-sigs/karpenter/pull/1160

benben commented 1 month ago

Similar use case here: a limited set of preallocated IPs, and I want to make sure we never provision more instances than we have IPs. These IPs are static in the sense that they are communicated to customers upfront so they can allow our service through their firewalls.

I'd expect Karpenter to honor this limit, and if more pods come in, it would consolidate onto larger instances to keep the node count at the limit.