kubernetes-sigs / karpenter

Karpenter is a Kubernetes Node Autoscaler built for flexibility, performance, and simplicity.
Apache License 2.0

Mutating web hook for identifying arch / platform supported by container images of pod #1470

Open msvticket opened 1 month ago

msvticket commented 1 month ago

Description

What problem are you trying to solve?

Easily run pods on the architectures / platforms supported by the images of their containers.

Many applications support running on amd64 and arm64, but not all. So if you want to support several architectures / platforms in your cluster, you need to decide how to handle this.

For example, you could configure Karpenter to add a taint to arm64 nodes, with a matching toleration added to the pods that support arm64, or you could add node affinities for amd64 to pods that don't support arm64. In either case there is manual work involved.
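As a rough illustration of the two manual options described here, the sketch below applies each one to a pod spec held as a plain dictionary. The helper names are hypothetical, and reusing `kubernetes.io/arch` as the taint key is an assumption; any taint key configured on the arm64 nodes would work the same way.

```python
import copy

ARCH_LABEL = "kubernetes.io/arch"  # well-known node label for the CPU architecture


def tolerate_arm64(pod_spec, taint_key=ARCH_LABEL):
    """Option 1: let a multi-arch pod tolerate a taint placed on arm64 nodes.

    Assumes arm64 nodes carry the taint `<taint_key>=arm64:NoSchedule`.
    """
    spec = copy.deepcopy(pod_spec)
    spec.setdefault("tolerations", []).append(
        {"key": taint_key, "operator": "Equal", "value": "arm64", "effect": "NoSchedule"}
    )
    return spec


def require_amd64(pod_spec):
    """Option 2: pin an amd64-only pod to amd64 nodes with a required node affinity."""
    spec = copy.deepcopy(pod_spec)
    required = (
        spec.setdefault("affinity", {})
        .setdefault("nodeAffinity", {})
        .setdefault("requiredDuringSchedulingIgnoredDuringExecution", {})
    )
    terms = required.setdefault("nodeSelectorTerms", [])
    if not terms:
        terms.append({})
    # nodeSelectorTerms are ORed, so the requirement is added to every term
    # (expressions inside one term are ANDed) instead of appending a new term.
    for term in terms:
        term.setdefault("matchExpressions", []).append(
            {"key": ARCH_LABEL, "operator": "In", "values": ["amd64"]}
        )
    return spec
```

Either helper has to be applied per workload, which is the manual step the proposal wants to remove.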

This could be solved by implementing a mutating admission webhook that looks up which architectures / platforms are supported by the images of the pod's containers and then modifies tolerations and / or affinity accordingly.
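A minimal sketch of what the core of such a webhook might do, assuming the affinity variant. The registry lookup is replaced by a hypothetical `image_archs` mapping (a real webhook would have to inspect each image's manifest list), and all function names are illustrative, not part of Karpenter.

```python
import base64
import json

ARCH_LABEL = "kubernetes.io/arch"


def common_architectures(pod_spec, image_archs):
    """Return the architectures supported by every container image in the pod.

    `image_archs` maps image reference -> set of architectures and stands in
    for a registry lookup (hypothetical). Unknown images yield an empty set,
    which callers treat as "do not mutate".
    """
    containers = pod_spec.get("initContainers", []) + pod_spec.get("containers", [])
    arch_sets = []
    for container in containers:
        archs = image_archs.get(container["image"])
        if archs is None:
            return set()
        arch_sets.append(set(archs))
    return set.intersection(*arch_sets) if arch_sets else set()


def build_patch(pod_spec, archs):
    """Build a JSON Patch restricting the pod to `archs` via required node affinity.

    For brevity this only handles pods without an existing `affinity` field;
    merging with existing affinity is left out of the sketch.
    """
    if not archs or "affinity" in pod_spec:
        return []
    expression = {"key": ARCH_LABEL, "operator": "In", "values": sorted(archs)}
    affinity = {
        "nodeAffinity": {
            "requiredDuringSchedulingIgnoredDuringExecution": {
                "nodeSelectorTerms": [{"matchExpressions": [expression]}]
            }
        }
    }
    return [{"op": "add", "path": "/spec/affinity", "value": affinity}]


def admission_response(uid, patch):
    """Wrap the patch in an admission.k8s.io/v1 AdmissionReview response."""
    response = {"uid": uid, "allowed": True}
    if patch:
        response["patchType"] = "JSONPatch"
        response["patch"] = base64.b64encode(json.dumps(patch).encode()).decode()
    return {"apiVersion": "admission.k8s.io/v1", "kind": "AdmissionReview", "response": response}
```

The pod is always admitted; when the architectures cannot be determined, the sketch simply returns no patch so that scheduling behaves as it does today.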

The reason this functionality would fit in Karpenter is that the configuration needed for such a webhook is already known to Karpenter: the taints and labels of the nodes that the pod should run on. So configuration could be as easy as enabling the functionality in the Helm chart. Following the principle of least surprise, I suspect this isn't something that should be enabled by default.

How important is this feature to you?

k8s-ci-robot commented 1 month ago

This issue is currently awaiting triage.

If Karpenter contributors determine this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes-sigs/prow](https://github.com/kubernetes-sigs/prow/issues/new?title=Prow%20issue:) repository.

jwcesign commented 1 month ago

Totally agree. We can do so many things in a webhook, like controlling the minimum number of non-spot replicas of a deployment.

jwcesign commented 1 month ago

I am willing to implement a webhook to handle everything like this.

njtran commented 1 month ago

Hey, sorry for the delay. I'm not sure if this is something we want to orchestrate in Karpenter. I can imagine the only case where this should really come up is in a NodePool that's flexible to multiple architectures. It's a very common pattern to prescribe the arch in your pod's node requirements and match that to the architecture that's spun up when a NodeClaim is created. This will automatically partition the pods to only run on the NodeClaims with the matching architecture. If your application is multi-arch, then you likely don't care which NodeClaim it runs on either, which means that a taint/toleration setup would require more work from you: you'd have to add tolerations just to get the default flexible behavior.

So if you want to support several architectures / platforms in your cluster, you need to decide how to handle this.

I'd like to understand how orchestrating tolerations on your pods is easier to reason about than using node selectors. You can even use node selectors on a NodePool that might be specific to one arch, knowing that the pod is only compatible with that one NodePool.
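To make the node selector pattern concrete, here is a small sketch of how a `kubernetes.io/arch` selector partitions pods across nodes. It is a deliberate simplification of scheduling (labels only, no taints or resources), and the node names and helper functions are hypothetical.

```python
ARCH_LABEL = "kubernetes.io/arch"


def matches_node_selector(pod_spec, node_labels):
    """True if every nodeSelector entry on the pod equals the node's label value."""
    selector = pod_spec.get("nodeSelector", {})
    return all(node_labels.get(key) == value for key, value in selector.items())


def eligible_nodes(pod_spec, nodes):
    """Names of the nodes whose labels satisfy the pod's nodeSelector."""
    return sorted(
        name for name, labels in nodes.items() if matches_node_selector(pod_spec, labels)
    )


# Hypothetical nodes launched from a NodePool that allows both architectures.
nodes = {
    "node-amd": {ARCH_LABEL: "amd64"},
    "node-arm": {ARCH_LABEL: "arm64"},
}

# A pod whose images only support arm64 prescribes the arch explicitly.
arm_only_pod = {"nodeSelector": {ARCH_LABEL: "arm64"}}
# A multi-arch pod sets no selector and stays flexible by default.
multi_arch_pod = {}

print(eligible_nodes(arm_only_pod, nodes))
print(eligible_nodes(multi_arch_pod, nodes))
```

The pod without a selector remains schedulable on both nodes with no extra configuration, which is the default flexible behavior a taint-based setup would need tolerations to restore.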

sergii-auctane commented 3 weeks ago

control min non-spot replicas of one deployment

Node affinity weight is still available for that kind of control.
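One possible reading of this suggestion, sketched as a pod affinity fragment built in Python: a weighted preference for on-demand capacity using Karpenter's `karpenter.sh/capacity-type` label. The helper name and the default weight are assumptions, and a preference only biases scheduling; it does not guarantee a minimum number of non-spot replicas.

```python
CAPACITY_TYPE_LABEL = "karpenter.sh/capacity-type"


def prefer_on_demand(weight=50):
    """Return an affinity fragment that prefers, but does not require, on-demand nodes.

    Kubernetes accepts weights from 1 to 100 for preferred scheduling terms.
    """
    if not 1 <= weight <= 100:
        raise ValueError("weight must be between 1 and 100")
    return {
        "nodeAffinity": {
            "preferredDuringSchedulingIgnoredDuringExecution": [
                {
                    "weight": weight,
                    "preference": {
                        "matchExpressions": [
                            {
                                "key": CAPACITY_TYPE_LABEL,
                                "operator": "In",
                                "values": ["on-demand"],
                            }
                        ]
                    },
                }
            ]
        }
    }


# Attach the fragment to a pod template's spec (illustrative pod only).
pod_spec = {"containers": [{"name": "app", "image": "example/app:1.0"}]}
pod_spec["affinity"] = prefer_on_demand(weight=80)
```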