Kausheel opened this issue 3 years ago
Another use case is to define cluster-level default constraints for PodTopologySpread in the scheduler, per the docs: https://kubernetes.io/docs/concepts/workloads/pods/pod-topology-spread-constraints/#cluster-level-default-constraints
AWS should make this the default behaviour in EKS clusters.
```yaml
apiVersion: kubescheduler.config.k8s.io/v1beta1
kind: KubeSchedulerConfiguration
profiles:
  - pluginConfig:
      - name: PodTopologySpread
        args:
          defaultConstraints:
            - maxSkew: 1
              topologyKey: topology.kubernetes.io/zone
              whenUnsatisfiable: ScheduleAnyway
          defaultingType: List
```
I would love to use this to enable bin packing, as explained here: https://kubernetes.io/docs/concepts/scheduling-eviction/resource-bin-packing/
Upvote.
Trying to use EKS and achieve bin packing is hard without changing the scheduler behavior to favor MostAllocated.
Note that this feature is supported to some extent in Azure, and the Scheduler Scoring Strategy: MostAllocated use case is supported in GKE via the autoscaling profile (note this is an assumption on my part; GKE does not explicitly document what this setting does under the hood). Adding this ability would help EKS users gain parity in that sense.
I would be fine with having a setting like GKE has, this would solve my use case. It probably does not solve every use case out there, but I can understand if the AWS EKS team feels reluctant to allow changing the whole configuration.
Imagine this: if this feature were opened up to all EKS users, it would save a lot of time for them. Assume it takes one week per person to work around this via a custom kube-scheduler; if 1000 users need this, that is 7000 person-days, nearly two decades of one person's working time.
With Kubernetes v1.24, the DefaultPodTopologySpread feature graduated to GA (https://github.com/kubernetes/kubernetes/pull/108278). Without this we have no way to use (or configure) it on EKS clusters.
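For reference, the upstream docs describe the GA built-in defaults as equivalent to the following profile (sketched here against the v1beta3 config API that v1.24 uses; being able to override these is exactly what EKS does not expose):

```yaml
apiVersion: kubescheduler.config.k8s.io/v1beta3
kind: KubeSchedulerConfiguration
profiles:
  - schedulerName: default-scheduler
    pluginConfig:
      - name: PodTopologySpread
        args:
          # Equivalent of the built-in system defaults, per the upstream docs
          defaultConstraints:
            - maxSkew: 3
              topologyKey: kubernetes.io/hostname
              whenUnsatisfiable: ScheduleAnyway
            - maxSkew: 5
              topologyKey: topology.kubernetes.io/zone
              whenUnsatisfiable: ScheduleAnyway
          defaultingType: List
```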
Same here. We need this feature to enable resource bin packing for cost saving https://kubernetes.io/docs/concepts/scheduling-eviction/resource-bin-packing/
@AnhQKatalon, run the scheduler yourself with the needed settings, then patch pods to use that scheduler, with Kyverno for example :) Could be done in a couple of hours.
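For example, a minimal Kyverno mutation sketch (the scheduler name binpack-scheduler is a placeholder for whatever your secondary scheduler's profile is called; you would typically also exclude system namespaces):

```yaml
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: use-binpack-scheduler
spec:
  rules:
    - name: set-scheduler-name
      match:
        any:
          - resources:
              kinds:
                - Pod
      exclude:
        any:
          - resources:
              namespaces:
                - kube-system
      mutate:
        patchStrategicMerge:
          spec:
            # Placeholder; must match the schedulerName in your scheduler's profile
            schedulerName: binpack-scheduler
```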
Yeah, I am doing the workaround this way. Appreciate your help. But it would be great if EKS supported changing the scheduler configuration officially.
As others mentioned, this is required to set default pod topology constraints on the cluster, as per: https://kubernetes.io/docs/concepts/scheduling-eviction/topology-spread-constraints/#cluster-level-default-constraints. There would be other use cases, I am sure of it.
There are workarounds, of course, but this seems like a core thing to do in order to make the life of EKS users easier. I think this is a MUST.
This would be very helpful, for the same reasons mentioned by others above: configuring NodeResourcesFit.
The suggestion of rolling your own scheduler is not appealing because EKS might have bolted on their own tweaks/modifications to get the scheduler to work right in AWS, and then we'd lose all of that. And then there's maintaining it. I get that modifying the EKS-blessed set of configuration can lead to instability, but if I want to modify just a few settings I should be allowed to do that, with the understanding that it could break scheduling on my cluster. Upstream Kubernetes allows it, and it's useful.
If it's not possible to add customization to kube-scheduler, can we think about this feature like GKE: node groups would have an option to scale with a MostAllocated-like strategy, the way GKE's autoscaling profile optimize-utilization does?
@subhranil05 This is not an alternative solution. Scaling node groups can only achieve bin-packing during scale-up events. Kube Scheduler customization is necessary for in-place, proactive bin-packing.
Can somebody take a look and consider adding this issue to the kanban board? It seems the demand is still valid in 2023, as the issue has been active for more than 2 years. Of course we can self-manage an additional kube-scheduler, but it's counterintuitive to subscribe to an AWS-managed EKS control plane and then run self-managed control plane components (an additional kube-scheduler).
CC @tabern @mikestef9
This would be very useful for my EKS clusters. I want to be able to set sensible defaults without having to run my own scheduler.
I would love to see this as well, to support bin packing at scheduling time.
Do it for the environment, folks!
I want to use bin packing with Karpenter for job workloads, so that Karpenter can scale down empty nodes after a scale-up. Instead of spreading the pods across many nearly empty nodes, they should be packed onto a few full nodes, so Karpenter can remove each empty node once the last job running on it completes.
Assuming AWS may not prioritize this for a while at the current rate, I think an example deployment of a custom scheduler with MostAllocated enabled for bin packing would benefit everyone here (as suggested in https://github.com/aws/containers-roadmap/issues/1468#issuecomment-1645021158), despite the burden it puts on 1) cluster admins, to maintain control plane infra in step with EKS versions, and 2) Pod creators, to ensure the custom scheduler is used. Kyverno, Gatekeeper, or custom webhooks could potentially help with the latter.
https://kubernetes.io/docs/tasks/extend-kubernetes/configure-multiple-schedulers/ is a starting point, but if anyone has manifest samples that have been tested for a binpack configuration everyone wants, that'd be appreciated. If I get to this at some point, I will share.
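As a rough starting point, here is an untested sketch of such a setup: a secondary scheduler running the stock kube-scheduler image with a MostAllocated profile. All names are placeholders, the image tag must match your cluster's minor version, and the RBAC bindings for the ServiceAccount (to the system:kube-scheduler and system:volume-scheduler ClusterRoles) are omitted for brevity.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: binpack-scheduler-config
  namespace: kube-system
data:
  config.yaml: |
    # Use kubescheduler.config.k8s.io/v1beta3 on clusters older than v1.25
    apiVersion: kubescheduler.config.k8s.io/v1
    kind: KubeSchedulerConfiguration
    leaderElection:
      leaderElect: false  # single replica, no leader election needed
    profiles:
      - schedulerName: binpack-scheduler
        pluginConfig:
          - name: NodeResourcesFit
            args:
              scoringStrategy:
                type: MostAllocated
                resources:
                  - name: cpu
                    weight: 1
                  - name: memory
                    weight: 1
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: binpack-scheduler
  namespace: kube-system
spec:
  replicas: 1
  selector:
    matchLabels:
      app: binpack-scheduler
  template:
    metadata:
      labels:
        app: binpack-scheduler
    spec:
      # ServiceAccount needs the standard scheduler RBAC bindings (omitted here)
      serviceAccountName: binpack-scheduler
      containers:
        - name: kube-scheduler
          # Placeholder tag; keep in step with your control plane version
          image: registry.k8s.io/kube-scheduler:v1.28.0
          command:
            - kube-scheduler
            - --config=/etc/kubernetes/binpack-scheduler/config.yaml
          volumeMounts:
            - name: config
              mountPath: /etc/kubernetes/binpack-scheduler
              readOnly: true
      volumes:
        - name: config
          configMap:
            name: binpack-scheduler-config
```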
In some clusters, I've seen something like this provided:
```yaml
apiVersion: kubescheduler.config.k8s.io/v1beta2
kind: KubeSchedulerConfiguration
clientConnection:
  kubeconfig: /var/lib/kube-scheduler/kubeconfig
profiles:
  - schedulerName: default-scheduler
    pluginConfig:
      - args:
          scoringStrategy:
            type: MostAllocated
        name: NodeResourcesFit
    plugins:
      score:
        disabled:
          - name: "NodeResourcesBalancedAllocation"
        enabled:
          - name: "NodeResourcesFit"
            weight: 5
```
We ran into this same issue and had to set up a custom scheduler to implement bin-packing. It's the same kube-scheduler image with a MostAllocated scoring policy, as suggested above. The blog has more details about how we dealt with overprovisioning, system workloads, and the rollout to all pods; this section has the specific scheduler config.
We were able to achieve this in GCP by using the optimize-utilization setting in GKE, but for Azure AKS, we still have to use this secondary scheduler with a custom scoring policy.
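Worth noting for anyone following this route: pods only use a secondary scheduler when the pod spec names it explicitly, along these lines (the scheduler name is a placeholder that must match your custom profile):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: example-binpacked-pod
spec:
  # Placeholder; must match the schedulerName in the custom scheduler's profile
  schedulerName: binpack-scheduler
  containers:
    - name: app
      image: nginx  # any workload image
```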
How is this API not supported yet? Is there any plan to support it soon? It's part of the standard Kubernetes service, but there's no way to use it on EKS? This really doesn't make EKS very usable in our case. All of the major packages assume the standard APIs are available.
Same as @MattLJoslin said... we really need it as well
I think being able to run the scheduler in MostAllocated mode would make the Karpenter use case even more compelling.
Any updates on this?
+1 for being able to add a pluginConfig for PodTopologySpread as well as a MostAllocated scoring policy.
Do it for the environment, folks!
And this!
Hint in the meantime: you can use the following AWS managed image to provision the scheduler yourself, without the need to self-manage the image: https://gallery.ecr.aws/eks-distro/kubernetes/kube-scheduler
@woehrl01 That's a viable workaround but then users have to manage the scheduler component themselves as well as update every workload to target it in the pod spec. As a managed Kubernetes service it'd be ideal if these options were exposed as configuration to the user instead.
@jukie I'm not arguing against the point that this would be a nice addition. I just wanted to mention a solution which won't require you to wait more than 3 years, and to share it with newcomers who may be having trouble self-maintaining the image, etc.
Totally agree, and thanks for sharing!
Tell us about your request
What do you want us to build?
It would be great if EKS allowed users to configure the Kube Scheduler parameters. This is a Control Plane component, so users don't have access to this by default. Exposing the Kube Scheduler configuration either via AWS APIs or via the KubeSchedulerConfiguration resource type would be a significant advantage for EKS users.
Which service(s) is this request for?
EKS
Tell us about the problem you're trying to solve. What are you trying to do, and why is it hard?
Use cases for this might include switching from equal Pod distribution to a binpacking approach, which optimizes cost effectiveness. There are many other Scheduler parameters which users might want to tweak themselves.
Are you currently working around this issue?
Implementing custom Kube Schedulers. This is not ideal, since it requires operational overhead in maintaining and updating the custom Kube Scheduler. It may also require using tools like OPA to insert custom schedulerName fields into the target Pods, which is yet another burden on the user.
Thanks!