mikouaj opened this issue 4 months ago
In the case where we don't want to impact the scheduler, isn't boosting only the limit sufficient? Conversely, for a pod that requires a boost, updating the request will impact the scheduler, but that is what is expected.
To summarize, I'm not sure I understand the use case where you want to impact the scheduler but not scale nodes, or disable a boost if the pod cannot be scheduled.
Perhaps it would be great to add an option to increase only the limit by a percentage, and possibly another option to remove the limit during the boost.
@yyvess the use case we try to solve is as follows:

1) The POD resource requests are increased per the StartupCPUBoost config
2) The scheduler is not able to find a suitable node (no capacity) and the POD is unschedulable
3) (autoscaler path) The Cluster Autoscaler kicks in and provisions new nodes to accommodate the boosted PODs
4) (autoscaler path) The PODs are scheduled on the new nodes
5) (autoscaler path) The PODs' CPU requests are reverted back to their original values
6) (autoscaler path) After some time the Cluster Autoscaler considers the nodes underutilized (as the bigger CPU requests were reverted) and triggers a scale-in action
7) (autoscaler path) The PODs are evicted from the nodes and rescheduled somewhere else

We are then back at point 1), so this may even repeat in a loop.
With this feature we aim to address point 2): to give the user the possibility to decide whether CPU boosting can lead to unschedulable PODs.
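For illustration, this is what the request boost in point 1) looks like on a standard container resources stanza; the values and the 100% boost percentage are hypothetical:

```yaml
# Before the boost (hypothetical values)
resources:
  requests:
    cpu: 500m

# After a hypothetical 100% boost: the scheduler now has to find
# a node with 1000m of free allocatable CPU, which may not exist
resources:
  requests:
    cpu: 1000m
```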
@mikouaj I understand that point 2 can be an issue. As you explain, solving it isn't easy. In the meantime, to avoid this case, you could allow boosting only the limit value (without touching the request); that should not impact the scheduler and avoids point 2. But currently it is not possible to boost only the limit.
PS: It could also be interesting to allow removing the limit value during the boost, so the pod can use all of the node's CPU.
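For illustration, the two suggestions above on a standard container resources stanza could look like this; the values and the 100% boost percentage are hypothetical:

```yaml
# Original container resources (hypothetical values)
resources:
  requests:
    cpu: 500m
  limits:
    cpu: "1"

# Option A: boost only the limit; the request is untouched,
# so scheduling decisions are unaffected
resources:
  requests:
    cpu: 500m
  limits:
    cpu: "2"

# Option B: remove the limit during the boost; the pod may
# burst up to the node's available CPU
resources:
  requests:
    cpu: 500m
```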
@yyvess I like the idea of removing the limit value during the boost. It sounds obvious now but I have never thought about it before, many thanks! I will create a feature to introduce that possibility in a config-driven way.
For the resource requests, boosting them is needed to actually guarantee the resources - although it comes with all of the described challenges. Addressing this can be tough but I believe it is still doable.
Description
The capacity-aware boosting will make the CPU resource boost conditional: the mutating webhook would try to verify if the given POD, with boosted resources, would be schedulable on the cluster.
This feature requires simulating the scheduling algorithm, including node selection and resource checks. There is no API for this, and the real scheduling algorithm is complex, so some simplification that produces "good enough" results is needed.
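A minimal sketch of such a simplified check, assuming a hypothetical `fits_somewhere` helper: it ignores taints, affinity, and scoring entirely, and only verifies that at least one node has enough free allocatable CPU for the boosted request. All names and values are illustrative, not the project's actual implementation:

```python
# Simplified "good enough" schedulability check: pure resource-fit,
# no taints, affinity, or scoring simulation.
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    allocatable_cpu_m: int  # allocatable CPU in millicores
    requested_cpu_m: int    # sum of CPU requests of pods already on the node


def fits_somewhere(boosted_request_m: int, nodes: list[Node]) -> bool:
    """Return True if the boosted pod fits on at least one node."""
    return any(
        node.allocatable_cpu_m - node.requested_cpu_m >= boosted_request_m
        for node in nodes
    )


nodes = [
    Node("node-a", allocatable_cpu_m=2000, requested_cpu_m=1800),  # 200m free
    Node("node-b", allocatable_cpu_m=4000, requested_cpu_m=3200),  # 800m free
]

print(fits_somewhere(500, nodes))   # True  - fits on node-b
print(fits_somewhere(1000, nodes))  # False - boost would make the pod unschedulable
```

In the second case, the webhook could then decide (per user config) to skip or cap the boost instead of producing an unschedulable pod.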
References