Following the work on the Mid-tier resource (#1361) and node peak prediction (#1385), Koordinator can estimate the future peak usage of a node and of its pods in different priority classes. In some scenarios, we want to submit more pods than the node allocatable permits, where all the pods can be long-running and may share the same priority class. This requires a mechanism that supports over-commitment of Prod resources, built on the peak prediction capability.
The following work items are needed to support Prod overcommitment:
[ ] Define the API for Prod overcommitment.
[ ] Implement the resource overcommitment scaling mechanism.
[ ] Implement the resource calculation and updating in the koord-manager.
[ ] (optional) Enhance the scheduler with the peak prediction.
[ ] (optional) Improve the node prediction in koordlet with a more conservative estimation and less ledger lagging.