koordinator-sh / koordinator

A QoS-based scheduling system brings optimal layout and status to workloads such as microservices, web services, big data jobs, AI jobs, etc.
https://koordinator.sh
Apache License 2.0

[proposal] koord-descheduler: Optimize the reassessment of single-node resource left #2119

Open. zwForrest opened this issue 3 weeks ago

zwForrest commented 3 weeks ago

What is your proposal: Currently, when a pod is evicted for rescheduling, the check only considers whether the total resources remaining across the cluster are sufficient for that pod. It does not consider whether any single node has the resources the pod requires.
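To illustrate the gap, here is a minimal Go sketch that contrasts a cluster-wide total check with a per-node check. It is not koord-descheduler code: the `Resources` and `Node` types and the functions `clusterTotalFits` and `nodesThatFit` are hypothetical names introduced only for this example, and the resource model is reduced to CPU and memory.

```go
package main

import "fmt"

// Resources is a hypothetical, simplified resource vector (not a Koordinator type).
type Resources struct {
	MilliCPU  int64
	MemoryMiB int64
}

// Node is a hypothetical view of one node's free resources
// (allocatable minus what is already requested).
type Node struct {
	Name string
	Free Resources
}

// fits reports whether req fits into free on every dimension.
func fits(req, free Resources) bool {
	return req.MilliCPU <= free.MilliCPU && req.MemoryMiB <= free.MemoryMiB
}

// clusterTotalFits sums the free resources of all nodes and compares
// the pod request against that total. This mirrors the cluster-wide
// check described in the proposal.
func clusterTotalFits(req Resources, nodes []Node) bool {
	var total Resources
	for _, n := range nodes {
		total.MilliCPU += n.Free.MilliCPU
		total.MemoryMiB += n.Free.MemoryMiB
	}
	return fits(req, total)
}

// nodesThatFit returns the names of nodes, other than the source node,
// whose own free resources can hold the pod request.
func nodesThatFit(req Resources, source string, nodes []Node) []string {
	var names []string
	for _, n := range nodes {
		if n.Name != source && fits(req, n.Free) {
			names = append(names, n.Name)
		}
	}
	return names
}

func main() {
	nodes := []Node{
		{Name: "node-a", Free: Resources{MilliCPU: 1000, MemoryMiB: 2048}},
		{Name: "node-b", Free: Resources{MilliCPU: 1500, MemoryMiB: 2048}},
		{Name: "node-c", Free: Resources{MilliCPU: 1500, MemoryMiB: 1024}},
	}
	pod := Resources{MilliCPU: 2000, MemoryMiB: 1024}

	// The cluster has 4000m CPU free in total, so the total check passes.
	fmt.Println(clusterTotalFits(pod, nodes)) // prints "true"
	// No single node has 2000m CPU free, so no node can host the pod.
	fmt.Println(len(nodesThatFit(pod, "node-a", nodes))) // prints "0"
}
```

In this example the total check would allow the eviction even though no individual node can take the pod, which is the situation the proposal wants to rule out.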

Why is this needed: To prevent other nodes from becoming hotspots after pods are rescheduled onto them.

Is there a suggested solution, if so, please add it:

songtao98 commented 3 weeks ago

Nice proposal! koord-descheduler reuses the NodeFit pre-check logic of the Kubernetes Descheduler's DefaultEvictor. That check is not sufficient for every real-world case. Other possible improvements to NodeFit include taking TopologySpreadConstraints into account, among others. We'll continue to improve this.