kubewharf / godel-scheduler

a unified scheduler for online and offline tasks

Why another scheduler? What is the purpose of this project? #26

Open Cdayz opened 8 months ago

Cdayz commented 8 months ago

Yeah, my question is pretty simple: why did you start building a new k8s scheduler?

There are many different production-grade solutions, such as Volcano and YuniKorn.

They offer the same functionality, plus many extra features that are already built in.

What is the purpose of this project?

NickrenREN commented 8 months ago

@Cdayz The main goal of this project is to provide a unified scheduler for online and offline workloads, making it easier to colocate them and to improve resource utilization and elasticity.

Volcano is an offline scheduler, and YuniKorn only provides a Kubernetes adaptor.

At ByteDance, the cluster scale is very large (20k nodes and 1,000k pods in a single cluster) and the business scenarios are complex, so it is difficult to use an existing scheduler directly, and the development effort required to build on top of one is not acceptable.

Cdayz commented 8 months ago

@NickrenREN, can you please describe the difference between online and offline workloads? For example, what do you mean by those terms?

I am asking because, according to my understanding, an online scheduler is a scheduler that does not know when a new task arrives or when an already running task will finish. Volcano can be used in that kind of environment, as far as I know.

Maybe I am wrong, and if so, I apologize for wasting your time. However, I think it is important to clarify the goals of this project and possibly write a decision record with the pros and cons of other solutions, along with an explanation of why this one is necessary.

NickrenREN commented 8 months ago

@Cdayz Generally speaking, online workloads are SLA-bound, latency-sensitive workloads such as microservices and RPC services, while offline workloads are mostly throughput-oriented and care more about job completion time, such as Hadoop batch jobs and ML training tasks.

Because they care about different metrics, their scheduling requirements are different. For example, Hadoop batch jobs need high scheduling throughput (1k pods per second in our production environment), while ML training tasks need gang scheduling and job-level affinity (all tasks in one job must be scheduled into one ToR switch or one network segment; any ToR or segment is fine, but the job cannot cross segments), plus some other complex features.
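
To make the gang / job-level affinity requirement concrete, here is a minimal Go sketch of the intended semantics only. It is not Godel's actual API: `Node`, `gangPlace`, and the free-slot capacity model are hypothetical. The point it illustrates is the all-or-nothing, single-segment placement described above.

```go
package main

import "fmt"

// Node describes a schedulable node and the ToR switch / network
// segment it belongs to. Fields are illustrative, not Godel's types.
type Node struct {
	Name      string
	Segment   string // ToR / network segment identifier
	FreeSlots int    // simplified capacity: how many tasks still fit
}

// gangPlace tries to place all tasks of a job into a single network
// segment (gang + job-level affinity): either every task fits in one
// segment, or nothing is placed at all.
func gangPlace(nodes []Node, jobTasks int) (segment string, ok bool) {
	// Sum free capacity per segment.
	capacity := map[string]int{}
	for _, n := range nodes {
		capacity[n.Segment] += n.FreeSlots
	}
	// Any segment that can hold the whole job is acceptable;
	// crossing segments is not allowed.
	for seg, free := range capacity {
		if free >= jobTasks {
			return seg, true
		}
	}
	// No single segment has room: schedule nothing (all-or-nothing).
	return "", false
}

func main() {
	nodes := []Node{
		{Name: "n1", Segment: "tor-a", FreeSlots: 4},
		{Name: "n2", Segment: "tor-a", FreeSlots: 4},
		{Name: "n3", Segment: "tor-b", FreeSlots: 16},
	}
	if seg, ok := gangPlace(nodes, 12); ok {
		fmt.Printf("job placed in segment %s\n", seg)
	} else {
		fmt.Println("job stays pending (gang constraint not satisfiable)")
	}
}
```

In a real scheduler this kind of check runs before binding any pod of the job, so a partially scheduled job never strands resources while the rest of its tasks wait.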