camsas / firmament

The Firmament cluster scheduling platform
Apache License 2.0
415 stars 79 forks source link

How can Firmament support multi-dimensional resources scheduling ? #73

Closed NickrenREN closed 5 years ago

NickrenREN commented 5 years ago

Hi, guys, I met an understanding problem when i read the Firmament osdi paper. That is: examples listed there seem to show that the tasks(kubernetes pods) and the machines(M0,M1...) are 1:1 mapping. If there are more tasks, they will get unscheduled. (Please correct me if i am wrong).

If so how can we handle the actual situtation that one machine can have multiple tasks running on it ? Should we change the capacities of the arcs from machine nodes(Mm) to sink node so that more than one flow can pass the arcs ? Or should we split the machine nodes into some virtual nodes that one virtual node can only have one task in it ?

And also, if our tasks request multi-dimensional resources, how can we set the capacity of the arcs from machine nodes to sink node or how can we split the machine nodes ? In other words: how can we consider the overall resource requirements and set the capacities and costs of the arcs ?

Hope to receive feedbacks from you guys. @ms705 @ICGog Thanks~

gabrielecastellano commented 3 years ago

Hello, I have some questions similar to the ones of this thread, and it is unfortunate to see that it has been closed without any explaination.

Particularly, I wonder how firmament supports multi-dimensional resources (e.g., cpu and memory in Kubernetes) if we consider them as hard constraints for tasks to be scheduled.

Even for the simpler case when just cpu is considered, it is not clear to me how the cpu load of a task (converted in a flow, I suppose) can be forced to entirely flow towards a compute node, instead of being possibly split by the solver across multiple arcs (e.g., half flows to an actual node, half flows to the unscheduled aggregator). Doesn't preventing this require some additional constraints that are outside of the min-cost flow formulation? If this can instead be achieved by formulating the policy properly, I kindly ask you to point me to such policy formulation.

Thanks guys for this amazing work and for your patience in addressing my questions Gabriele

ms705 commented 3 years ago

Each task always creates one unit of atomic (indivisible) flow; this is necessary for the min-cost max-flow formulation of the scheduling problem to work.

Multiple resource dimensions are handled by creating and removing arcs in the graph in reaction to changing available resources on machines. See e.g., §7.3.1 (p165) here.

gabrielecastellano commented 3 years ago

Hello Malte, Thanks for your quick response.

I will move to study the multi-resource formulation only after I am sure I correctly understood what concerns the simpler case with only one-dimensional constraint.

Since each task always creates one unit of flow, therefore Firmament assumes that all the tasks are equal in terms of resource consumption. Correct? Let suppose we have three tasks: T1 requires 2 CPU T2 requires 1 CPU T3 requires 1 CPU And one machine M1 featuring 4 CPU. When modeling the problem, if I understood correctly, I have to decide how much CPU the unit of atomic flow stands for. To avoid T1 to be split, I should consider the unit = 2 CPU. Therefore the min-cost flow problem will feature Three source nodes each with 1 unit, as you said, and 1 Machine node with capacity = 2. As a result, the solver will not be able to schedule T1, T2, and T3 on M1. (only two of them fit the flow formulation).

This is equivalent to consider the capacity as "number of supported tasks", rather than a more "granular" cpu value. Is this understanding correct?

Thanks