Open lucabe72 opened 6 years ago
I like the idea :+1:
AFAIU, currently if we have for example three 60% tasks, all with the same period and deadline, on a system with 2 CPUs, CBS will admit all of them. But this does not really guarantee that they meet their deadlines, for example if the three tasks are activated by the same timer. Isn't that so?
I've always wondered whether an approach where CBS is a per-CPU mechanism, and tasks are admitted and pinned to a given CPU when they are initially created, might not be better. I can see that this would likely result in a system where we can schedule a smaller number of DL tasks, but it would probably give better guarantees as well as (perhaps) slightly simpler wakeup code.
But I'm probably missing something...
AFAIU, currently if we have for example three 60% tasks, all with the same period and deadline, on a system with 2 CPUs, CBS will admit all of them. But this does not really guarantee that they meet their deadlines, for example if the three tasks are activated by the same timer. Isn't that so?
On SMP we can only guarantee bounded tardiness. And this seems to be such a case.
I've always wondered whether an approach where CBS is a per-CPU mechanism, and tasks are admitted and pinned to a given CPU when they are initially created, might not be better. I can see that this would likely result in a system where we can schedule a smaller number of DL tasks, but it would probably give better guarantees as well as (perhaps) slightly simpler wakeup code.
But I'm probably missing something...
CBS is already a per-CPU mechanism. On SMP, tasks are then migrated between per-CPU CBS "instances". So, if what Luca has in mind turns out to work, we could relax the affinity setting restriction and maybe perform better in certain situations.
Hi Patrick,
I see that Juri already replied, but I add my 2 cents... :)
On 22 February 2018 at 18:22, Patrick Bellasi notifications@github.com wrote:
I like the idea 👍
AFAIU, currently if we have for example three 60% tasks, all with the same period and deadline, on a system with 2 CPUs, CBS will admit all of them. But this does not really guarantee that they meet their deadlines, for example if the three tasks are activated by the same timer. Isn't that so?
Well, SCHED_DEADLINE by default uses the so-called "global EDF" scheduling algorithm, which allows tasks to migrate between CPUs / cores (and the CBS algorithm is used to assign deadlines to tasks so that EDF can be used). This algorithm has the property that if the total utilization is smaller than the number of CPUs, then each task is guaranteed to finish with a bounded delay with respect to its deadline (technically, there is a bound on the tasks' tardiness, or lateness). So, deadlines are not guaranteed to be respected, but you have the guarantee that they will not be missed by a large amount of time. The kernel admission control relies on this.
So, yes, your example taskset will miss some deadlines, but the kernel accepts it because the difference between finishing time and deadline is guaranteed to be bounded.
With my proposal, you could configure the kernel to do partitioned EDF (no automatic migrations), so the third task would be rejected but you would have the guarantee that all the deadlines of accepted tasks are respected.
I've always wondered whether an approach where CBS is a per-CPU mechanism, and tasks are admitted and pinned to a given CPU when they are initially created, might not be better.
I think there is no better or worse here... It is a matter of trade-offs: partitioned scheduling is more pessimistic, but guarantees that all deadlines will be respected; global scheduling does not provide such a strong guarantee by default (you need to run a more complex admission test in user space), but is able to accept more tasks... and in the average case it performs well. Moreover, partitioned scheduling might be more complex to use (you are responsible for assigning the tasks to the correct CPUs / cores... with global scheduling, this is automatic).
So, I think SCHED_DEADLINE should allow using both algorithms (and this is where my patch needs improvements).
I can see that this would likely result in a system where we can schedule a smaller number of DL tasks, but it would probably give better guarantees as well as (perhaps) slightly simpler wakeup code.
Yes, with partitioned scheduling the wakeup path (and scheduling, and the periodic tick) would be simplified, allowing us to reduce some scheduling latencies.
I think this is a feature we need... We just have to design it properly :)
Luca
Currently, partitioned EDF can be used only by disabling admission control or by creating one root domain per CPU (and the latter disables migrations for ALL tasks, not only deadline ones!).
We need some way to allow the use of partitioned scheduling without breaking other tasks' migrations or risking starvation (because admission control has been disabled). Basically, per-runqueue admission control, with the constraint that each deadline task must be affine to a single CPU.
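For reference, the root-domain-per-CPU workaround mentioned above is usually set up through exclusive cpusets with load balancing disabled at the root. A rough sketch for CPU 0 on a cgroup-v1 system (the mount point and CPU numbers are assumptions; adjust for your machine):

```shell
# Assumes a cgroup-v1 cpuset hierarchy; mount one if needed:
mkdir -p /dev/cpuset
mount -t cgroup -o cpuset cpuset /dev/cpuset
cd /dev/cpuset

# An exclusive cpuset containing only CPU 0 (repeat for other CPUs):
mkdir cpu0
echo 0 > cpu0/cpuset.cpus
echo 0 > cpu0/cpuset.mems

# Disabling load balancing at the root splits the single system-wide
# root domain along the (exclusive) child cpusets:
echo 1 > cpuset.cpu_exclusive
echo 0 > cpuset.sched_load_balance
echo 1 > cpu0/cpuset.cpu_exclusive

# Every task moved here is pinned to CPU 0, including non-DL ones,
# which is exactly the drawback noted above:
echo $$ > cpu0/tasks
```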