ICLDisco / parsec

PaRSEC is a generic framework for architecture-aware scheduling and management of micro-tasks on distributed, GPU-accelerated, many-core heterogeneous architectures. PaRSEC assigns computation threads to the cores and GPU accelerators, overlaps communications and computations, and uses a dynamic, fully-distributed scheduler based on architectural features such as NUMA nodes and algorithmic features such as data reuse.

Lock in LFQ scheduler with DAGUE_HOOK_RETURN_AGAIN #87

Closed. abouteiller closed this issue 7 years ago.

abouteiller commented 8 years ago

Original report by Reazul Hoque (Bitbucket: rhoque_icl).


To resolve anti-dependences in DTD, we thought of returning DAGUE_HOOK_RETURN_AGAIN from the prepare_input of a task that will write on a data whose reader count is > 0.
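To make the mechanism concrete, here is a minimal sketch of such a prepare_input hook, under the assumption that the data carries a reader count. The dtd_data_t / dtd_task_t types, the nb_readers field, and the macro values are illustrative placeholders, not the real DTD structures; DAGUE_HOOK_RETURN_AGAIN is the code named in this report, and DAGUE_HOOK_RETURN_DONE is assumed to be the normal-completion code.

```c
/* Illustrative placeholders only: not the real DTD structures and not
 * the real macro values, just enough to show the control flow. */
#define DAGUE_HOOK_RETURN_DONE   0   /* assumed normal-completion code  */
#define DAGUE_HOOK_RETURN_AGAIN  1   /* "push me back, try again later" */

typedef struct { int nb_readers; } dtd_data_t;   /* readers still pending  */
typedef struct { dtd_data_t *data; } dtd_task_t; /* task writing that data */

/* prepare_input of a WRITE task: while readers are still pending on the
 * data, ask the scheduler to come back to this task later. */
static int write_prepare_input(dtd_task_t *task)
{
    if (task->data->nb_readers > 0)
        return DAGUE_HOOK_RETURN_AGAIN;  /* anti-dependence not resolved */
    return DAGUE_HOOK_RETURN_DONE;       /* safe to execute the write    */
}
```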

The test generates a sequence of tasks (with the following operations on the same data), like below:

W->R->R->R->R->R->W

We generate 4 such sequences. Think of a 2x2 matrix: for each tile we generate the above sequence. So at startup we have 4 W tasks on the 4 tiles (1 W task per tile).
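For reference, the insertion pattern of the test looks roughly like the sketch below; insert_task() is a hypothetical stand-in for the DTD insertion call, and only the 4-tile, W -> 5xR -> W shape comes from this report.

```c
#include <stdio.h>

enum access { READ, WRITE };

/* Hypothetical stand-in for the DTD task-insertion call. */
static void insert_task(int tile, enum access mode)
{
    printf("tile %d: insert %s task\n", tile, mode == WRITE ? "W" : "R");
}

int main(void)
{
    /* 2x2 matrix: for each of the 4 tiles, W -> R -> R -> R -> R -> R -> W,
     * all tasks created with the same priority (0 in this report). */
    for (int tile = 0; tile < 4; tile++) {
        insert_task(tile, WRITE);       /* startup W task                      */
        for (int r = 0; r < 5; r++)
            insert_task(tile, READ);    /* 5 readers of the same tile          */
        insert_task(tile, WRITE);       /* last W: anti-dependent on the reads */
    }
    return 0;
}
```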

The LFQ scheduler has a local hbbuffer of size 4. We ran with a single thread. The hbbuffer had all of its cells filled with the startup W tasks, one for each tile.

After the first W task on the first tile executes, we discover the subsequent 5 R tasks and the last W task on that tile. We try to push all the discovered tasks (the 5 R tasks and the last W task) into the local hbbuffer. We have just one slot empty in the hbbuffer, and because of the order in which we discover tasks, we push the last task in there (which is the last W of the first tile). Point to be noted: all tasks start with the same priority of 0. The 5 R (Read) tasks on that tile end up in the system queue.

The scheduler then pops the first task in the buffer, which is the last W on the first tile. We go to the prepare_input of that task and return DAGUE_HOOK_RETURN_AGAIN. We then decrease the priority of the task (the last W on the first tile) and push it back. While pushing it back, we place it in the same spot in the hbbuffer as before.

These steps repeat for the other 3 tiles, and we end up with the last W task of each tile in the hbbuffer. All the intermediate R tasks of each tile end up in the system queue. The scheduler then keeps popping the same last W tasks from the hbbuffer, and we are locked.
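The loop can be seen in a few lines. The model below is a drastic single-thread simplification (plain arrays instead of PaRSEC's hbbuffer and system queue, no real tasks), meant only to illustrate why the R tasks sitting in the system queue are never selected.

```c
#include <stdio.h>

#define NTILES 4

int main(void)
{
    int readers_left[NTILES] = {5, 5, 5, 5};  /* 5 R tasks pending per tile */
    int hbbuffer[NTILES]     = {0, 1, 2, 3};  /* the 4 last-W tasks fill it */
    /* The 20 R tasks sit in the system queue, but the scheduler always
     * finds something in the local hbbuffer first, so it never gets there. */

    for (int iter = 0; iter < 2 * NTILES; iter++) {
        int tile = hbbuffer[iter % NTILES];   /* pop the next last-W task */
        if (readers_left[tile] > 0) {
            /* prepare_input returns DAGUE_HOOK_RETURN_AGAIN; the task is
             * pushed back into the same hbbuffer slot, lowered priority or
             * not, and will simply be popped again on the next pass. */
            printf("iter %d: last W on tile %d -> AGAIN (readers left: %d)\n",
                   iter, tile, readers_left[tile]);
        }
    }
    printf("... and so on forever: the R tasks are never selected.\n");
    return 0;
}
```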

abouteiller commented 8 years ago

Original comment by Thomas Herault (Bitbucket: herault, GitHub: therault).


I discussed this with Reazul; I will try to summarize my point of view here:

Concerning the bug:

Solutions:

1. Enrich the scheduler API
2. Solve the problem at the higher level

I prefer the first approach, even if it requires updating all schedulers, because it leaves the policy to the scheduler and does not enforce a general behavior.

I can do this first solution quickly, to show how it would look, unless people have a strongly different opinion.

abouteiller commented 8 years ago

Original comment by George Bosilca (Bitbucket: bosilca, GitHub: bosilca).


The solution we converged to with Reazul is similar to your first proposal: extend the scheduling API to allow extra hints on how far away the task should be pushed. I advocate for an additional parameter to the schedule and select functions that would allow a discussion between the runtime and the scheduler. On select, it would return the distance at which the scheduler found the returned task, hinting at how much effort was necessary to find something useful to do. On schedule, it would allow the runtime to hint how far back this task should be pushed. The runtime would then actively use the return value from select to drive its "return policy" and provide a more meaningful level to schedule.
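A sketch of what such an extension could look like is below. The dague_execution_unit_t / dague_execution_context_t names follow the naming convention of the codebase at the time, but the function types and the distance parameter are only an illustration of the proposal, not the final API.

```c
/* Illustrative sketch of the proposed extension, not the final
 * signatures: both entry points carry a "distance" so the runtime and
 * the scheduler can exchange hints. */
typedef struct dague_execution_unit_s    dague_execution_unit_t;
typedef struct dague_execution_context_s dague_execution_context_t;

/* select: return a ready task and report, through *distance, how far
 * the scheduler had to look to find it. */
typedef dague_execution_context_t *
    (*sched_select_fn_t)(dague_execution_unit_t *eu, int *distance);

/* schedule: the runtime passes a distance hint saying how far back the
 * task should be pushed (e.g. far away for a task that just returned
 * DAGUE_HOOK_RETURN_AGAIN, so it cannot shadow the ready R tasks). */
typedef int (*sched_schedule_fn_t)(dague_execution_unit_t *eu,
                                   dague_execution_context_t *task,
                                   int distance);
```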

abouteiller commented 8 years ago

Original comment by Thomas Herault (Bitbucket: herault, GitHub: therault).


Just as a follow-up on this, I'm waiting to see Reazul's P.R., but I agree with the idea :)

abouteiller commented 8 years ago

Original comment by George Bosilca (Bitbucket: bosilca, GitHub: bosilca).


Pull request #35 might be the solution.

abouteiller commented 8 years ago

Original comment by Thomas Herault (Bitbucket: herault, GitHub: therault).


We discussed this issue during the meeting today.

Another approach arose: why not use DAGUE_HOOK_RETURN_ASYNC, keep the task in a waiting list associated with the data, and schedule it when the nb_readers of the data becomes 0, instead of using DAGUE_HOOK_RETURN_AGAIN, which essentially forces the scheduler to try again and again while there is no indication that the readers are done.
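A rough sketch of that alternative is below. The data_t / task_t structures, the waiting_writer field, and the reschedule() helper are hypothetical; only DAGUE_HOOK_RETURN_ASYNC and the nb_readers idea come from the discussion, and the macro values are placeholders.

```c
#include <stddef.h>

/* Placeholder codes and structures, for illustration only. */
#define DAGUE_HOOK_RETURN_DONE   0
#define DAGUE_HOOK_RETURN_ASYNC  2

typedef struct task_s task_t;
typedef struct data_s {
    int     nb_readers;       /* R tasks still using this data         */
    task_t *waiting_writer;   /* writer parked until the readers drain */
} data_t;
struct task_s { data_t *data; };

/* Hypothetical helper: hand a parked task back to the scheduler. */
static void reschedule(task_t *task) { (void)task; }

/* prepare_input of the writer: instead of RETURN_AGAIN, park the task
 * on the data and tell the runtime to forget about it for now. */
static int write_prepare_input_async(task_t *task)
{
    if (task->data->nb_readers > 0) {
        task->data->waiting_writer = task;
        return DAGUE_HOOK_RETURN_ASYNC;
    }
    return DAGUE_HOOK_RETURN_DONE;
}

/* Called from the completion path of each R task: once the last reader
 * finishes, the parked writer can be handed back to the scheduler. */
static void reader_completed(data_t *data)
{
    if (--data->nb_readers == 0 && data->waiting_writer != NULL) {
        reschedule(data->waiting_writer);
        data->waiting_writer = NULL;
    }
}
```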

abouteiller commented 7 years ago

Original comment by George Bosilca (Bitbucket: bosilca, GitHub: bosilca).


Completed by pull request #35.