Illumina / pyflow

A lightweight parallel task engine
http://Illumina.github.io/pyflow/

Running dependent tasks on the same node #13

Closed: amiraa127 closed this issue 8 years ago

amiraa127 commented 8 years ago

Hi

If we have two tasks A and B, where B depends on A, the scheduler starts B after A finishes. However, in doing this, the scheduler gives up the compute node that A was running on and requests a new node for B. I was wondering if there is a way for B to start on the same node that A ran on, without re-entering the queue.
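
For reference, the A -> B dependency above would be declared roughly like this in pyflow; a minimal sketch based on the pyflow demo API, where `run_A.sh` and `run_B.sh` are placeholder commands:

```python
from pyflow import WorkflowRunner

class ABWorkflow(WorkflowRunner):
    def workflow(self):
        # Each addTask() becomes its own job; in SGE mode each one is a
        # separate submission, so B re-enters the queue once A finishes.
        self.addTask("A", "run_A.sh")
        self.addTask("B", "run_B.sh", dependencies="A")

if __name__ == "__main__":
    ABWorkflow().run(mode="sge")
```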

One solution would be to bundle A and B into a single script C and add C to the task list. However, this way we lose independent knowledge of when task A completes. I was wondering if there is a way to do this without bundling.
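
The bundled "script C" approach could be as simple as chaining the two commands into one task; a sketch with the same placeholder commands, which keeps A and B on one scheduled job but means pyflow only records completion of C as a whole:

```python
from pyflow import WorkflowRunner

class BundledWorkflow(WorkflowRunner):
    def workflow(self):
        # A and B run back to back inside a single job, so no second queue
        # entry is needed, but pyflow only sees task "C" complete.
        self.addTask("C", "run_A.sh && run_B.sh")

if __name__ == "__main__":
    BundledWorkflow().run(mode="sge")
```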

ctsa commented 8 years ago

This seems pretty specific to SGE or a similar scheduler -- I imagine you're interested in saving on the SGE transaction cost per job, or on queuing time? Supporting this would get pretty tricky given pyflow's current design, with limited benefit compared to the "script C" solution. Maybe this case is motivated by another consideration?

amiraa127 commented 8 years ago

Hi Chris,

As you mentioned, this is motivated by saving on the SGE transaction cost per job. Let's say, for example, that tasks P and Q both depend on A. If I bundle A and P into C (or A, P, and Q into C), Q will start after P, which is not optimal. In this case, the "script C" solution cannot take advantage of the parallelism. I understand the difficulties involved in adding this feature; I just wanted to make sure there isn't a quick fix that I'm not aware of.
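
To make the parallelism point concrete, here is the fan-out case kept as separate tasks; a sketch with placeholder commands, where pyflow is free to dispatch P and Q concurrently once A finishes, whereas a single bundled script would run them one after the other:

```python
from pyflow import WorkflowRunner

class FanOutWorkflow(WorkflowRunner):
    def workflow(self):
        self.addTask("A", "run_A.sh")
        # P and Q each depend only on A, so they can be scheduled in parallel;
        # a bundled "run_A.sh && run_P.sh && run_Q.sh" task would serialize them.
        self.addTask("P", "run_P.sh", dependencies="A")
        self.addTask("Q", "run_Q.sh", dependencies="A")

if __name__ == "__main__":
    FanOutWorkflow().run(mode="sge")
```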

Thank you for the quick response

ctsa commented 8 years ago

OK. I'll take PRs on these sorts of issues, but there will probably be less attention given to SGE/DRMAA going forward. The many-core transition has already shifted most pyflow use to a single node, compared to a few years ago when it was originally developed.

hyjkim commented 8 years ago

A bit late here, but you could have task A write a status file (e.g., .task_A_complete) and have a watcher job that periodically checks for the file before launching task C.
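
As a rough illustration of that idea (the file name, polling interval, and helper function are made up, and it assumes a filesystem shared by both jobs):

```python
import os
import time

def wait_for_sentinel(path, poll_seconds=30):
    """Block until the completion marker written by task A appears."""
    while not os.path.exists(path):
        time.sleep(poll_seconds)

# Task A would finish with something like: touch .task_A_complete
wait_for_sentinel(".task_A_complete")
# ...the watcher can now launch the downstream task on this same node...
```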