pharmbio / sciluigi

A light-weight wrapper library around Spotify's Luigi workflow library to make writing scientific workflows more fluent, flexible and modular
http://dx.doi.org/10.1186/s13321-016-0179-6
MIT License

support for parallel_scheduling? #8

Closed (mjmonroe closed 9 years ago)

mjmonroe commented 9 years ago

When I use a luigi.cfg file with parallel_scheduling turned on and run example1.py, the log file shows:

2015-09-08 13:48:00 |     INFO | --------------------------------------------------------------------------------
2015-09-08 13:48:00 |     INFO | SciLuigi: MyWorkflow Workflow Started
2015-09-08 13:48:00 |     INFO | --------------------------------------------------------------------------------
2015-09-08 13:48:00 |     INFO | Scheduled MyWorkflow(instance_name=sciluigi_workflow) (PENDING)

Nothing else happens. It appears that there is a deadlock condition somewhere that prevents MyWorkflow from being scheduled. I have not been able to figure out what the issue is.

My luigi.cfg file:

[core]
parallel_scheduling=True
workers=2
local_scheduler=True

samuell commented 9 years ago

Many thanks for reporting!

I'll try and see if I get the same result.

Do you have a simple example workflow that shows this behavior, or does it also happen with the examples included in the examples folder?

Also, did you use the 0.9.0 release, or the tip of the master branch?

mjmonroe commented 9 years ago

The example1.py that I was referring to is the one in sciluigi's example directory. I also tried the example3_workflow.py, and it did the same thing. If I set parallel_scheduling=False, then it works just fine.
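
A minimal sketch of the working setup mentioned above (the same luigi.cfg with parallel_scheduling switched off), written as Python so it can be checked. It only uses the standard library's configparser to confirm the file parses as intended; it does not run Luigi or SciLuigi, and the helper name read_core_settings is hypothetical.

```python
import configparser

# Same [core] options as in the report, with parallel_scheduling set to False,
# which is the setting reported to work in this thread.
WORKING_CFG = """\
[core]
parallel_scheduling=False
workers=2
local_scheduler=True
"""


def read_core_settings(cfg_text):
    """Parse luigi.cfg-style text and return the [core] options discussed here.

    Hypothetical helper for illustration only; Luigi reads luigi.cfg itself.
    """
    parser = configparser.ConfigParser()
    parser.read_string(cfg_text)
    core = parser["core"]
    return {
        "parallel_scheduling": core.getboolean("parallel_scheduling"),
        "workers": core.getint("workers"),
        "local_scheduler": core.getboolean("local_scheduler"),
    }


settings = read_core_settings(WORKING_CFG)
print(settings)
```

Saving the text in WORKING_CFG as luigi.cfg should give the configuration that ran without hanging, still with two workers.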

samuell commented 9 years ago

Thanks, will check!

samuell commented 9 years ago

I have my suspicions, but until I know more, I'd rather not speculate too much :)

What I have found so far, with some pdb debugging, is that execution stops and waits the second time it reaches this line in luigi's worker.py.

Hope to find a way to figure out why.

samuell commented 9 years ago

@mjmonroe In the meantime, while I try to figure this out: I assume you know that parallel scheduling does not speed up the actual execution at all, only the scheduling phase? (As also discussed here.)

Thus, this would mostly be relevant if you have an extreme number of tasks.

In fact, we have never felt the need to speed up the scheduling phase ourselves, even though we sometimes have hundreds of tasks.

But you might have different needs? :)

mjmonroe commented 9 years ago

You're right, I do not think I will need to use parallel scheduling, only multiple workers. The thread really helps clarify the difference between the two.

samuell commented 9 years ago

Closing this as a "won't fix" for now, unless someone finds an easy way to solve it. For us this is not a priority, and some aspects of our approach seem to make it hard.