Closed mjmonroe closed 9 years ago
Many thanks for reporting!
I'll try to reproduce it on my side.
Do you have a simple example workflow that shows this behavior, or does it also happen with the workflows included in the examples folder?
Also, did you use the 0.9.0 release, or the tip of the master branch?
The example1.py that I was referring to is the one in sciluigi's example directory. I also tried the example3_workflow.py, and it did the same thing. If I set parallel_scheduling=False, then it works just fine.
Thanks, will check!
I have my suspicions, but until I know more, let me not speculate too much :)
What I have found so far with some pdb debugging is that it stops and waits the second time it reaches this line in luigi's worker.py.
I hope to find a way to figure out why.
@mjmonroe In the meantime, while I try to figure this out: I assume you know that parallel scheduling does not speed up the actual execution at all, only the scheduling phase? (As also discussed here).
Thus, it would mostly be relevant if you have an extreme number of tasks.
In fact, we have never felt the need to speed up the scheduling phase ourselves, even though we sometimes have hundreds of tasks.
But you might have different needs? :)
You're right, I do not think I will need to use parallel scheduling, only multiple workers. The thread really helps with the definitions of the two.
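For anyone landing here later, the workaround discussed in this thread can be sketched roughly as follows. This is a minimal illustration, not the reporter's actual luigi.cfg (which is not shown here): it assumes the option lives in the `[core]` section, and the `LUIGI_CFG` string and variable names are made up for the example. It only parses the text with Python's standard `configparser` to show what the setting looks like.

```python
import configparser

# Hypothetical luigi.cfg contents for illustration only;
# this is NOT the reporter's actual file.
LUIGI_CFG = """
[core]
# Parallelizes only the scheduling phase, not task execution.
# Leaving it off avoids the hang described in this thread.
parallel_scheduling = false
"""

# Parse the text the same way an INI-style config file would be read.
parser = configparser.ConfigParser()
parser.read_string(LUIGI_CFG)

# getboolean() understands true/false, yes/no, on/off, 1/0.
parallel_scheduling = parser["core"].getboolean("parallel_scheduling")
print("parallel_scheduling:", parallel_scheduling)
```

Running tasks concurrently is a separate setting: the number of workers is usually given on the command line with `--workers`, which should keep working with parallel scheduling turned off.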
Closing this as a "won't fix" for now, unless someone finds an easy way to solve it. This is not a priority for us, and some aspects of our approach seem to make it hard to fix.
When I use a luigi.cfg file with parallel_scheduling turned on and run example1.py, the log file shows:
Nothing else happens. It appears that there is a deadlock condition somewhere that prevents MyWorkflow from being scheduled. I have not been able to figure out what the issue is.
My luigi.cfg file: