jhclark / ducttape

A workflow management system for researchers who heart Unix.
http://jhclark.github.com/ducttape

Graft globs can cause hanging #148

Open jhclark opened 11 years ago

jhclark commented 11 years ago
task few > x :: P=(BP1: 1..3) {}
task completes < in=$x@few[BP1:*] {}

task lots > x :: P=(BP1: 1..10) {}
#task works1 :: z=$P@lots[BP1:*] {}
#task works2 < in=$x@lots[BP1:1] {}
task intractable < in=$x@lots[BP1:*] {}

Graft globs such as the one in "intractable" can cause ducttape to hang. This appears to be an efficiency issue: the graft glob in "completes", which has only 3 branches, finishes quickly, while the one in "intractable", which has 10 branches, does not finish in any reasonable amount of time, perhaps due to an unexpected exponential blow-up in the code. This operation should be O(n).
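
To make the O(n) expectation concrete, here is a minimal Python sketch. It is not ducttape's code (ducttape is written in Scala), and it assumes that a glob such as `[BP1:*]` should simply yield one input per branch of BP1. The name `expand_glob` and the tuple representation are hypothetical.

```python
def expand_glob(branch_point, branches):
    """Return one (branch_point, branch) pair per branch.

    Hypothetical model of a graft glob: the work done is linear
    in the number of branches, with no cross product involved.
    """
    return [(branch_point, b) for b in branches]


# "few" has branches 1..3 and "lots" has branches 1..10, as in the example above.
few_inputs = expand_glob("BP1", range(1, 4))
lots_inputs = expand_glob("BP1", range(1, 11))

print(len(few_inputs), len(lots_inputs))
```

Under this model, going from 3 to 10 branches should cost only about three times as much work, which is why a hang on 10 branches points to something other than the glob's inherent size.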

nschneid commented 11 years ago

My guess is that the problem is in VariableHandler.scala—there seems to be a 3-layer nesting of map operations, though I'm not too familiar with Scala or this codebase. @dowobeha, any insights?

dowobeha commented 11 years ago

Not sure. What lines are you looking at?

nschneid commented 11 years ago

Lines 104–125.

dowobeha commented 11 years ago

That is where globs get handled, but I don't see anything obviously wrong. Most of VariableHandler was @jhclark's code that I just refactored into VariableHandler.

jhclark commented 11 years ago

The first step in tackling this problem is to determine whether the slowdown occurs in 1) building the hypergraph (i.e. VariableHandler or WorkflowBuilder) or 2) walking the hypergraph (i.e. something called by UnpackedDagWalker). If anyone has free cycles right now, this can probably be done just by enabling more logging in logging.properties. In the worst case, we might need 1-2 more logging statements in ducttape.scala.
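
As a rough illustration of that diagnosis, the Python sketch below times two phases separately and logs each duration. It is only a sketch of the idea, not ducttape code: `build_hypergraph` and `walk_hypergraph` are hypothetical stand-ins for the real build and walk steps, and the `timed` helper is an assumption of this example.

```python
import logging
import time
from contextlib import contextmanager

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("phase_timing")

timings = {}  # phase name -> elapsed seconds


@contextmanager
def timed(phase):
    """Log and record how long the wrapped block takes."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed = time.perf_counter() - start
        timings[phase] = elapsed
        log.info("%s took %.3f s", phase, elapsed)


def build_hypergraph(n):
    # Hypothetical stand-in for the build phase.
    return list(range(n))


def walk_hypergraph(graph):
    # Hypothetical stand-in for the walk phase; counts visited vertices.
    return sum(1 for _ in graph)


with timed("build"):
    graph = build_hypergraph(10)
with timed("walk"):
    visited = walk_hypergraph(graph)
```

Whichever phase dominates the log (or never reports a duration at all) is the one worth profiling further.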

After this diagnosis, things become more complicated, but it would at least save a bit of time and motivate me to look into this sooner. :)

Anyone interested?
