genericworkflownodes / GenericKnimeNodes

Base package for GenericKnimeNodes
https://github.com/genericworkflownodes/GenericKnimeNodes

Support for parallel execution #185

Open timosachsenberg opened 6 years ago

timosachsenberg commented 6 years ago

Check whether we can provide parallel zip nodes that are easy to configure and that could (globally) limit the total number of jobs run in a workflow.

jpfeuffer commented 6 years ago

I think this is not possible with a single node. It needs to know how many branches are between loop start and loop end, and how many threads each branch uses in its most thread-hungry node. So you have the following variables:

I have a metanode for this in my own workflows, but the user still needs to know about the structure inside the node and use the right variables in the "-thread" parameters in the nodes.

timosachsenberg commented 6 years ago

yes I think we can't control the number of threads but maybe the number of jobs (as TOPPAS did)
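
To make the "limit the number of jobs" idea concrete, here is a minimal Python sketch of a global job cap shared by all branches. It is not KNIME or TOPPAS code; `MAX_JOBS`, `run_limited`, and `demo` are hypothetical names chosen for illustration, and the cap only counts jobs, not the threads each job spawns internally.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

MAX_JOBS = 2  # hypothetical global cap on concurrently running jobs
job_slots = threading.BoundedSemaphore(MAX_JOBS)


def run_limited(job, *args):
    """Run job(*args) while holding one of the global job slots."""
    with job_slots:
        return job(*args)


def demo(n_jobs=6, max_workers=6):
    """Submit n_jobs from more workers than slots; report the peak concurrency."""
    lock = threading.Lock()
    state = {"running": 0, "peak": 0}

    def job(i):
        # Count how many jobs are inside the limited region at once.
        with lock:
            state["running"] += 1
            state["peak"] = max(state["peak"], state["running"])
        time.sleep(0.01)  # stand-in for an external tool call
        with lock:
            state["running"] -= 1
        return i * i

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(lambda i: run_limited(job, i), range(n_jobs)))
    return results, state["peak"]
```

Even with six worker threads, at most `MAX_JOBS` jobs run at the same time, because every job has to acquire a slot first.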

timosachsenberg commented 6 years ago

It's cool that you have this meta node and certainly a good start. To make it really useful I think there should definitely be a way to get the number of branches and iterations etc. (except the number of threads which KNIME can't know) to schedule the processing.

jpfeuffer commented 6 years ago

Iterations is certainly possible. Branches might be hard/impossible to get, since that is workflow-specific. But users should easily be able to type that into a variable.

timosachsenberg commented 6 years ago

How does KNIME do it for their loops? Hard to believe they manage to parallelize if they can't query this information from the KNIME API.

jpfeuffer commented 6 years ago

The Parallel For Loop just creates virtual duplicates (=branches) for whatever is between Loop Start and End. You can specify how many are created of course. But you have to do the calculations by yourself.
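
As a rough illustration of the calculation the user has to do by hand, the following Python sketch derives a branch count from the available cores and the thread count of the most demanding node in a branch. The function and parameter names are hypothetical and nothing here is a KNIME API; it assumes each branch is sized by its most thread-hungry node and that more branches than iterations would be wasted.

```python
def plan_parallel_branches(total_cores, threads_per_branch, iterations):
    """Return how many parallel branches to create.

    total_cores:        cores the workflow may use (assumed known to the user)
    threads_per_branch: threads used by the most demanding node in one branch
    iterations:         number of loop iterations (e.g. input files)
    """
    if min(total_cores, threads_per_branch, iterations) < 1:
        raise ValueError("all arguments must be >= 1")
    # Each branch reserves threads_per_branch cores; always allow one branch.
    branches = max(1, total_cores // threads_per_branch)
    # Never create more branches than there are iterations to process.
    return min(branches, iterations)
```

For example, with 16 cores and nodes configured for 4 threads each, this gives 4 branches; the same value of 4 would then be entered into the thread parameters of the nodes inside the loop.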

timosachsenberg commented 6 years ago

oh ok, thanks for clarifying.