rplzzz / tgraph

Tools for finding parallelization opportunities in data flow graphs.
GNU Lesser General Public License v2.1

Allow user to set a threshold value for fusing leftover grains with peers #2

Open rplzzz opened 7 years ago

rplzzz commented 7 years ago

Consider the following graph of computational grains:

[image: graph of computational grains]

Two grains hang off USA_FeedCrops_Price_18, but one of them contains only a single node. In some cases, particularly when parallelizing a graph for MPP evaluation, the overhead of dispatching that grain could exceed the savings we could hope to get from the added parallelism. In that case we would want to fuse the grain with its peer so that it is dispatched along with the peer's nodes. The current algorithm already does this to a limited extent, but it clearly misses a few cases. What we need is a hard, user-settable threshold: for any grain smaller than the threshold we say, "If this grain has any siblings, fuse it with the smallest one; otherwise, fuse it with its parent."
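
To make the proposed rule concrete, here is a rough Python sketch. It is not tgraph code. The function name `fuse_small_grains`, the dict-based representation, and the grain and node names in the example are all hypothetical. It assumes each grain has at most one parent, that "size" means the number of nodes in the grain, that root grains are left alone, and that children of a fused grain are re-attached to whatever grain absorbed it.

```python
def fuse_small_grains(grains, parent, threshold):
    """Fuse every grain smaller than `threshold` into a sibling or its parent.

    grains: dict mapping grain name -> set of node names (hypothetical layout)
    parent: dict mapping grain name -> parent grain name, or None for a root
    Returns new (grains, parent) dicts; the inputs are not modified.
    """
    grains = {g: set(nodes) for g, nodes in grains.items()}
    parent = dict(parent)
    while True:
        # Undersized grains that have a parent; roots are skipped (assumption).
        small = sorted(
            (g for g in grains
             if len(grains[g]) < threshold and parent[g] is not None),
            key=lambda g: (len(grains[g]), g),
        )
        if not small:
            return grains, parent
        g = small[0]
        siblings = [s for s in grains if s != g and parent[s] == parent[g]]
        if siblings:
            # "If this grain has any siblings, fuse it with the smallest one"
            target = min(siblings, key=lambda s: (len(grains[s]), s))
        else:
            # "otherwise, fuse it with its parent"
            target = parent[g]
        grains[target] |= grains.pop(g)
        # Children of the fused grain now hang off the absorbing grain.
        for child in list(parent):
            if parent[child] == g:
                parent[child] = target
        del parent[g]


# Hypothetical example: grain "P" has two children, one with a single node.
example_grains = {
    "P": {"p1", "p2"},
    "A": {"a1"},
    "B": {"b1", "b2", "b3"},
}
example_parent = {"P": None, "A": "P", "B": "P"}
fused, fused_parent = fuse_small_grains(example_grains, example_parent, threshold=2)
```

With a threshold of 2, the single-node grain `A` is merged into its only sibling `B`. If `B` were absent, `A` would be merged into `P` instead. Each pass removes one grain, so the loop always terminates.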