Consider the following graph of computational grains:
Two grains hang off of USA_FeedCrops_Price_18, but one of them contains only a single node. In some cases, particularly when parallelizing a graph for MPP evaluation, the overhead of dispatching that grain may exceed the savings we could hope to get from the added parallelism. In that case we would want to fuse the grain with its peer so that it is dispatched along with the peer's nodes. The current algorithm already does this to a limited extent, but it clearly misses some cases. What we need is a hard size threshold: for any grain with fewer nodes than the threshold we say, "If this grain has any siblings, fuse it with the smallest one; otherwise, fuse it with its parent."
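The threshold rule can be sketched roughly as follows. This is a minimal illustration, not the current algorithm: the `Grain` class, the `fuse_small_grains` function, and the grain names in the example are all hypothetical, and the sketch assumes each grain has at most one parent (a tree of grains), which may not hold for the real graph.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Grain:
    """Hypothetical grain record: a named group of nodes in a tree of grains."""
    name: str
    nodes: List[str]
    parent: Optional[str] = None
    children: List[str] = field(default_factory=list)


def fuse_small_grains(grains: Dict[str, Grain], threshold: int) -> None:
    """Fuse every grain with fewer than `threshold` nodes, modifying `grains` in place."""
    changed = True
    while changed:
        changed = False
        # Visit the smallest grains first; names break ties deterministically.
        for name in sorted(grains, key=lambda n: (len(grains[n].nodes), n)):
            grain = grains[name]
            if len(grain.nodes) >= threshold or grain.parent is None:
                continue  # big enough, or a root with nothing to fuse into
            parent = grains[grain.parent]
            siblings = [grains[c] for c in parent.children if c != name]
            if siblings:
                # "If this grain has any siblings, fuse it with the smallest one"
                target = min(siblings, key=lambda g: (len(g.nodes), g.name))
            else:
                # "otherwise, fuse it with its parent"
                target = parent
            target.nodes.extend(grain.nodes)
            parent.children.remove(name)
            # Anything that hung off the fused grain now hangs off the target.
            for child in grain.children:
                grains[child].parent = target.name
                target.children.append(child)
            del grains[name]
            changed = True
            break  # grain sizes changed, so re-sort before continuing


# Example with made-up grains: a single-node grain and a three-node peer.
grains = {
    "parent": Grain("parent", ["p1", "p2"], children=["small", "peer"]),
    "small": Grain("small", ["s1"], parent="parent"),
    "peer": Grain("peer", ["q1", "q2", "q3"], parent="parent"),
}
fuse_small_grains(grains, threshold=2)
print(sorted(grains))        # the single-node grain is gone
print(grains["peer"].nodes)  # its node is now dispatched with the peer
```

Re-sorting after every fusion keeps the sketch simple at the cost of efficiency. Note also that a fused grain can push its target over the threshold, so the order in which small grains are visited affects the final grouping.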