Closed mattfel1 closed 5 years ago
Are cond
and cond2
loop invariant w.r.t p
in this example?
yes, forgot to show that. This is a case that old spatial got right somehow
Old spatial just did basic data dependency checking to check total loop invariance. Is the analyzed access pattern correct (should be random but with last invariant iterator of n
)? Sounds like the invariant iterator part is off if it's assigning a new symbol for each lane.
Yea that's right. The access pattern is not correct, it thinks the last invariant iter is p
.
For lastIters of the access pattern, it gets components.flatMap{prod => prod.syms.mapping{x => lastVariantIter(is,x) }}
. The components
are x## + x## (the two mux syms). is
is the accessIterators(mem, access)
, which is every iterator scoped between the declaration of the memory to the access. These x##'s all look like they vary with all the iterators, since scopes(iter)
is all of the nestedStms in a block regardless of whether that stm depends on that iterator or not.
I think I'm going to just edit this
val scope = block.nestedStms.toSet
...
scopes ++= is.map{_ -> scope}
instead of giving all of scope to every is
, I need to filter the scope against what stms actually depend/vary on that iter
I think I fixed it:
scopes ++= is.map{i => (i -> modeling.consumersDfs(i.consumers,Set(), scope)) }
and consumersDfs is
def consumersDfs(frontier: Set[Sym[_]], nodes: Set[Sym[_]], scope: Set[Sym[_]]): Set[Sym[_]] = frontier.flatMap{x: Sym[_] =>
if (scope.contains(x) && !nodes.contains(x)) {
consumersDfs(x.consumers, nodes + x, scope)
}
else nodes
}
Used in a few other places for retiming.
The test case it works for is
Foreach(10 by 1){i =>
Foreach(3 by 1, 3 by 1, 64 by 1 par 16){ (j,k,l) =>
val cond1 = i % 2 == 0
val cond2 = i % 3 == 0
y(j,k,l) = x(mux(cond1, j, j*2) + mux(cond2, k, k*2)) + z(j,k)
}
}
Definitely an issue, not 100% sure I have the correct explanation yet:
It should broadcast x to all 16 lanes, but when its doing the affine analysis, it assigns mux + mux as a new random iterator, and a different one for each lane. Then it thinks that it needs to duplicate the sram since it can't detect the invariance.
I think the fix is to just have a map from new iter to the real iters that compose it, and then .getOrElse(iter, iter) on this map when checking for invariance