[IR] Automatically insert .cache calls

ggevay commented 8 years ago

We would like to insert .cache calls in three cases:

a val or var is referenced more than once
a var is being assigned
a val or var is being referenced inside a loop that doesn't contain the definition of this val or var

I'm thinking about whether to do this on DSCF or on control-flow primitives.

Trivial in both cases.
Trivial with control-flow primitives, also relatively easy with DSCF (we have to detect when it's being referenced in the argument list of a call to a def that is the body of a loop)
Seems easy with control-flow primitives: we create an inherited attribute that tells the innermost enclosing loop, and then an accumulated attribute builds a map that shows for each ValDef which is the innermost loop that contains it. And then when we see a ValRef, we check whether the current innermost enclosing loop is the same as the innermost enclosing loop of the definition (which we get from the map). However, on DSCF it gets messy: we don't see the loops, but we see some DefDefs that were created from loops. But these DefDefs cannot be directly used the same way as the loops in the control-flow primitives representation, because the suffixes of the loops are also inside these DefDefs, and we don't want to insert a .cache call when the DataBag is used only in the suffix.

ggevay commented 8 years ago

The easiest way that I can think of to do 3. on DSCF is to instead of maintaining just the innermost enclosing loop, we maintain the number of enclosing DefDefs that represent loops minus the number of enclosing suffix$ DefDefs. And then if this number is not the same for a ValRef as for its corresponding ValDef, then insert .cache. (I hope this works, but I'm not 100 sure.)

aalexandrov commented 8 years ago

I'll try to reformulate 3 with the help of some imaginary API for the sake of further discussion.

[...] we create an inherited attribute that tells the innermost enclosing loop, [...]

// returns the innermost enclosing loop (if `t` is inside a loop)
def enclLoop(t: u.Tree): Option[u.TermSymbol]

[...] and then an accumulated attribute builds a map that shows for each ValDef which is the innermost loop that contains it [...].

// for a symbol represending a Scala `u.ValDef` or `u.VarDef`,
// returns the innermost enclosing loop (if `s` is defined inside a loop)
def inLoop(s: u.TermSymbol): Option[u.TermSymbol]

And then when we see a ValRef, we check whether the current innermost enclosing loop is the same as the innermost enclosing loop of the definition (which we get from the map).

I think if we should match BindingRef here.

case t@BindingRef(s) 
  if inLoop(s) != enclLoop(s) && ... => 
  // ???

I'm not sure what to do in ??? other than store the information that s needs to be cached.

val x = /* DataBag term for symbol `s` */
// ...
f(x)

should be replaced with

val x = /* some DataBag term */
val y = x.cache()
// ...
f(y)

joroKr21 commented 7 years ago

Here is another suggestion. Suppose we work on DSCF. Keep track of the current owner. Then synthesize an attribute in the following way:

Match a ValRef with type of DataBag (we can think about parameters later, it's not much different);
Check if the current owner is different from the ValRef owner and is a loop method:
1. If it is, set a flag (e.g. accessedInLoop);
2. If it is not, increment a counter (e.g. refCount);
  - Thus the attribute has to be of type Map[TermSym, (Boolean, Int)].
In addition match a DefCall of a loop and set flag for all arguments (but not count - already done);
When transforming match on entire Let block to make sure you collected the counts;
Replace ValDefs that are accessedInLoop or have refCount > n with cached vals (reuse symbol here).

Pretty much done? The difference with parameters is following:

Assume parameters to loops are already cached at call sites.
For function and other parameters, insert cached val instead of replacing.
The last part means collect symbols to substitute (you can do it by accumulating top-down).

aalexandrov commented 7 years ago

@joroKr21 The algorithm makes sense but a few things have to be revised in order to cover some corner cases.

R1. The check in (2) will not match the following

val xs = ???
for (i <- 1 to N) {
  if (i % 2 == 0) f(xs)
  else g(xs)
}

R2. I don't see how the sketched transformation will identy xs to be cached in the original motivating example:

val xs = ???
if (???) f(xs)
else g(xs)

This will have refCount = 2 and will be cached. To catch such cases, conceptually we have to

track the number of uses of a TermSym per block;
compute sets of independent blocks;
compute different possible refCount sums picking one element of each independent block set;
insert cache if a refCount > 1 exists.

joroKr21 commented 7 years ago

Yes, indeed. The problem in R1 can be solved with this predicate (assuming we track the complete owner chain): owners.takeWhile(_ != ref.owner).exists(isLoop)

R2 poses an interesting question. If we generalize the refCount as a kind of score, how much is worth a ref within a branch? 0.5? Imagine that a ref can be nested in several layers of branching.

aalexandrov commented 7 years ago

Closed via #305.

emmalanguage / emma

[IR] Automatically insert .cache calls #281