eholk opened this issue 12 years ago
So in systems that use metaprogramming (like Accelerate or WaveScript), the object language usually lacks abstraction and is instead just a graph of kernels (processing streams or arrays). In that setting there is no way for adjacent operators to "hide" from a simple fusion optimization based on local pattern matching.
In Harlan, what allows an upstream kernel to hide from a downstream one? Being inside a lambda, right?
Is control flow analysis the best solution? Or just partial evaluation to get kernels next to each other?
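For example, something like this (a hypothetical fragment in Harlan-ish syntax; `step` and `xs` are invented names) would presumably defeat a purely local matcher, since the consumer's input position holds a call rather than a literal `kernel` form:

```scheme
;; Hypothetical example; `step` and `xs` are invented names.
(let ((step (lambda (v)
              (kernel ((x v)) (* x 2)))))  ; producer hidden behind a lambda
  (kernel ((y (step xs)))                  ; input position is a call, not a kernel
    (+ y 1)))
```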
I think the main thing that defeats optimization is having a `let`. We can inline obvious things like `(kernel ((i (kernel ...))) ...)`, but if it's `(let ((x (kernel ...))) (kernel ((i x)) ...))`, then it's harder. One problem is that some of our intermediate passes may insert `let`s that the programmer didn't write.
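Just to make the "obvious" case concrete, here's a minimal sketch (not our actual pass) of that local rewrite over plain s-expressions. The substitution is deliberately naive (capture-unaware), and the rewrite only fires when a literal `kernel` form sits in the binding position, which is exactly why the `let`-bound version escapes it:

```scheme
;; Kernels are represented as (kernel ((var input) ...) body).
(define (kernel? e)
  (and (pair? e) (eq? (car e) 'kernel)))

;; Naive, capture-unaware substitution of expression v for variable x.
(define (subst x v body)
  (cond ((eq? body x) v)
        ((pair? body) (cons (subst x v (car body))
                            (subst x v (cdr body))))
        (else body)))

;; (kernel ((i (kernel ((j xs)) f))) g)  =>  (kernel ((j xs)) g[i := f])
;; Purely local: no recursion into subexpressions, which is the point.
(define (fuse e)
  (if (and (kernel? e)
           (= 1 (length (cadr e)))           ; single binding, for simplicity
           (kernel? (cadr (car (cadr e)))))  ; bound to a literal kernel form
      (let* ((outer-var      (car (car (cadr e))))
             (inner          (cadr (car (cadr e))))
             (inner-bindings (cadr inner))
             (inner-body     (caddr inner))
             (outer-body     (caddr e)))
        (fuse `(kernel ,inner-bindings
                       ,(subst outer-var inner-body outer-body))))
      e))
```

So `(fuse '(kernel ((i (kernel ((j xs)) (* j 2)))) (+ i 1)))` yields `(kernel ((j xs)) (+ (* j 2) 1))`, but the `let`-bound variant passes through untouched.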
I think it also gets a little tricky if the results from one kernel are used in two separate kernels. Does it make more sense to run the computation twice, or to save the results and reuse those?
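Concretely, in Harlan-style pseudocode (all names here are invented):

```scheme
;; Hypothetical example: `shared` is consumed by two downstream kernels.
;; Inlining its kernel at both sites recomputes (f x) twice; keeping the
;; let materializes the intermediate vector in memory instead.
(let ((shared (kernel ((x xs)) (f x))))
  (list (kernel ((a shared)) (g a))
        (kernel ((b shared)) (h b))))
```

Which is cheaper presumably depends on the cost of `f` versus the memory traffic for the intermediate result.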
Partial evaluation is an interesting idea...
Hmm, but unlike lambdas, `let`s don't hide data flow. A typical constant-propagation pass would maintain an environment as it descends the AST, and the same should work here, right? In your example, a simple lookup in the environment for `x` shows which kernel it's bound to.
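A minimal sketch of that environment-carrying walk, in the same toy representation as above (`kernel?` repeated for self-containment; single-binding forms only; no shadowing handling; anything that isn't a `let` or a `kernel` is left alone for brevity):

```scheme
(define (kernel? e)
  (and (pair? e) (eq? (car e) 'kernel)))

(define (lookup x env)
  (cond ((assq x env) => cdr)
        (else #f)))

;; Walk the AST carrying an environment: record let-bound kernels, and
;; replace a variable in a kernel's input position by the kernel it
;; names, putting the two kernels adjacent so a local fuser can fire.
(define (resolve e env)
  (cond
    ;; (let ((x rhs)) body): remember x if rhs is a kernel
    ((and (pair? e) (eq? (car e) 'let))
     (let* ((x    (car (car (cadr e))))
            (rhs  (resolve (cadr (car (cadr e))) env))
            (env* (if (kernel? rhs) (cons (cons x rhs) env) env)))
       `(let ((,x ,rhs)) ,(resolve (caddr e) env*))))
    ;; (kernel ((i input)) body): swap a let-bound kernel into input position
    ((kernel? e)
     (let* ((i     (car (car (cadr e))))
            (input (resolve (cadr (car (cadr e))) env))
            (input (or (and (symbol? input) (lookup input env)) input)))
       `(kernel ((,i ,input)) ,(resolve (caddr e) env))))
    (else e)))
```

After `resolve`, the `let`-bound case turns into the directly nested shape that a local `fuse` like the one above already handles; the now-dead `let` would be cleaned up by a later pass.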
I think that's the basic idea of what we wanted to do. There's just the question of whether we always want to inline when a variable is bound to a kernel. If the results of a kernel are reused many times, we probably don't (unless the kernel is super tiny), but if the kernel's results are used only once, inlining is almost certainly a win.
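A cheap way to approximate that decision would be a syntactic use count plus a size threshold (a sketch; `K` is a made-up tuning knob, not anything we have today):

```scheme
;; Count syntactic occurrences of variable x in expression e.
(define (count-uses x e)
  (cond ((eq? e x) 1)
        ((pair? e) (+ (count-uses x (car e))
                      (count-uses x (cdr e))))
        (else 0)))

;; Crude "super tiny" test: total number of atoms in the kernel body.
(define (size e)
  (if (pair? e)
      (+ (size (car e)) (size (cdr e)))
      1))

;; Hypothetical policy: inline a let-bound kernel when its variable is
;; used exactly once, or when the kernel is small enough to duplicate.
(define (inline? x body kernel-body K)
  (or (= 1 (count-uses x body))
      (< (size kernel-body) K)))
```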
It's very easy to defeat kernel fusion and other optimizations now. We're basically doing very local pattern matching. To do this right, we'll probably need some more global control or data flow analysis to decide when we can do the optimizations.