lemmih / lhc

The LLVM LHC Haskell Optimization System
http://lhc-compiler.blogspot.com/
The Unlicense

Use equality saturation for optimization #9

Open taktoa opened 6 years ago

taktoa commented 6 years ago

I'm still working on it, but my senior thesis is on a language-generic, performance-oriented Haskell implementation of equality saturation. I suspect not a huge amount of work has gone into writing optimizations for this compiler yet, so it may be worthwhile to look into equality saturation, which is a radically different way of structuring an optimizer. Right now I'm busy with finals, but I can expand more on this in about 10 days if you're interested.

lemmih commented 6 years ago

Very cool! I'm definitely interested. I'll poke you in 10 days. Good luck with finals.

csabahruska commented 6 years ago

Looks interesting. Is this useful in a functional setting? The axiom part looks ad hoc to me in contrast to constructive mathematics/dependent type systems.

taktoa commented 6 years ago

It's basically optimization via saturation-based automated theorem proving in the first-order syntactic theory of equality. It's absolutely relevant in the functional setting; in fact, the only reason it works for imperative programs at all is that the authors invented a way to compile an imperative control flow graph into a referentially transparent AST using token passing with linear types. The only requirements to make it work are:

  1. A referentially transparent (i.e. the semantics of a node depend only on the contents of the node and the semantics of its children) graph representation of your code. The graph can have cycles.
  2. A term-rewriting system representing the set of allowable "basic" optimizations (which the engine will then combine to make more complicated optimizations; if this TRS were just the operational semantics of the language then you'd basically have a partial evaluator).
  3. A heuristic for the performance of a given piece of code. Since the code is referentially transparent, this will look like a function of type (node, [R]) -> R (where R is an ordered semiring like Rational or Int or SInt from sbv if you want to handle cycles) which gets turned into a catamorphism over the program expression graph.
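To make requirement 3 concrete, here's a minimal sketch (not from the eqsat library; all names are made up) of a cost heuristic expressed as a catamorphism:

```haskell
-- A sketch of requirement 3: a per-node cost function of type
-- (node, [R]) -> R, folded over an acyclic expression tree as a
-- catamorphism.  Cycles in a real PEG would need a fixpoint over
-- something like sbv's SInt instead of this plain fold.
module CostSketch where

-- A toy expression node; children are stored inline, so there is no
-- sharing and no cycles, and an ordinary recursive fold suffices.
data Node
  = Lit Integer
  | Add Node Node
  | Mul Node Node
  deriving Show

-- The "algebra": cost of a node given the already-computed costs of
-- its children (here R = Rational).
costAlg :: Node -> [Rational] -> Rational
costAlg (Lit _)   _  = 1
costAlg (Add _ _) cs = 1 + sum cs   -- additions are cheap
costAlg (Mul _ _) cs = 4 + sum cs   -- multiplications cost more

-- The catamorphism: fold costAlg over the whole term.
cost :: Node -> Rational
cost n@(Lit _)   = costAlg n []
cost n@(Add a b) = costAlg n [cost a, cost b]
cost n@(Mul a b) = costAlg n [cost a, cost b]
```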

Equality saturation (at least the way I'm implementing it) is an anytime algorithm; it will return better and better versions of the code as time goes on but can return a correct answer at any point during execution.

I'm quite sure the authors are aware of constructive mathematics; see their work on generating compiler optimizations from proofs, which is described as an instance of a more general category-theoretic algorithm/theorem.

taktoa commented 6 years ago

@csabahruska also related: since congruence closure is a form of saturation-based automated theorem proving, you may be able to use the techniques described in Congruence Closure in Intensional Type Theory when implementing equality saturation for optimizing a core IR that is dependently typed. I think for it to be efficient you might have to figure out some way of efficiently "algebrizing" dependent type theory, though (SK calculus is the classic example, but encoding lambda terms as SK terms causes the size to blow up, so you'd want something like Conal Elliott's lambda-ccc work for any situation where you're optimizing lambda calculus with equality saturation).
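To illustrate the size blow-up, here's a small sketch of the classic naive bracket-abstraction translation into SKI combinators (illustrative datatypes, not taken from any existing library):

```haskell
-- Naive bracket abstraction (lambda -> SKI), just to show why this
-- kind of "algebrization" blows up in size.
module BracketSketch where

data Lam = LVar String | LApp Lam Lam | LAbs String Lam
  deriving Show

data SK = S | K | I | SVar String | SApp SK SK
  deriving Show

-- Translate a lambda term into SKI combinators.
toSK :: Lam -> SK
toSK (LVar x)   = SVar x
toSK (LApp a b) = SApp (toSK a) (toSK b)
toSK (LAbs x b) = abstract x (toSK b)

-- Bracket abstraction [x] t: eliminate the binder x.  Every application
-- node under the binder becomes three nodes, so nested lambdas blow up
-- the term size; smarter translations are needed in practice.
abstract :: String -> SK -> SK
abstract x (SVar y)
  | x == y    = I
  | otherwise = SApp K (SVar y)
abstract x (SApp a b) = SApp (SApp S (abstract x a)) (abstract x b)
abstract _ t          = SApp K t   -- S, K, I: constants don't mention x
```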

One thing that may not be clear from the things I've been saying is that equality saturation is effectively a form of superoptimization, except instead of enumerating programs as sequences of instructions and testing if they are equivalent to the original program, you are enumerating all programs known to be equivalent to the original program under the reflexive-transitive closure of a (hopefully confluent!) term rewriting system. These two approaches actually work well together; you could run a stochastic superoptimizer like STOKE on some fraction of the code you generate, decompile the optimized instructions into your IR, and add a new equality axiom mapping from the input IR to the decompiled IR. The pattern matching algorithm (Rete / term indexing via discrimination trees) used by equality saturation scales quite well with the number of rules; if the pattern of a rewrite rule overlaps with the pattern of another rewrite rule, the computation will be shared between them.
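For concreteness, a rewrite rule or equality axiom in this setting can be as simple as a pair of patterns over the IR; a minimal sketch (names are illustrative, not the eqsat API):

```haskell
-- A "basic" rewrite rule / equality axiom as a pair of patterns with
-- metavariables.  A rule recovered from a superoptimizer run would
-- simply be another value of this type.
module RuleSketch where

data Pat
  = PVar  String        -- metavariable: matches any subterm
  | PNode String [Pat]  -- an IR node with its children
  deriving (Eq, Show)

-- An equality axiom: the left-hand side pattern equals the right-hand side.
data Rule = Rule { lhs :: Pat, rhs :: Pat }
  deriving Show

-- e.g. commutativity of addition:  add x y  =  add y x
addComm :: Rule
addComm = Rule (PNode "add" [PVar "x", PVar "y"])
               (PNode "add" [PVar "y", PVar "x"])

-- e.g. a strength-reduction fact a superoptimizer might discover:
--   mul x 2  =  shl x 1
mulToShift :: Rule
mulToShift = Rule (PNode "mul" [PVar "x", PNode "2" []])
                  (PNode "shl" [PVar "x", PNode "1" []])
```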

Another interesting point is that equality saturation is a pretty good fit for a JITed language, since you could run the engine in parallel with a hot piece of code, occasionally swapping it out whenever the improvement predicted by your performance heuristic is large enough to make it worthwhile.

csabahruska commented 6 years ago

Thanks for the references. Looks interesting. I see some similarity between this approach and interaction networks. For superoptimization, I believe that abstract interpretation can drive the optimizer efficiently, revealing even the coincidental transformation opportunities that could not be generalized.

EDIT: I mean using abstract interpretation iteratively to give hints to the optimizer engine about where to transform. Of course this is still an eager algorithm, but with a global view. It would be interesting to see a comparison of the two approaches.

taktoa commented 6 years ago

@csabahruska I'm not an expert on interaction networks, but from what I have read I suspect they are a good example of a referentially transparent IR that could be used in equality saturation. One correction to what I said earlier: since you can have loops in the program expression graph, "algebrizing" the IR you want to optimize is a sufficient but not a necessary condition for using equality saturation. It is, however, true that you can't quite have anything like variable binding, so you're limited to things like string diagrams and algebraic expressions.

I'm familiar with abstract interpretation, but I must confess that I'm not sure how you can get proofs of program equality from it; AFAIK the whole point of abstract interpretation over symbolic execution is that widening makes the analysis total at the cost of a loss of precision. Since a proof of program equality would have to throw away any results of abstract interpretation in which widening was used, it seems to me that symbolic execution is more relevant. If you had meant symbolic execution, then I'd like to point out that STOKE works by symbolically executing a sequence of instructions to produce a first-order formula, and then enumerating instruction sequences until it finds the shortest sequence whose first-order formula (under symbolic execution) is equivalent to the original instruction sequence's formula (and this is checked with Z3).
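For flavour, here's a tiny sketch of that kind of equivalence query using the sbv package mentioned earlier (the query itself is made up; sbv dispatches to an SMT solver such as Z3):

```haskell
-- Ask an SMT solver whether two "instruction sequences" compute the
-- same 32-bit result for all inputs.  This only shows the flavour of
-- the equivalence check, not STOKE itself.
module EquivSketch where

import Data.SBV

-- "x * 2" versus "x + x": are the symbolic formulas equal for all x?
equiv :: SInt32 -> SBool
equiv x = x * 2 .== x + x

main :: IO ()
main = print =<< prove equiv   -- prints Q.E.D. if equivalence is proved
```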

EDIT: also worth noting is that there is a rich literature on compiling functional languages to "combinators", beyond the more recent stuff like interaction networks and Conal's work.

csabahruska commented 6 years ago

I've started to read Ross Tate's PhD thesis. It seems that the E-PEG intermediate representation is the key to equality-saturation-based optimization, and that this approach requires E-PEG to be used as the compiler's IR. But LHC's IR is based on GRIN. Wouldn't this change require a complete rewrite?

taktoa commented 6 years ago

No, you can just convert GRIN to a PEG, then convert back. Equality saturation can be a compiler pass (though ideally it should be the only compiler pass, otherwise it's kind of defeating the purpose).

csabahruska commented 6 years ago

I thought that E-PEG + the heuristic replaces the whole optimization framework. If PEG is a compiler pass which does all the optimizations, what would the other passes do?

lemmih commented 6 years ago

I'm actually more interested in using equality saturation with Core rather than GRIN. Although it is applicable to both languages, Core should be way easier to convert to a PEG and the potential benefits seem greater.

taktoa commented 6 years ago

@csabahruska

If PEG is a compiler pass which does all the optimizations what would the other passes do?

(though ideally it should be the only compiler pass, otherwise it's kind of defeating the purpose)

@Lemmih

Core definitely seems more applicable to eqsat; agreed.

lemmih commented 6 years ago

How far along is your Haskell implementation? Are there examples I can play around with?

taktoa commented 6 years ago

It still needs quite a bit of work, but the hard part (for me at least) is over; I now completely understand the algorithm and what needs to be implemented.

taktoa commented 6 years ago

If I were you I'd just work on the Core -> PEG and PEG -> Core conversions. Take a look at the EqSat.Term module; I haven't pinned down the PEG type yet but there will definitely be interconversions with GTerm.

lemmih commented 6 years ago

Hi @taktoa. Hope finals went well. Can you tell me more about your eqsat library? What are your plans and time horizon for the project?

taktoa commented 6 years ago

I'd say that I plan to hack on eqsat at least over the next 2 months while I'm travelling in Europe, and probably after that as well. My intention is to implement most of the features described in the thesis.