firedrakeproject / firedrake

Firedrake is an automated system for the portable solution of partial differential equations using the finite element method (FEM)
https://firedrakeproject.org
Other
506 stars 159 forks source link

Optimise caching for Slate optimisation passes #2265

Open sv2518 opened 2 years ago

sv2518 commented 2 years ago

I have recently introduced a few compiler passes in the Slate compiler. For every optimisation there is a new compiler pass (a new Memoizer). The original expression between the passes is the same but how different nodes are treated changes between the compiler passes. Also, the result of one optimisation pass is passed into the next, so there is some dependency between the passes. Nevertheless, I think there might be an opportunity to cache parts of the DAG of the expression in one optimisation pass and reuse it in another? I still need to think about an example where this would be possible, but my question is: would we have any means to do something like "caching across Memoizers"?

wence- commented 2 years ago

One question: have you measured to check if this is actually a time-critical part of the compiler?

This is actually, I think, quite a deep question which I mull about every so often. As long as the passes always act either "parent before child" or "child before parent" uniformly, then we can think of each application of the pass as a kind of generalised left or right fold over the datastructure. The functional programming crowd calls these "catamorphisms", the OO design pattern is effectively the visitor pattern (although we don't modify objects in place).

Here's a paper discussing these traversal patterns: https://www.microsoft.com/en-us/research/publication/scrap-your-boilerplate-a-practical-approach-to-generic-programming/

So the question might be, when is the composition of two catamorphisms legit? Here's some stackoverflow discussion: https://stackoverflow.com/questions/12103309/when-is-a-composition-of-catamorphisms-a-catamorphism

I've tried to see if there's discussion/models of the composition of these kinds of traversal patterns in the literature, but ran into a bit of a brick wall. It might be worth pursuing.

sv2518 commented 2 years ago

I guess it might be worth pursuing if this is in fact actually a time-critical part of the compiler, which I have not checked yet. This was just a thought I had today when I have introduced yet another optimisation pass in the Slate compiler. I think Slate expressions used to be not too deep in general, but the e.g. the expression for the nesting of Schur complements here https://github.com/firedrakeproject/firedrake/blob/51378759ac84047d088069bfabe0f81f4b14ac5e/firedrake/slate/static_condensation/hybridization.py#L374 is one which is comparatively bit, so walking through that expression multiple times might be significant. I can look into how expensive these passes are and do some measurements when I have got some time to spare (not this week).