Open SquidDev opened 6 years ago
Unboxed tuples
Evil! We should do this in a separate IR with first-class support for multiple return/result instead of piling it on Core.
However, as Lua 5.1/5.2 operate on floating points we can't guarantee they are associative.
Thankfully our semantics don't depend on Lua.
This is a collection of improvements we could make to the core IR or tools which deal with the IR. Some of these may be totally stupid, but some may be worth pursuing.
Join points
One of the problems with the current IR is that any match expression in a non-tail position is just bound to a normal let. This reduces our ability to do optimisations like match-of-match. I propose adding join points (or continuations, or basic blocks) to the IR. Namely,
gets lowered to:
Join points act identically to normal lambdas, but with several restrictions:
CotJoinApp term: unlike normal lambdas, join points are not physical values and so should not be used as one. This ensures we always know which join point we are calling, and every join point knows where it is called from; this should hopefully allow for additional optimisations.
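A minimal sketch of the lowering, assuming a toy tuple-based IR written in Python. The node shapes and the names `lower`, `jump_from_tails`, `join` and `jump` are hypothetical and are not the actual Core constructors. A `let` whose right-hand side is a `match` becomes a join point holding the `let` body, and every tail position of the `match` jumps to it:

```python
import itertools

# Toy IR encoded as tuples (hypothetical, for illustration only):
#   ("var", name) | ("lit", value)
#   ("let", name, rhs, body)
#   ("match", scrutinee, [(pattern, arm), ...])
#   ("join", label, param, join_body, rest)   declares a join point
#   ("jump", label, arg)                      calls one, in tail position only


def jump_from_tails(expr, label):
    """Rewrite every tail position of expr into a jump to label."""
    tag = expr[0]
    if tag == "match":
        _, scrutinee, arms = expr
        return ("match", scrutinee,
                [(pat, jump_from_tails(arm, label)) for pat, arm in arms])
    if tag == "let":
        _, name, rhs, body = expr
        return ("let", name, rhs, jump_from_tails(body, label))
    if tag == "join":
        # Both the inner join body and the rest end in tail position.
        _, inner, param, join_body, rest = expr
        return ("join", inner, param,
                jump_from_tails(join_body, label),
                jump_from_tails(rest, label))
    if tag == "jump":
        # Already a jump to an inner join point, whose body was rewritten above.
        return expr
    return ("jump", label, expr)


def lower(expr, labels=None):
    """Turn `let x = match ... in body` into a join point plus jumps."""
    if labels is None:
        labels = itertools.count()
    tag = expr[0]
    if tag == "let":
        _, name, rhs, body = expr
        rhs, body = lower(rhs, labels), lower(body, labels)
        if rhs[0] == "match":
            label = f"j{next(labels)}"
            return ("join", label, name, body, jump_from_tails(rhs, label))
        return ("let", name, rhs, body)
    if tag == "match":
        _, scrutinee, arms = expr
        return ("match", scrutinee,
                [(pat, lower(arm, labels)) for pat, arm in arms])
    return expr
```

Because each `jump` names its target directly, a later pass can see every call site of `j0`, which is what makes something like match-of-match straightforward.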
Unboxed tuples
The issue with the current backend is that it generates a fresh closure for every argument. This is fantastic for currying, but less optimal for efficiency of the generated code. Ideally we could compile
fun x y z -> ...
into function(x, y, z) ... end
most of the time. It may be possible to get the backend to perform this optimisation, but that seems rather flaky (and will not work well with higher-order functions). Instead I propose adding unboxed tuples to the core. This would reduce
fun x y z -> ...
into fun a -> match a with | (# x, y, z #) -> ...
Passes which do this optimisation may choose to only lower a subset of the arguments (say, if a function is often partially applied, like compose). It may also be possible to lower the type arguments to higher-order functions (such as foldr, which will never curry the function). This will be much harder to implement though. Something which would be really fancy would be converting functions which operate on records to operate on unboxed tuples instead.
Various optimisations
[x] Improved inlining: The current inliner is a little naive at times: as each application is processed separately, we end up introducing lots of junk terms. Furthermore, we may end up inlining a non-saturated function call, which may increase the size of the call. Some improvements we could make:
compose, id, etc...) can always be inlined.
[ ] Lambda lifting and lowering: This can actually apply to any pure term (literals, constructors, etc...) but this has more of a ring to it. Any term which is only used in one match arm may be "pushed down" into that arm. Similarly, any term which does not depend on the parent argument may be "pushed up" into a higher level.
If possible, we should try to avoid breaking "lambda boundaries". Namely, we shouldn't really convert
f x y -> let a = { x = x } in ...
into f x -> let a = { x = x } in y -> ...
unless we know it won't prevent other optimisations. I think lambda lifting also allows us to do loop invariant code motion "for free". That being said, there may be times when we need to generate trampolines for it to be effective. Namely, reduce
let f x y -> ... f x' y
into let f x y = (* lifted code *) let f' x' = ... in f' x
[x] Common subexpression elimination: This is pretty self-explanatory. The hard part will be determining when we should do it: you don't want to do CSE if the terms are really far apart from each other, as that just wastes locals.
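To make the "how far apart" trade-off for CSE concrete, here is a rough sketch in Python over a straight-line list of bindings. It assumes every expression is pure, every name is bound once, and expressions are hashable tuples. The function `cse` and the `max_distance` cut-off are hypothetical stand-ins for whatever heuristic we settle on, not the existing pass:

```python
def cse(bindings, max_distance=8):
    """Reuse an earlier binding when the same expression recurs nearby.

    bindings is a list of (name, expr) pairs in program order. A repeated
    expr is replaced by ("var", earlier_name) only if the earlier binding is
    at most max_distance bindings back, so a local is never kept alive over
    a long stretch just to save one recomputation.
    """
    seen = {}  # expr -> (index of the binding that defines it, its name)
    out = []
    for index, (name, expr) in enumerate(bindings):
        hit = seen.get(expr)
        if hit is not None and index - hit[0] <= max_distance:
            out.append((name, ("var", hit[1])))
        else:
            # First occurrence, or the previous one is too far away:
            # keep the term and make it the candidate for later reuse.
            seen[expr] = (index, name)
            out.append((name, expr))
    return out
```

For example, with `a = x + y`, `b = x * y`, `c = x + y`, the third binding becomes `c = a` under the default window, but is left alone when `max_distance=1`. A real pass would also need to look through the aliases it creates (`c` and `a` above) when comparing later terms.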
Backend improvements
[ ] Generate loops: Loop detection will be immensely useful for the optimiser, but more importantly we'll need to convert tail-recursive functions to use loops. Like Urn, we'll need to be careful we don't capture variables which are mutated by the loop iterator.
[ ] Tail recursion modulo cons:
this can be compiled to something like:
Obviously this requires some non-trivial code generation (and duplicates a reasonable amount of code), so it's something we're going to have to think about. It would be amazing to have, as it makes the map implementation much nicer, but it doesn't result in the nicest code.
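As a rough sketch of the idea for map, written in Python rather than as the Lua the backend would actually emit: the cons cell is allocated before the "recursive call" and its tail is filled in afterwards, which turns the recursion into a loop. The names `Cons`, `map_naive`, `map_trmc`, `from_list` and `to_list` are hypothetical, and `None` stands in for Nil:

```python
class Cons:
    """A mutable cons cell; None plays the role of Nil."""
    __slots__ = ("head", "tail")

    def __init__(self, head, tail=None):
        self.head = head
        self.tail = tail


def map_naive(f, xs):
    # Direct translation of `map f (x :: xs) = f x :: map f xs`: the
    # recursive call is not in tail position, so the stack grows with the list.
    if xs is None:
        return None
    return Cons(f(xs.head), map_naive(f, xs.tail))


def map_trmc(f, xs):
    # Tail recursion modulo cons: build each cell with an empty tail, then
    # patch the previous cell's tail to point at it.
    root = Cons(None)  # dummy cell; its tail will hold the result
    hole = root        # the cell whose tail is still waiting to be filled
    while xs is not None:
        cell = Cons(f(xs.head))
        hole.tail = cell
        hole = cell
        xs = xs.tail
    return root.tail


def from_list(items):
    """Build a cons list from a Python list."""
    result = None
    for item in reversed(items):
        result = Cons(item, result)
    return result


def to_list(xs):
    """Flatten a cons list back into a Python list."""
    out = []
    while xs is not None:
        out.append(xs.head)
        xs = xs.tail
    return out
```

The loop version relies on cells being mutable while they are under construction, which is fine for generated code as long as the half-built list never escapes before the loop finishes.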