I finally had a chance to re-read the Thorin paper and comment here. I still need to spend some time studying your IR, but I did want to leave a few thoughts in the interim:
> We do not have the same primop system as they have. While this system has quite a few advantages, it also introduces some unneeded complexity for an MVP. I propose we add this later if it is needed.
I think this is OK to leave for later; I need to compare the two to understand just how significant an effort it is to get there. But given the time we have left, I think it's a sane compromise to make. We won't have to go that long without addressing it anyway.
> Related to the above, the IR is still "basic block" based.
This is the one thing that I think is an issue. The premise of Thorin is that the IR is "blockless" and has no scopes; instead, every value/definition is a node in a graph, and all uses of those values are edges in the graph. This is the foundation of how the various transformation/optimization passes are driven. Without the graph representation, one must use name binding, and thus scopes, and take care to ensure that every transformation preserves the relationships expressed by those bindings. The graph representation also avoids the need to nest/lift functions to manage scope (e.g. block floating/sinking in SSA terms). It's not clear to me how operating at the basic block level would correspond to Thorin's semantics, as it seems to be essentially oriented around maintaining an SSA representation.
One of the key transformations Thorin relies on is partial inlining/outlining (what the Thorin paper calls lambda mangling: a single pass that combines elements of both inlining and outlining, enabling deeper optimizations). That transformation is in turn the basis upon which CFF-convertible programs are reduced to CFF form, which is then lowerable to a control-flow graph without dynamically allocated closures.
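To make the contrast concrete, here's a rough Rust sketch of what a graph-based, "blockless" representation along Thorin's lines could look like; everything here is illustrative, not eir's or Thorin's actual types:

```rust
// Sketch only: a Thorin-style "sea of nodes". Every definition is a node in a
// flat arena, and every use is an edge (a DefId stored inside another node).
// There are no scopes and no name binding; a continuation's "body" is just the
// application node it points at, and passes rewrite edges rather than rebind names.

type DefId = usize;

#[derive(Clone, Debug)]
enum Def {
    /// A literal / zeroth-order value.
    Literal(i64),
    /// A parameter of a continuation, identified by (continuation, index).
    Param { cont: DefId, index: usize },
    /// Thorin's "lambda": parameters plus a single application as its body.
    Continuation { num_params: usize, body: Option<DefId> },
    /// A primitive operation over other definitions, floating in the graph.
    PrimOp { op: &'static str, args: Vec<DefId> },
    /// Application: transfer control to `callee` with `args`.
    App { callee: DefId, args: Vec<DefId> },
}

#[derive(Default, Debug)]
struct World {
    defs: Vec<Def>,
}

impl World {
    fn push(&mut self, def: Def) -> DefId {
        self.defs.push(def);
        self.defs.len() - 1
    }

    /// The reverse edges of `target`. With def-use information like this, a
    /// pass such as lambda mangling can drop in a specialized copy of a
    /// continuation and repoint every user in one sweep, with no scopes to fix up.
    fn uses_of(&self, target: DefId) -> Vec<DefId> {
        self.defs
            .iter()
            .enumerate()
            .filter(|(_, def)| match def {
                Def::Literal(_) => false,
                Def::Param { cont, .. } => *cont == target,
                Def::Continuation { body, .. } => *body == Some(target),
                Def::PrimOp { args, .. } => args.contains(&target),
                Def::App { callee, args } => *callee == target || args.contains(&target),
            })
            .map(|(id, _)| id)
            .collect()
    }
}
```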
For the MVP, I think we can skate by with what we have right now, but big picture, getting this part of the IR implemented is really important in my opinion. Beyond the really nice optimizations that fall out of it (mutual recursion elimination, etc.), it enables a lot of the other ideas I had in mind, in particular specialization/monomorphization, which are natural steps in transforming the Thorin IR.
> We do not have a type system. While we definitely need type inference in the future, we can do a lot without it.
I still need to review your IR in detail, but I imagine we can at least introduce the concepts of zeroth-order values (literals/non-function terms), first-order functions, and higher-order functions. That's all that is really needed for an IR like Thorin; it gives us a clear path to embellish with richer types later, once we are able to implement them, and it's something we can derive with very minimal inference on top of the AST we get from the Erlang/Core frontends. That said, I agree we can get decently far without it initially.
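To sketch what I mean, a classification along these lines (Rust here; the names are hypothetical, not eir's actual types) is likely all the "type system" the IR needs at first:

```rust
// Sketch only: the minimal order-based classification described above. The
// order of a value is what decides whether a function can eventually be
// lowered to plain control flow (first-order) or needs a closure (higher-order).

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Order {
    /// Literals and other non-function terms.
    Zeroth,
    /// Functions whose parameters are all zeroth-order.
    First,
    /// Functions that take other functions as parameters.
    Higher,
}

#[derive(Clone, Debug)]
enum Ty {
    /// Any zeroth-order Erlang term.
    Term,
    /// A function, described only by its parameter types (CPS-style, no return type).
    Fun(Vec<Ty>),
}

fn order(ty: &Ty) -> Order {
    match ty {
        Ty::Term => Order::Zeroth,
        Ty::Fun(params) => {
            if params.iter().all(|p| order(p) == Order::Zeroth) {
                Order::First
            } else {
                Order::Higher
            }
        }
    }
}
```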
I'll follow up with more notes once I've had a chance to read through the eir source in earnest.
> This is the one thing that I think is an issue. The premise of Thorin is that the IR is "blockless" and has no scopes; instead, every value/definition is a node in a graph, and all uses of those values are edges in the graph. This is the foundation of how the various transformation/optimization passes are driven. Without the graph representation, one must use name binding, and thus scopes, and take care to ensure that every transformation preserves the relationships expressed by those bindings. The graph representation also avoids the need to nest/lift functions to manage scope (e.g. block floating/sinking in SSA terms). It's not clear to me how operating at the basic block level would correspond to Thorin's semantics, as it seems to be essentially oriented around maintaining an SSA representation.
I probably should have explained more about what I meant here.
There would still be basic blocks in the sense that the smallest unit of execution is a list of operations, where entry is only allowed at the start and only the last operation is allowed to perform control flow. The operations that would normally be PrimOps in Thorin, simply floating in the graph, would need to be statically scheduled for us (at least until we introduce a PrimOp system). This would be done as non-terminal entries in basic blocks. Examples of such operations could be constructing a term in a way that can't fail, or performing a test whose result is branched on by a terminal operation.
When I say basic block, I mean it in the classic SSA sense; there is no nesting involved. Control flow between basic blocks and references to values are still absolutely represented as a graph; there is no name binding or scoping involved here.
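Concretely, the shape I have in mind is roughly this (a Rust sketch; all names are illustrative, not eir's actual types):

```rust
// Sketch only: a block is a straight-line list of non-branching operations
// plus exactly one terminator. Value references and block targets are plain
// graph edges (indices), not names, so there is still no binding or scoping.

type Value = usize;
type BlockId = usize;

#[derive(Clone, Debug)]
enum Op {
    /// An operation that cannot fail, e.g. constructing a tuple from values.
    MakeTuple { result: Value, elements: Vec<Value> },
    /// A test whose result is only consumed by a terminator.
    IsSmaller { result: Value, lhs: Value, rhs: Value },
}

#[derive(Clone, Debug)]
enum Terminator {
    /// Direct control flow to another block, passing arguments to its params.
    Branch { target: BlockId, args: Vec<Value> },
    /// Two-way branch on a previously computed test.
    BranchIf { cond: Value, then_blk: BlockId, else_blk: BlockId },
    /// External function call; `ok` and `err` are the continuation blocks.
    Call { callee: Value, args: Vec<Value>, ok: BlockId, err: BlockId },
}

#[derive(Clone, Debug)]
struct Block {
    /// Block parameters play the role of SSA phi nodes.
    params: Vec<Value>,
    /// Non-terminal entries: the statically scheduled, primop-like operations.
    ops: Vec<Op>,
    /// Only the last operation is allowed to perform control flow.
    term: Terminator,
}
```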
These changes would make lambda mangling fully representable in this IR, even if the semantics around the PrimOp system are different, since that is not actually a vital part of performing that transformation.
> I still need to review your IR in detail, but I imagine we can at least introduce the concepts of zeroth-order values (literals/non-function terms), first-order functions, and higher-order functions. That's all that is really needed for an IR like Thorin; it gives us a clear path to embellish with richer types later, once we are able to implement them, and it's something we can derive with very minimal inference on top of the AST we get from the Erlang/Core frontends. That said, I agree we can get decently far without it initially.
Fully agree we will need to add this eventually, but I don't believe it's needed for the initial milestone. I think it should probably be one of the first things we look at once we have the initial prototype up and running.
This actually requires a fairly small number of changes, and we end up with a much more elegant representation. It even fixes the one thing about the current semantics that I find really bad, namely that only one SSA write is valid for each branch when doing a function call.
Changes:
At this point, doing things like inlining functions would be as easy as copying in the IR of the other function and changing the external call to a BB call, then running lambda mangling/optimization passes to remove the noise.
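Roughly, reusing the hypothetical Block/Op/Terminator shapes from the sketch above (again illustrative, not eir's actual API):

```rust
// Sketch only: inline by copying the callee's blocks into the caller and
// rewriting the external Call into a plain Branch to the copied entry block.
// Wiring the callee's return continuation back to the original `ok`/`err`
// blocks, plus cleanup of dead args and redundant branches, is left to
// lambda mangling and the usual optimization passes afterwards.

fn inline_call(
    caller_blocks: &mut Vec<Block>,
    call_site: BlockId,
    callee_blocks: &[Block],
    callee_entry: BlockId,
) {
    // 1. Copy the callee's blocks in, shifting their block targets so they
    //    point into the caller's block list. (Value remapping is elided.)
    let offset = caller_blocks.len();
    for blk in callee_blocks {
        caller_blocks.push(retarget(blk, offset));
    }

    // 2. Turn the external call into block-level control flow.
    let new_term = match &caller_blocks[call_site].term {
        Terminator::Call { args, .. } => Some(Terminator::Branch {
            target: offset + callee_entry,
            args: args.clone(),
        }),
        _ => None,
    };
    if let Some(term) = new_term {
        caller_blocks[call_site].term = term;
    }
}

/// Copy a block, shifting every block target it references by `offset`.
fn retarget(blk: &Block, offset: usize) -> Block {
    let term = match blk.term.clone() {
        Terminator::Branch { target, args } => Terminator::Branch {
            target: target + offset,
            args,
        },
        Terminator::BranchIf { cond, then_blk, else_blk } => Terminator::BranchIf {
            cond,
            then_blk: then_blk + offset,
            else_blk: else_blk + offset,
        },
        Terminator::Call { callee, args, ok, err } => Terminator::Call {
            callee,
            args,
            ok: ok + offset,
            err: err + offset,
        },
    };
    Block {
        params: blk.params.clone(),
        ops: blk.ops.clone(),
        term,
    }
}
```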
There are also a few notable differences from Thorin: