Restructure how functions are handled in the IR and codegen to eliminate function inlining

emilyaherbert commented 2 years ago

Once #1824 goes in, we will have the option to support recursive functions. But, in order to do so, we will need to make changes to the IR and codegen.

Blocked by:

[x] #1824

The IR would like to organise a program into a tree: instructions gathered into blocks, blocks into functions, and functions into modules. Having distinct functions and all their metadata available to the IR would therefore be beneficial both structurally and analytically.

To the best of my knowledge, the debugger is mostly interested in attributing spans to instructions, but being able to resolve full paths of symbols to instructions is important for setting breakpoints (or inspecting memory values). Without absolute names for functions, this becomes impossible. @Dentosal knows more.
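As a rough illustration of the tree described above, here is a minimal Rust sketch. All type and function names (`Module`, `Function`, `Block`, `Instruction`, `find_function`, `instruction_at`, `example_module`) are hypothetical and are not the actual Sway IR types; the point is only that keeping functions as distinct nodes with absolute paths makes it possible to resolve a symbol path to a concrete instruction.

```rust
// Hypothetical, simplified shapes for illustration; not the actual Sway IR types.
#[derive(Debug)]
pub enum Instruction {
    Const(u64),
    Call { callee_path: String },
    Ret,
}

#[derive(Debug)]
pub struct Block {
    pub label: String,
    pub instructions: Vec<Instruction>,
}

#[derive(Debug)]
pub struct Function {
    /// Absolute path of the function, one segment per element.
    pub path: Vec<String>,
    pub blocks: Vec<Block>,
}

#[derive(Debug)]
pub struct Module {
    pub name: String,
    pub functions: Vec<Function>,
}

impl Module {
    /// Look a function up by its absolute path, e.g. "my_lib::math::answer".
    pub fn find_function(&self, full_path: &str) -> Option<&Function> {
        self.functions.iter().find(|f| f.path.join("::") == full_path)
    }

    /// Resolve (function path, block label, index) to a single instruction,
    /// which is roughly what a debugger needs in order to place a breakpoint.
    pub fn instruction_at(&self, full_path: &str, block: &str, index: usize) -> Option<&Instruction> {
        self.find_function(full_path)?
            .blocks
            .iter()
            .find(|b| b.label == block)?
            .instructions
            .get(index)
    }
}

/// Build a tiny module containing one function with one block.
pub fn example_module() -> Module {
    Module {
        name: "my_lib".to_string(),
        functions: vec![Function {
            path: vec!["my_lib".into(), "math".into(), "answer".into()],
            blocks: vec![Block {
                label: "entry".to_string(),
                instructions: vec![Instruction::Const(42), Instruction::Ret],
            }],
        }],
    }
}
```

If functions are inlined away before this stage, there is no `Function` node left to look up, which is the structural problem the comment above points at.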

Originally posted by @otrho in https://github.com/FuelLabs/sway/issues/1557#issuecomment-1127188087

otrho commented 2 years ago

The hardest part of removing inlining (and supporting recursion) is supporting actual function calls. I can't find an issue for this at the moment, though.

Function calls are typically done with a call/ret mechanism, where ret is essentially a 'pop an arbitrary value off the stack and jump to it'.

The Fuel VM doesn't support arbitrary jumps, so to get around this the current proposal is to use a 'jump table'. Each 'call' would include a token describing where the function should 'return' to, and the function would look that token up in a static local table.

This may end up less efficient, as popular functions would have a very large return table. But it could also prove more efficient, in that continuation-passing style (CPS) call chains could be generated and tail call optimisations are pretty much built in.

Another downside to using a return table is that library code can never be precompiled and included verbatim (i.e., linked) with user code, as the return table for a library function depends on where it is called from, which of course varies per program.
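The return-table idea above can be sketched as a small state machine in Rust. This is only a model under stated assumptions, not Fuel VM code: the `Label` enum, the `run` function, and the callee `Double` are all hypothetical, and "jumping" is modelled by assigning a statically known label to `pc`. The callee never jumps to an arbitrary address; it maps the token it was given to one of a fixed set of return labels.

```rust
/// Every jump target in this toy program is a statically known label.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Label {
    Start,
    Double,          // entry point of the hypothetical callee `double`
    AfterFirstCall,  // return site for call-site token 0
    AfterSecondCall, // return site for call-site token 1
    Done,
}

/// Runs `double(double(input) + 1)` and records the labels visited.
pub fn run(input: u64) -> (u64, Vec<Label>) {
    let mut pc = Label::Start;
    let mut arg = 0u64;         // argument register
    let mut ret_token = 0usize; // token naming the call site to return to
    let mut result = 0u64;      // return-value register
    let mut trace = Vec::new();
    loop {
        trace.push(pc);
        pc = match pc {
            Label::Start => {
                arg = input;
                ret_token = 0; // "call" from the first call site
                Label::Double
            }
            Label::Double => {
                result = arg * 2;
                // The static return table local to `double`:
                // one entry per call site in the whole program.
                match ret_token {
                    0 => Label::AfterFirstCall,
                    1 => Label::AfterSecondCall,
                    _ => unreachable!("unknown return token"),
                }
            }
            Label::AfterFirstCall => {
                arg = result + 1;
                ret_token = 1; // "call" from the second call site
                Label::Double
            }
            Label::AfterSecondCall => Label::Done,
            Label::Done => return (result, trace),
        };
    }
}
```

The inner `match` on `ret_token` is the return table. It grows by one arm for every call site of `double`, and its contents depend on the calling program, which illustrates both downsides mentioned above: large tables for popular functions, and library code that cannot be compiled independently of its callers.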

mohammadfawaz commented 2 years ago

Just jotting down some things here to make sure I understand the full picture correctly:

Inlining has been used in our conversations to mean two different things:

1. Inlining the function body in the FunctionApplication AST node itself. This is not what the term "inlining" is usually used for I think.
2. Inlining in IR (inliner pass) where the function body is actually moved and replaces the call instruction in IR. (call here is not a contract call... the contract call instruction in IR is contract_call and, of course, no inlining happens there).

We want to avoid (1) by creating a declaration engine that avoids making copies of the function body everywhere it is called. This would be a cleaner and lighter design and enables a much easier solution for the trait constraints problem.
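To make the declaration-engine idea concrete, here is a minimal Rust sketch. It is an assumption about the general shape only: `DeclEngine`, `DeclId`, `FunctionDecl`, and the simplified `FunctionApplication` below are hypothetical names and do not describe the actual compiler design. Each function body is stored once, and call nodes hold a small copyable ID instead of a copy of the body.

```rust
/// Opaque handle to a stored declaration (hypothetical).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct DeclId(usize);

#[derive(Debug)]
pub struct FunctionDecl {
    pub name: String,
    /// Stand-in for a real typed function body.
    pub body: Vec<String>,
}

/// Owns every function declaration exactly once.
#[derive(Default)]
pub struct DeclEngine {
    functions: Vec<FunctionDecl>,
}

impl DeclEngine {
    /// Store a declaration and hand back its ID.
    pub fn insert(&mut self, decl: FunctionDecl) -> DeclId {
        let id = DeclId(self.functions.len());
        self.functions.push(decl);
        id
    }

    /// Resolve an ID back to the single stored declaration.
    pub fn get(&self, id: DeclId) -> &FunctionDecl {
        &self.functions[id.0]
    }

    pub fn decl_count(&self) -> usize {
        self.functions.len()
    }
}

/// A call site refers to its callee by ID rather than embedding the body.
#[derive(Debug)]
pub struct FunctionApplication {
    pub callee: DeclId,
    pub arguments: Vec<u64>,
}
```

With this shape, a function called a thousand times still has one stored body, and any later analysis (such as checking trait constraints) can work on that single declaration.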

We also want to be more flexible with (2) and not have to inline everything (inlining in the traditional sense). Some functions will still be worth inlining (some heuristic can be introduced here) and we should probably have an #[always_inline] annotation at some point that a user can apply to force inlining. Introducing this flexibility requires all the stuff that Toby is talking about above.
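One possible shape for such a heuristic, sketched in Rust. Everything here is an assumption made for illustration: the `FunctionInfo` fields, the `should_inline` function, the size threshold, and the choice to never inline recursive functions even when annotated (a real compiler might report an error in that case instead).

```rust
/// Facts an inliner pass might know about a function (hypothetical).
pub struct FunctionInfo {
    pub instruction_count: usize,
    pub call_sites: usize,
    /// Set when the user wrote the proposed #[always_inline] annotation.
    pub always_inline: bool,
    pub is_recursive: bool,
}

/// Decide whether calls to `f` should be replaced by its body.
pub fn should_inline(f: &FunctionInfo, max_size: usize) -> bool {
    // Assumption: a recursive function cannot be fully inlined,
    // so it always keeps a real call.
    if f.is_recursive {
        return false;
    }
    // The annotation overrides the size heuristic.
    if f.always_inline {
        return true;
    }
    // Inline when it cannot grow the code (single call site)
    // or when the body is small.
    f.call_sites == 1 || f.instruction_count <= max_size
}
```

Any function for which this returns `false` needs a real call mechanism, which is why the flexibility depends on the call/return work discussed above.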

Am I getting this right?

Restructure how functions are handled in the IR and codegen to eliminate function inlining #1823