Open aykevl opened 2 years ago
Another important issue is `interp`, where we may need to rethink a bit.
@niaow yes. We currently run interp once per package and then again for the whole program. I imagine an initial implementation of this feature would be opt-in and only run interp per package (not for the whole program) which should work in practice with some increase in binary size. We can then look into improving this. I didn't include it in the list as it isn't a true blocker like most of the other items are.
ThinLTO is now supported on all architectures/platforms! :tada: That's one more checkbox checked.
The reflect refactor is in :tada:
Managed to run some test programs with a new `-lto=thin` flag! See: https://github.com/tinygo-org/tinygo/pull/3489
The next hurdle is refactoring interface type asserts and interface method calls, which is something that will likely be necessary for full reflect support anyway (to implement things like `.Method(n)`).
After #285, I'd like to go one step further by compiling packages entirely separately and doing optimizations across packages using ThinLTO (or, optionally, full LTO if desired). The main benefit is that compilation should be a lot faster, both with a cold cache (by parallelizing codegen) and after small changes to the source code (by reusing most packages). We should be able to get close to the speed of the `go` toolchain: TinyGo is currently a lot slower.

How we currently compile packages is as follows:
What I'd like to see:
This means there is no phase in which all IR is combined into one big module, which avoids the serial step that currently takes up most of the compile time.
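To make the intended build shape a bit more concrete, here is a minimal sketch in Go. It is not TinyGo's actual builder: `pkg`, `objectCache`, `compile` and `buildAll` are hypothetical names, and the "codegen" step is a placeholder string. It only illustrates the two properties described above: packages are compiled independently (so codegen can run in parallel), and an unchanged package is served from a cache keyed by a hash of its inputs.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sync"
)

// pkg is a hypothetical stand-in for one Go package to compile.
type pkg struct {
	path   string
	source string
}

// objectCache is a hypothetical in-memory cache that maps a hash of the
// package inputs to its compiled object (a placeholder string here).
type objectCache struct {
	mu      sync.Mutex
	objects map[string]string
	misses  int // number of times codegen actually had to run
}

// hashSource derives the cache key from the package path and source.
func hashSource(p pkg) string {
	sum := sha256.Sum256([]byte(p.path + "\x00" + p.source))
	return hex.EncodeToString(sum[:])
}

// compile returns the cached object for p, or runs the (placeholder)
// codegen step and stores the result.
func (c *objectCache) compile(p pkg) string {
	key := hashSource(p)

	c.mu.Lock()
	obj, ok := c.objects[key]
	c.mu.Unlock()
	if ok {
		return obj
	}

	// Placeholder for per-package codegen; a real build would emit
	// a bitcode or object file here.
	obj = "obj(" + p.path + ":" + key[:8] + ")"

	c.mu.Lock()
	c.objects[key] = obj
	c.misses++
	c.mu.Unlock()
	return obj
}

// buildAll compiles every package in its own goroutine. There is no step
// that merges all packages into one module before codegen.
func buildAll(c *objectCache, pkgs []pkg) []string {
	objs := make([]string, len(pkgs))
	var wg sync.WaitGroup
	for i, p := range pkgs {
		wg.Add(1)
		go func(i int, p pkg) {
			defer wg.Done()
			objs[i] = c.compile(p) // each goroutine writes its own slot
		}(i, p)
	}
	wg.Wait()
	return objs
}

func main() {
	cache := &objectCache{objects: map[string]string{}}
	pkgs := []pkg{
		{path: "runtime", source: "package runtime"},
		{path: "machine", source: "package machine"},
		{path: "main", source: "package main"},
	}

	buildAll(cache, pkgs)
	fmt.Println("codegen runs after cold build:", cache.misses)

	// Change only one package and rebuild: the others come from the cache.
	pkgs[2].source = "package main // edited"
	buildAll(cache, pkgs)
	fmt.Println("codegen runs after small edit:", cache.misses)
}
```

In this sketch the cross-package optimization step is left out entirely; in the proposal above that part would be handled by ThinLTO at link time rather than by merging IR up front.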
This is no small task. We currently rely heavily on merging all packages together to perform some (required) optimization passes. These will need to be changed in some way to work well with LTO, by modifying them or replacing them with something else:
- `AddGlobalsBitmap` pass to be able to scan global variables in the GC mark phase. It should be possible to convert this to simply scanning the `.data`/`.bss` sections everywhere (see #2867, #2869 for example).
- `MakeGCStackSlots` pass. We need to make this pass run per package. In the future, the WebAssembly GC would be an alternative.
- `LowerReflect`. I've been working on a replacement in #2640 but it's going to cost something. In return, the compiler itself becomes easier to understand and new reflect features are easier to add.
- `LowerInterfaces`. We probably need to switch to vtable style interfaces. The optimizations that we currently do might be replaced by LLVM support for whole program devirtualization for C++.
- `LowerInterrupts`. This is done late so that unused interrupts can be optimized away. I'm not sure how to do this efficiently in any way other than at this stage.

Of course, the resulting binaries should remain small. It's hard to avoid a slight increase, but hopefully the benefits of a simpler compiler and (much) faster compile times outweigh the downsides.
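For readers unfamiliar with the term, the "vtable style interfaces" mentioned for `LowerInterfaces` can be modelled roughly as below. This is only an illustrative model written in ordinary Go, not the representation TinyGo uses or would use: `methodTable`, `ifaceValue`, `call` and `is` are hypothetical names. The idea is that an interface value carries a pointer to a per-type table of function pointers, so a method call is an indexed load plus an indirect call, and a type assert is a pointer comparison. Neither needs knowledge of the whole program.

```go
package main

import "fmt"

// methodTable is a hypothetical per-type table of function pointers.
type methodTable struct {
	typeName string
	methods  []func(receiver any) string
}

// ifaceValue models an interface value as a (table, data) pair.
type ifaceValue struct {
	table *methodTable
	data  any
}

// methodString is the slot index of the String method. The slot is fixed
// per interface method, so every implementing type uses the same index.
const methodString = 0

type celsius float64
type label string

var celsiusTable = &methodTable{
	typeName: "celsius",
	methods: []func(receiver any) string{
		methodString: func(r any) string {
			return fmt.Sprintf("%.1f C", float64(r.(celsius)))
		},
	},
}

var labelTable = &methodTable{
	typeName: "label",
	methods: []func(receiver any) string{
		methodString: func(r any) string {
			return "label:" + string(r.(label))
		},
	},
}

// call performs an interface method call: index the table, call indirectly.
func (v ifaceValue) call(slot int) string {
	return v.table.methods[slot](v.data)
}

// is models a type assert to a concrete type as a table pointer comparison.
func (v ifaceValue) is(t *methodTable) bool {
	return v.table == t
}

func main() {
	values := []ifaceValue{
		{table: celsiusTable, data: celsius(21.5)},
		{table: labelTable, data: label("sensor")},
	}
	for _, v := range values {
		fmt.Println(v.table.typeName, v.call(methodString), v.is(celsiusTable))
	}
}
```

The trade-off hinted at above shows up here too: every table is referenced from the value, so unused methods are harder to drop without something like whole program devirtualization.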