jdonszelmann opened 9 months ago
Here's my take on the compiler pipeline. After considering potential features like metaprogramming, I've come up with a simple, fairly parallelizable structure for the compiler, relying on inter-item dependencies.
The core of the compiler is a work-stealing multi-threaded scheduler. It is assigned tasks, which may depend upon one another, and it is responsible for allocating them to threads and evaluating them as soon as possible. The results of various tasks may be cached, even across compilations.
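As a rough single-threaded sketch of the caching half of this idea (the real scheduler would be work-stealing and multi-threaded; the task names and the `u64` "result" are stand-ins):

```rust
use std::collections::HashMap;

// Hypothetical task id; real ids would name items, modules, crates, etc.
type TaskId = &'static str;

struct Scheduler {
    deps: HashMap<TaskId, Vec<TaskId>>,
    cache: HashMap<TaskId, u64>, // cached results could persist across compilations
}

impl Scheduler {
    // Evaluate a task after its dependencies, memoizing the result so
    // repeated requests (or later compilations) skip the work.
    fn eval(&mut self, id: TaskId) -> u64 {
        if let Some(&r) = self.cache.get(id) {
            return r; // cache hit
        }
        let deps = self.deps.get(id).cloned().unwrap_or_default();
        let dep_sum: u64 = deps.into_iter().map(|d| self.eval(d)).sum();
        let result = dep_sum + id.len() as u64; // stand-in for real work
        self.cache.insert(id, result);
        result
    }
}

fn main() {
    let mut s = Scheduler {
        deps: HashMap::from([("crate", vec!["module"]), ("module", vec![])]),
        cache: HashMap::new(),
    };
    println!("crate task result: {}", s.eval("crate"));
}
```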
The following tasks are defined:
- Crate compilation. The manifest (a `Cargo.toml` equivalent) is loaded.
- Module compilation.
- Function compilation.
- Constant compilation.
- Type compilation.
- Function IR resolution.
- Module type resolution.
- Function type resolution.
- Constant type resolution.
- Type resolution.
- Module API resolution.
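Sketched as a plain enum (the variant names just transcribe the list above; how much data each variant carries is still open):

```rust
// One variant per task kind from the list above.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
enum Task {
    CrateCompilation,
    ModuleCompilation,
    FunctionCompilation,
    ConstantCompilation,
    TypeCompilation,
    FunctionIrResolution,
    ModuleTypeResolution,
    FunctionTypeResolution,
    ConstantTypeResolution,
    TypeResolution,
    ModuleApiResolution,
}

fn main() {
    // The scheduler would key its graph and cache on values like these.
    println!("{:?}", Task::ModuleApiResolution);
}
```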
The scheduler effectively maintains a directed acyclic graph of tasks. A cycle between tasks indicates a circular dependency that would prevent compilation; this is a recoverable error as other tasks can still be evaluated.
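Detecting such a cycle is a standard DFS over the dependency graph; a minimal sketch (string task ids are illustrative) that reports a cycle without aborting anything else:

```rust
use std::collections::HashMap;

// Detect a dependency cycle reachable from `start` via DFS.
// A `true` result would be surfaced as a recoverable circular-dependency
// error; unrelated tasks can still be evaluated.
fn has_cycle<'a>(deps: &HashMap<&'a str, Vec<&'a str>>, start: &'a str) -> bool {
    fn visit<'a>(
        n: &'a str,
        deps: &HashMap<&'a str, Vec<&'a str>>,
        visiting: &mut Vec<&'a str>, // grey: on the current DFS path
        done: &mut Vec<&'a str>,     // black: fully explored
    ) -> bool {
        if done.contains(&n) {
            return false;
        }
        if visiting.contains(&n) {
            return true; // back edge: cycle found
        }
        visiting.push(n);
        for &d in deps.get(n).map(|v| v.as_slice()).unwrap_or(&[]) {
            if visit(d, deps, visiting, done) {
                return true;
            }
        }
        visiting.retain(|x| *x != n);
        done.push(n);
        false
    }
    visit(start, deps, &mut Vec::new(), &mut Vec::new())
}

fn main() {
    let cyclic = HashMap::from([("a", vec!["b"]), ("b", vec!["a"])]);
    println!("cycle: {}", has_cycle(&cyclic, "a"));
}
```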
Not all the dependencies of a task can be resolved immediately; name resolution has to be accomplished first. But all that is required is delegation to another module: resolving `a::b::c`, for instance, requires resolving the API of `a` using local information, then launching a task for resolving `b::c` within it. Doing this for every reference in the item will yield the complete list of dependencies.
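The delegation step can be sketched like this, with a toy module API as a map (the `Entry` type and the numeric "values" are hypothetical; in the compiler the recursive call would instead spawn a new resolution task):

```rust
use std::collections::HashMap;

// A module's API maps names either to an item or to a child module.
enum Entry<'a> {
    Value(u32),
    Module(&'a HashMap<&'a str, Entry<'a>>),
}

// Resolve the first path segment locally, then delegate the rest of the
// path to the module it names, mirroring `a::b::c` -> API of `a` -> `b::c`.
fn resolve(api: &HashMap<&str, Entry>, path: &[&str]) -> Option<u32> {
    let (first, rest) = path.split_first()?;
    match (api.get(first)?, rest) {
        (Entry::Value(v), []) => Some(*v), // final segment reached
        (Entry::Module(m), rest) if !rest.is_empty() => resolve(m, rest), // delegate
        _ => None, // e.g. a value with trailing path segments
    }
}

fn main() {
    let b_api = HashMap::from([("c", Entry::Value(7))]);
    let a_api = HashMap::from([("b", Entry::Module(&b_api))]);
    println!("{:?}", resolve(&a_api, &["b", "c"]));
}
```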
To process tasks in parallel, the scheduler simply has to select tasks that are "far apart" from one another (i.e. only their furthest descendants are shared); this minimizes inter-thread communication. It may be ideal for each compiler thread to have its own graph of tasks, which is occasionally synchronized with, or stolen from by, the others.
Performance analysis will rely heavily on inspecting the scheduler; it must offer a way to extract things like flame graphs.
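The minimum the scheduler would need to record is a span per task; folding those spans by task path is exactly what flame-graph tooling consumes. A sketch (the `Span` shape and task names are made up):

```rust
use std::time::Instant;

// One timing record per evaluated task; a real scheduler would also
// record thread id and parent task to reconstruct the flame graph.
struct Span {
    task: &'static str,
    nanos: u128,
}

// Run `f`, appending its wall-clock duration to `log`.
fn timed<T>(task: &'static str, log: &mut Vec<Span>, f: impl FnOnce() -> T) -> T {
    let start = Instant::now();
    let out = f();
    log.push(Span { task, nanos: start.elapsed().as_nanos() });
    out
}

fn main() {
    let mut log = Vec::new();
    let sum = timed("function_compilation", &mut log, || (1u32..=100).sum::<u32>());
    println!("{} = {}, took {} ns", log[0].task, sum, log[0].nanos);
}
```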
After discussion with @tertsdiepraam, we've decided to move the majority of the name resolution to module API resolution, rather than item resolution. We can then use stack graphs to merge and cache name resolution between files. This requires a bit of care, because things like associated types and struct fields cannot necessarily be resolved immediately (due to metaprogramming); some degree of lazy resolution is required. But this can be tracked in the stack graphs structure without additional work. I think we need to investigate exactly how much data the stack graphs structure can hold for us.
@tertsdiepraam and I discussed this for a while, looking at the pipelines of rustc and roc, among others, and think this is the best way forward. We can parse in two ways: with a very complicated parser that keeps comments around, into a CST, or quickly into an AST. From the CST we can go back to the source, which is nice for autoformatting. The CST can trivially be turned into an AST by removing the comments etc.
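A minimal sketch of the CST-vs-AST distinction, assuming the CST stores trivia (comments and whitespace) on each token (the `CstToken` shape is illustrative, not a design):

```rust
// A CST token keeps the trivia preceding it, so the exact source text
// can be reproduced; the AST view simply drops that trivia.
struct CstToken {
    leading_trivia: String, // comments/whitespace before this token
    text: String,
}

// Lossless: concatenating trivia + text recovers the original source.
fn to_source(cst: &[CstToken]) -> String {
    cst.iter().map(|t| format!("{}{}", t.leading_trivia, t.text)).collect()
}

// Lossy: the AST-level token stream, with comments and layout removed.
fn to_ast_tokens(cst: &[CstToken]) -> Vec<&str> {
    cst.iter().map(|t| t.text.as_str()).collect()
}

fn main() {
    let cst = vec![
        CstToken { leading_trivia: "// doc\n".into(), text: "fn".into() },
        CstToken { leading_trivia: " ".into(), text: "main".into() },
    ];
    println!("{:?}", to_source(&cst));
}
```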
Quickly, we go from an AST to a DST, the desugared syntax tree. At this stage, constructs like `if let` (if we have it) can become `match`, `for` loops become `while` loops, etc., making typechecking easier; that's what rustc also does. Then comes typechecking into a typed syntax tree, which is flattened into an intermediate representation (IR). On that we can run passes to optimise and do things like dead code elimination. The IR is just basic blocks and jumps, to make the eventual translation to LLVM easier. At that point we can also write an interpreter for the IR, making testing easy.
The fuzzing story is as follows. We can pretty-print the AST. This does remove comments etc., but we can still do a roundtrip check: replace any sequence of whitespace in both the input string and the `input --> parsed --> prettyprinted` string with a single space and check that the two are equal.
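The whitespace-insensitive comparison at the heart of that oracle is small; a sketch (assuming, for now, a comment-free input so only whitespace differs):

```rust
// Collapse every run of whitespace to a single space, so layout
// differences between input and pretty-printed output are ignored.
fn normalize(s: &str) -> String {
    s.split_whitespace().collect::<Vec<_>>().join(" ")
}

// The fuzzing oracle: input and `input --> parsed --> prettyprinted`
// must agree modulo whitespace.
fn roundtrip_ok(input: &str, prettyprinted: &str) -> bool {
    normalize(input) == normalize(prettyprinted)
}

fn main() {
    println!("{}", roundtrip_ok("fn  main()\n{}", "fn main() {}"));
}
```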