Open tripleo1 opened 2 years ago
I'm not sure how much detail you want.
In the bxl
variant (xl2 subtree), the parse tree would be progressively rewritten using translate
statements, until it was lowered to a "bytecode" that would then be mapped to C code using the file xl2/native/library/runtime/C/xl.bytecode
(another similar file would generate Java code). The generated code would then be fed to a C compiler, with the help of xl_lib.h
to provide some basic support functions.
In the fast
branch, an early dispatch will instantiate various implementations of the Evaluator
class based on the optimization level. After that, parsing will be common to all implementation, sharing the same parse tree, and then passing it to the selected Evaluate
function.
The interpreter (-O0) will simply rewrite the parse tree progressively in order to evaluate the tree.
The bytecode evaluator (-O1) is a compiler to an intermediate "bytecode" representation which is designed for fast interpretation. I'm presently focusing on that implementation because tracking LLVM changes was taking most of my very limited development time. See below what the bytecode looks like. So there is a "real" basic compiler, but it's very incomplete at this point.
The "fast compiler" (-O2) is a port of the XL compiler as it existed in Tao3D. Unfortunately, there are a number of issues with it (starting with a very bad type system), and I'm not really looking at it.
The LLVM compiler (-O3) was attempting to use Haskell-style type inference and from there generate machine-level code. I've put that on hold after realizing how futile it was to try and track LLVM incompatible API respins. At this point, it supports LLVM up to 10 (I believe), but fails on "Fibonacci" on an assert regarding internal low-level types (Assertion failed: (!boxed[base] || boxed[base] == mtype), function AddBoxedType, file compiler-types.cpp, line 152
). Essentially, the type inference fails to correctly deduce consistent integer
types. If you are interested, I can explain how it's supposed to work, but for now, I'm not spending much time on this one.
Back to the bytecode one, if you run ./xl -O1 -B f.xl
with f.xl
containing:
fib 0 is 1
fib 1 is 1
fib N is (fib(N-1) + fib(N-2))
fib 6
then you should get the following bytecode output, slightly commented with //
:
// Compiles whole declaration scopes, here `builtins.xl`, as a `get_scope` that returns the scope
Compiled [X:natural + Y:natural as natur…as tree is native "xl_listen"] as
Locals:
L0 = [X:natural + Y:natural as natur…as tree is native "xl_listen"]
Opcodes (1 locals, 0 parameters, 1 temporaries)
0: get_scope L0
2: ret L0
// Compilation of several basic types in order to evaluate them during type check
Compiled [natural] is [natural] as
Constants: // The bytecode contains a table of constants
C0 = natural
Types: // This is the type information gathered during compilation
natural as name
natural as nil // This means "no type found" (a bug, I'm working on it)
Opcodes (0 locals, 0 parameters)
// A type is evaluated as a constant tree value with the type name (ensures unicity)
0: ret C0
Compiled [integer] is [integer] as
Constants:
C0 = integer
Types:
integer as name
integer as nil
Opcodes (0 locals, 0 parameters)
0: ret C0
Compiled [real] is [real] as
Constants:
C0 = real
Types:
real as name
real as nil
Opcodes (0 locals, 0 parameters)
0: ret C0
// Actual compilation of the `fib N` body
Compiled [fib N] is [(fib (N - 1) + fib (N - 2))] as
Constants: // Constant table like above
C0 = 1
C1 = 0
C2 = fib N
C3 = 2
C4 = fib (N - 1) + fib (N - 2)
Parameters:
A0 = N
Locals:
A0 = [N][N][N] // A0 is argument 0, here mapped to three different tree representations of N
L1 = [N - 1] // L1 is local 1, here mapped to computed value for N-1
L2 = [fib (N - 1)] // L2 is local 2, mapped to computed value for fib(N-1)
L3 = [N - 2]
L4 = [fib (N - 2)]
Types:
N as natural // Types as deduced
fib (N - 2) as real // This is bogus (work in progress), should have retained natural
fib (N - 1) as real
fib (N - 1) + fib (N - 2) as real
Opcodes (5 locals, 1 parameters, 4 temporaries)
0: sub_natural A0, C0, L1 // Here, we subtract C0=1 from N and result in L1 = N-1
4: check_natural L1, C1, +5=13 // We check if N-1 is C1=0, if not jump to 13
8: copy C0, L2 // For N-1=0, we copy constant C0=1 into L2 (fib N-1)
11: branch +14=27
13: check_natural L1, C0, +5=22 // We check if N-1 is C0=1, if not jump to 22
17: copy C0, L2 // For N-1=1, we copy C0=1 into L2 (fib N-1)
20: branch +5=27
22: call C2, (1 arguments) // Remaining case (not 0 or 1): recursive call of C2 = fib N
0: L1 // Pass argument L1 = N-1
=> L2 // Result in L2
27: natural_typecheck L2, +36=66 // Check that the result is a natural
30: sub_natural A0, C3, L3
34: check_natural L3, C1, +5=43
38: copy C0, L4
41: branch +14=57
43: check_natural L3, C0, +5=52
47: copy C0, L4
50: branch +5=57
52: call C2, (1 arguments)
0: L3
=> L4
57: natural_typecheck L4, +6=66
60: add_natural L2, L4, L5
64: branch +26=92
66: integer_typecheck L2, +9=78
69: integer_typecheck L4, +6=78
72: add_integer L2, L4, L5
76: branch +14=92
78: real_typecheck L2, +9=90
81: real_typecheck L4, +6=90
84: add_real L2, L4, L5
88: branch +2=92
90: form_error C4
92: ret L5
Compiled [fib 0 is 1|fib 1 is 1|fib N is… (N - 1) + fib (N - 2))|fib 6] as
Constants:
C0 = 6
C1 = fib N
Locals:
L0 = [fib 6][fib 0 is 1|fib 1 is 1|fib N is… (N - 1) + fib (N - 2))|fib 6]
Types:
6 as natural
Opcodes (1 locals, 0 parameters, 1 temporaries)
0: call C1, (1 arguments)
0: C0
=> L0
5: ret L0
followed by the computation result:
13
Hope this helps getting started.
Maybe if you have time you could explain how the compiler works (or is supposed to work), specifically for the two basic examples of factorial (or fibonacci) and hello world in bxl (which, if I understand correctly is abandoned because of typing issues) and your newer fast branch?