Open jkenda opened 2 months ago
Building an Abstract Syntax Tree (AST) from prefix notation and then translating it into QBE Intermediate Representation (IR) involves a few structured steps. Let's summarize the process:
Start with Prefix Notation: Prefix notation (or Polish Notation) places the operator before its operands. This straightforward, hierarchical structure lends itself well to recursive parsing techniques.
Parse the Expression: Begin parsing at the first token. In prefix notation, the first token of any compound expression is an operator; a bare operand is a complete expression on its own. Determine the arity of the operator, i.e., how many operands it expects.
Recursive Parsing:
Build the AST:
Traverse the AST: Perform a traversal of the AST. A post-order traversal (left child, right child, node) is effective for evaluating expressions as it ensures operands are processed before their operators.
Generate SSA Form:
Map Operations to QBE IR:
Handle Control Flow and Function Calls:
Finalize the IR:
This process leverages the structured nature of prefix notation and the hierarchical representation of an AST to systematically transform your high-level language into a form that's closer to machine code, suitable for execution or further compilation by a backend like QBE.
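The parse, build, traverse, and emit steps above can be sketched roughly as follows. This is a minimal sketch, not aback's actual parser: the `expr` type, the function names (`qbe_op`, `parse`, `parse_string`, `emit`), and the restriction to four binary integer operators on whitespace-separated tokens are all assumptions made for illustration.

```ocaml
(* Hypothetical AST for integer expressions; not aback's real types. *)
type expr =
  | Num of int
  | BinOp of string * expr * expr

(* Map an operator token to a QBE instruction name.
   Assumption: every operator here is binary (arity 2). *)
let qbe_op = function
  | "+" -> Some "add"
  | "-" -> Some "sub"
  | "*" -> Some "mul"
  | "/" -> Some "div"
  | _ -> None

(* Parse one expression from the front of the token list and
   return it together with the tokens that were not consumed. *)
let rec parse tokens =
  match tokens with
  | [] -> failwith "unexpected end of input"
  | tok :: rest ->
    (match qbe_op tok with
     | Some _ ->
       (* Operator: recursively parse exactly two operands. *)
       let lhs, rest = parse rest in
       let rhs, rest = parse rest in
       (BinOp (tok, lhs, rhs), rest)
     | None ->
       (match int_of_string_opt tok with
        | Some n -> (Num n, rest)
        | None -> failwith ("unknown token: " ^ tok)))

(* Split on spaces, parse, and insist that every token was used. *)
let parse_string s =
  let tokens =
    String.split_on_char ' ' s |> List.filter (fun t -> t <> "")
  in
  match parse tokens with
  | ast, [] -> ast
  | _, _ -> failwith "trailing tokens"

(* Post-order traversal: emit both operands, then the operator.
   Each operator result gets a fresh temporary, so every temporary
   is assigned exactly once (SSA form). Returns the instruction
   lines in order and the operand that holds the final value. *)
let emit ast =
  let counter = ref 0 in
  let lines = ref [] in
  let rec go = function
    | Num n -> string_of_int n
    | BinOp (op, l, r) ->
      let a = go l in
      let b = go r in
      incr counter;
      let tmp = Printf.sprintf "%%t%d" !counter in
      let ins = match qbe_op op with Some i -> i | None -> assert false in
      lines := Printf.sprintf "%s =w %s %s, %s" tmp ins a b :: !lines;
      tmp
  in
  let result = go ast in
  (List.rev !lines, result)
```

For example, `emit (parse_string "+ 1 * 2 3")` returns the lines `%t1 =w mul 2, 3` and `%t2 =w add 1, %t1`, with `%t2` as the result operand. To hand this to QBE, the lines would still need to be wrapped in a function body with a block label and a `ret`.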
Option to output to stdout
aback com in.ab - | qbe | gcc -nostdlib -o out -pipe -xassembler -
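One way the `-` argument in the command above could be handled is sketched below. It assumes that `-` as the output path means "write to stdout"; the helper name `with_output` is hypothetical and not part of aback.

```ocaml
(* Run [f] on an output channel chosen from [path].
   Assumption: "-" selects stdout, anything else is a file path. *)
let with_output path f =
  if path = "-" then begin
    f stdout;
    flush stdout
  end else begin
    let oc = open_out path in
    (* Close the file even if [f] raises. *)
    Fun.protect ~finally:(fun () -> close_out oc) (fun () -> f oc)
  end
```

With this, the generator only ever sees an `out_channel`, so the same code path serves both a file and a shell pipeline.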
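open Unix
open Postprocess

let () =
  (* cloexec keeps the raw pipe ends out of the children. create_process
     duplicates the descriptors it is given onto the child's stdin and
     stdout, and those copies survive exec. Without it, qbe and gcc would
     inherit pipe1_write and qbe would never see EOF. *)
  let (pipe1_read, pipe1_write) = pipe ~cloexec:true () in
  let (pipe2_read, pipe2_write) = pipe ~cloexec:true () in
  (* qbe reads IL from pipe1 and writes assembly to pipe2 *)
  let _qbe_pid =
    create_process "qbe" [| "qbe" |] pipe1_read pipe2_write stderr
  in
  close pipe1_read;
  close pipe2_write;
  (* gcc reads the assembly from pipe2 *)
  let _gcc_pid =
    create_process "gcc"
      [| "gcc"; "-nostdlib"; "-o"; "out"; "-pipe"; "-xassembler"; "-" |]
      pipe2_read stdout stderr
  in
  close pipe2_read;
  let out_channel = out_channel_of_descr pipe1_write in
  (* Simulate some output that aback might generate *)
  generate_qbe_il out_channel;
  (* close_out flushes first; closing the write end sends EOF to qbe *)
  close_out out_channel;
  (* Reap both children; wait raises ECHILD once none are left *)
  let rec wait_for_children () =
    match wait () with
    | _ -> wait_for_children ()
    | exception Unix_error (Unix.ECHILD, _, _) -> ()
  in
  wait_for_children ()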
For every value that's pushed to the stack, the generator has to create a new variable. We have to know which variable sits at each position on the stack, so the generator keeps its own compile-time stack that holds references to variables.
Everything is local to the function, of course.
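A minimal sketch of that bookkeeping, under stated assumptions: the instruction set (`Push`, `Add`, `Mul`, `Dup`, `Swap`), the `gen` record, and the `%vN` naming scheme are all hypothetical and only illustrate the idea. A fresh `gen` is created per function, which keeps every variable local to it.

```ocaml
(* Hypothetical stack-language instructions; aback's real ones may differ. *)
type instr = Push of int | Add | Mul | Dup | Swap

(* Generator state for one function. *)
type gen = {
  mutable stack : string list;  (* variable names, top of stack first *)
  mutable next : int;           (* counter for fresh variable names *)
  mutable code : string list;   (* emitted QBE lines, newest first *)
}

let fresh g =
  g.next <- g.next + 1;
  Printf.sprintf "%%v%d" g.next

let pop g =
  match g.stack with
  | x :: rest -> g.stack <- rest; x
  | [] -> failwith "stack underflow"

let push g v = g.stack <- v :: g.stack

(* Pop two variables, emit one instruction, push the result variable. *)
let binop g name =
  let b = pop g in
  let a = pop g in
  let t = fresh g in
  g.code <- Printf.sprintf "%s =w %s %s, %s" t name a b :: g.code;
  push g t

let step g = function
  | Push n ->
    (* Every pushed value gets a new variable. *)
    let t = fresh g in
    g.code <- Printf.sprintf "%s =w copy %d" t n :: g.code;
    push g t
  | Add -> binop g "add"
  | Mul -> binop g "mul"
  | Dup ->
    (* Pure stack shuffling emits no code; only the references move. *)
    (let a = pop g in push g a; push g a)
  | Swap ->
    (let b = pop g in
     let a = pop g in
     push g b; push g a)

(* Compile one function body; returns the emitted lines in order
   and the variables left on the stack (top first). *)
let compile instrs =
  let g = { stack = []; next = 0; code = [] } in
  List.iter (step g) instrs;
  (List.rev g.code, g.stack)
```

For `[Push 2; Push 3; Add; Dup; Mul]` this emits four lines ending in `%v4 =w mul %v3, %v3` and leaves `%v4` on the compile-time stack. Note that `Dup` and `Swap` cost nothing at run time in this scheme, since they only rearrange names.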
Use the QBE backend instead of the current custom one. This brings advantages such as targeting multiple architectures (amd64, aarch64, riscv64) and optimization, but it requires rewriting the entire parser to produce an AST, which is then used to output QBE IR.
steps
stages
frontend
middle-end
backend
-- Backend is basically
or
but called from OCaml.