jkenda / aback

A stack-oriented language that uses Polish notation which can be reversed using the ; operator (previously |>)
0 stars 0 forks source link

QBE backend #5

Open jkenda opened 2 months ago

jkenda commented 2 months ago

Include QBE backend instead of the current custom one. This will bring pros like targeting multiple architectures (amd64, aarch64, riscv64) and optimization. But it will require rewriting the entire parser to produce an AST which will then be used to output QBE IR.

steps

stages

frontend

middle-end

backend

-- Backend is basically

qbe -o <name>.s <name>.ssa 
cc -o <name> <name>.s
rm <name>.ssa <name>.s

or

erlc <name>.erl

but called from OCaml.

jkenda commented 2 months ago

Building an Abstract Syntax Tree (AST) from prefix notation and then translating it into QBE Intermediate Representation (IR) involves a few structured steps. Let's summarize the process:

Building an AST from Prefix Notation

  1. Start with Prefix Notation: Prefix notation (or Polish Notation) places the operator before its operands. This straightforward, hierarchical structure lends itself well to recursive parsing techniques.

  2. Parse the Expression: Begin parsing the expression from the start. The first element you encounter will be an operator (since we're dealing with prefix notation). Determine the arity of the operator—i.e., how many operands it expects.

  3. Recursive Parsing:

    • For each operand of the current operator, determine if it is another operator or a terminal value (e.g., a number or variable).
    • If it's an operator, recursively apply this process to parse its operands.
    • If it's a terminal value, it becomes a leaf node in the AST.
  4. Build the AST:

    • Each operator becomes a node in the AST, with its operands as child nodes.
    • The recursion depth reflects the nesting of operations in the original expression.
    • The process continues until the entire expression is parsed into an AST, which represents the computation's hierarchical structure.

Translating the AST into QBE IR

  1. Traverse the AST: Perform a traversal of the AST. A post-order traversal (left child, right child, node) is effective for evaluating expressions as it ensures operands are processed before their operators.

  2. Generate SSA Form:

    • As you evaluate each node (starting from the leaves up), generate SSA (Static Single Assignment) form variables for each operation's result.
    • SSA form ensures each variable is assigned exactly once, making the IR suitable for optimizations.
  3. Map Operations to QBE IR:

    • For each node in the AST (representing an operation in your source language), determine the corresponding QBE operation.
    • Generate the QBE IR instructions for these operations, using the previously generated SSA variables for operands and results.
  4. Handle Control Flow and Function Calls:

    • For branching and function calls, use the AST structure to generate the appropriate QBE IR constructs, like conditional jumps and function call instructions.
    • Ensure that the control flow in the IR accurately reflects the logic implied by the AST.
  5. Finalize the IR:

    • Once the entire AST is translated, you'll have a complete set of QBE IR instructions representing your program.
    • This IR can then be further optimized and eventually used to generate executable code.

Considerations

This process leverages the structured nature of prefix notation and the hierarchical representation of an AST to systematically transform your high-level language into a form that's closer to machine code, suitable for execution or further compilation by a backend like QBE.

jkenda commented 2 months ago

Option to output to stdout

aback com in.ab - | qbe | gcc -nostdlib -o out -pipe -xassembler -
open Unix
open Postprocess

let () =
  let (pipe1_read, pipe1_write) = pipe () in
  let (pipe2_read, pipe2_write) = pipe () in

  match create_process "qbe" [| "qbe" |] pipe1_read pipe2_write stderr with
  | _ -> close pipe1_read;
         close pipe2_write;

  match create_process "gcc" [| "gcc"; "-nostdlib"; "-o"; "out"; "-pipe"; "-xassembler"; "-" |] pipe2_read stdout stderr with
  | _ -> close pipe2_read;

  let out_channel = out_channel_of_descr pipe1_write in
  (* Simulate some output that aback might generate *)
  generate_qbe_il out_channel;
  flush out_channel;
  close_out out_channel;  (* Close the output channel, which sends EOF to qbe *)

  let rec wait_for_children () =
    try
      let _ = wait () in
      wait_for_children ()
    with
    | Unix_error (Unix.ECHILD, _, _) -> ()  (* No more children *)
  in
  wait_for_children ()
jkenda commented 1 month ago

AST to SSA conversion:

For everything that's pushed to the stack, the generator has to create a new variable. We have to know the position of every variable on the stack. Keep a stack that has references to variables.

Everything is local to the function, of course.