drym-org / qi

An embeddable flow-oriented language.
59 stars 12 forks source link

Qi Compiler #22

Open countvajhula opened 2 years ago

countvajhula commented 2 years ago

Once a reliable baseline is available for Qi's performance, we will want to write a Qi-specific compiler that operates on the generated Qi code in the expansion step, to produce optimized Racket expansions.

Overview

Qi will soon have first class macro-extensibility, which means that Qi macros written by users would be indistinguishable from built-in Qi forms, allowing extension in arbitrary ways. At that stage, many of Qi's current "core" forms could themselves be implemented internally as such macros, which would expand into a small core language yet to be distilled.

In tandem with that effort, we also will want to write a compiler that will enter the lifecycle after the macro expansion phase and expand the resulting Qi forms into optimized Racket code for improved performance.

Details

Define Qi* to be the full Qi language including both the core forms as well as any macros, and let Qi0 be a "core" Qi language, representing the end state of Qi macro expansion. Then, the full lifecycle of a Qi expression before it is evaluated is:

Qi* → Qi0 [Qi macro expansion phase]
Qi0 → Racket [Qi compilation]

... followed by the usual Racket lifecycle which it mirrors:

Racket → Fully Expanded Racket [Racket macro expansion phase]
Fully Expanded Racket → Racket bytecode [Racket compilation]

More: Fully Expanded Racket Racket bytecode

Qi Macro Expansion Phase

Each individual macro expansion rule is a syntax transformation Qi* → Qi*. The macro expansion phase as a whole (i.e. with repeated application of transformation rules) is a transformation χ:

χ : Qi* → Qi0

That is, the deliverable for this phase is Core Qi.

Qi Compilation

Each individual compilation pass is a syntax transformation IR_i → IR_i+1 for some intermediate languages IR_i and IR_i+1. The initial input to this pipeline is Qi0 (i.e. IR_0 = Qi0), and the final output is Racket (i.e. IR_n = Racket), so that the compilation phase as a whole is a transformation ξ:

ξ : Qi0 → Racket

That is, the deliverable for this phase is Racket.

As long as the process starts with Qi0 and delivers Racket, there are no constraints on what the intermediate languages IR_i could be. For instance, they could all be Qi0 (except the last), so that the rules are mainly nonlocal optimizations mapping combinations of Qi expressions into optimized versions. Or only some of them could be Qi0 (e.g. the early stages, consisting of nonlocal optimizations), or it could even be that IR_1 ... IR_n-1 don't resemble either Qi or Racket at all -- the point being just that there are no constraints here.

Implementation

Inserting the Compiler into the Lifecycle

Michael Ballantyne @michaelballantyne writes:

"Here's an example of how the compiler in flow might be transformed into a compile-time function; I'll just show one case for all:

(begin-for-syntax
  (define (qi0->racket stx)
    (syntax-parse stx
      [(_ ((~datum all) onex:clause))
       #`(give (curry andmap #,(compile-flow #'onex)))]
      ...)))

The overall compiler would compose the code generator with other passes:

(begin-for-syntax
  (define (compile-flow stx)
    (compose qi0->racket optimize-flow)))

Finally, the new flow macro would compose a macro expander for Qi with the compiler:

(define-syntax-parser flow
  [(_ flow-exp)
   ((compose compile-flow expand-flow) #'flow-exp))])

So all that's left is to write the macro expander expand-flow."

Facilitating the Macro Expansion Phase

Michael Ballantyne writes:

"The binding spec DSL [...] implements the macro expander part. If you wrote it by hand, it would look something like this, again just showing the case for all:

(begin-for-syntax
  (define (expand-flow stx)
    (syntax-parse stx
      [((~datum all) onex:clause)
       #`(all #,(expand-flow #'one-x))]
      ...)))

plus a macro expansion clause like the one you've added to flow right now.

The clauses for all the various Qi0 forms each follow the same pattern: match the syntax, recur on subexpressions, and reconstruct syntax. The declarative DSL implements this more concisely. (And when your language has binding forms, handles scope and binding too.)"

(For reference, the binding spec DSL.)

Strategies