Distill a small core language

countvajhula commented 2 years ago

Summary of Changes

This is WIP. The plan is to distill a small core language that would be amenable to characterization using the binding spec DSL, and so that compiler optimizations would be tractable.

This is work towards https://github.com/countvajhula/qi/issues/22 .

The present PR will likely not include any novel compiler optimizations, but it would be ideal to include implicit, restorative optimizations designed to map specific Qi -> Qi0 expansion rules back to their former Qi -> Racket versions, so that there is no performance penalty from merging this PR. That is, assuming there are N forms that are rewritten from their original Qi->Racket rules to Qi->Qi0 instead, then there are likely to be N optimization rules in this PR that will rewrite those specific Qi0 expansions back to the a priori Racket versions.

Public Domain Dedication

[x] In contributing, I relinquish any copyright claims on my contribution and freely release it into the public domain in the simple hope that it will provide value.

countvajhula commented 2 years ago

FYI @michaelballantyne @benknoble

countvajhula commented 2 years ago

Did some Wild West coding and reduced a number of forms to other forms. There are still a few that need to be reduced, and a few others that need to be reviewed (well, they all need to be reviewed of course!). This is mainly an initial stab and it doesn't have to be "right" yet -- just to have a provisionally identified Core Language for a meeting with @michaelballantyne on Friday to get a first sense of how, given such a core language, the binding spec DSL could help by automatically generating an expander from non-core forms to core forms, and to level set on compiler stuff.

countvajhula commented 2 years ago

An accounting of the current state of things (kept up to date):

Summary

28 + 6 = 34 Core forms 37 - 6 = 31 Non-core forms

Core forms

~>
==
-<
gen
_
NOT
XOR
and
or
collect
sep
select
block
group
if
sieve
try
feedback
<<
>>
loop
loop2
apply
clos
esc
partial applications
fine-grained partial application templates
blanket partial application templates

--- below not "true" core ---
not
><
pass
⏚
AND
OR

Non-core

count
one-of?
all
any
none
NOR
NAND
XNOR
and%
or%
any?
all?
none?
X
bundle
when
unless
switch
gate
input aliases (1> 2> ...)
live?
rectify
inverter
effect
literals
~>>
==*
partition
fanout
identifiers (which are assumed to just be functions - like sqr, add1, +, etc. - and are passed through for evaluation by Racket)
qi:-prefixed (backwards compatibility) macros

Can probably be reduced

sieve
loop
loop2

Other Notes

>< should probably reduce to loop rather than to ==
loop2 -- should this just be removed? It was originally added for possible better performance than loop, but it actually performs worse than loop.

countvajhula commented 2 years ago

I checked the current (unoptimized) performance difference from the a priori performance. Here are the results:

'(("OR" . "12.85")
  ("any?" . "10.51")
  ("none?" . "5.86")
  ("NOR" . "5.33")
  ("live?" . "5")
  ("rectify" . "3.87")
  ("all" . "3.63")
  ("AND" . "3.07")
  ("all?" . "2.86")
  ("any" . "2.62")
  ("none" . "2.5")
  ("NAND" . "2.5")
  ("loop" . "2.17")
  ("partition" . "1.66")
  ("tee" . 1)
  ("XNOR" . 1)
  ("when" . 1)
  ("collect" . 1)
  ("clos" . 1)
  ("block" . 1)
  ("crossover" . 1)
  ("catchall-template" . 1)
  ("unless" . 1)
  ("try" . 1)
  ("and" . 1)
  ("pass" . 1)
  ("esc" . 1)
  ("XOR" . 1)
  ("group" . 1)
  ("feedback" . 1)
  ("template" . 1)
  ("ground" . 1)
  ("apply" . 1)
  ("relay*" . 1)
  ("NOT" . 1)
  ("not" . 1)
  ("<<" . 1)
  (">>" . 1)
  ("switch" . 1)
  ("input aliases" . 1)
  ("select" . 1)
  ("fanout" . 1)
  ("thread" . 1)
  ("gen" . 1)
  ("and%" . 1)
  ("currying" . 1)
  ("or%" . 1)
  ("loop2" . 1)
  ("sieve" . 1)
  ("one-of?" . 1)
  ("or" . 1)
  ("if" . 1)
  ("effect" . 1)
  ("bundle" . 1)
  ("count" . 1)
  ("gate" . 1)
  ("thread-right" . 1)
  ("inverter" . 1)
  ("sep" . 1)
  ("relay" . 1)
  ("amp" . "0.48"))

The criteria for measurement are:

(~> (a b) / (if (< 0.5 _ 1.5)
                1
                (~r #:precision 2)))

That is, it prints "1" unless the ratio is outside the range half - 1.5X, to avoid noise.

Based on this data, it appears that a few forms are a lot slower, as expected, while amp has unexpectedly gotten a 2X performance boost!

countvajhula commented 2 years ago

I've added a proof-of-concept optimization just to complete the cycle and prove (mostly to myself) that the compiler can improve performance. This is a "restorative" optimization for all. I verified that, with this optimization, the performance of all is the same as the original performance so it has been "restored" as intended 🙂

Note that the optimized expression is not equivalent to the original expression in the presence of side-effects, as you pointed out earlier @benknoble (for a different form).

But, the more I think about it, towards arriving at our "theory of optimization", the more convinced I seem to become that Qi should indeed downgrade side-effect guarantees to specific forms like effect and esc, and otherwise consider the behavior "undefined". I feel that this would give Qi significant performance edge, making it possible to exceed Racket performance in possibly many common cases. And without this assumption, it seems that Qi would be somewhat hamstrung even in restoring to pre-compiler levels (I don't know this for sure, but just a gut feeling - but I get the feeling that @michaelballantyne has had this inkling as well).

Now, I realize that this proposition is somewhat drastic, in that a basic (~> my-side-effecting-function) would no longer receive first class treatment. But it would be easy enough in this case to do (~> (esc my-side-effecting-function)), or, perhaps even better, make us think a little more about why our function is written this way, and whether it would perhaps be better to write it so it could be (~> my-better-function (effect displayln my-decoupled-second-stage-function)) -- that is, encouraging writing smaller functions where side effects could be isolated more cleanly could lead to better outcomes more broadly and not just within a Qi context.

I also wonder whether side-effecting functions would perhaps be an uncommon case in practice, so that the use of esc in this way would not be common.

I'm not convinced of this yet, but that's what I'm thinking at the moment.

countvajhula commented 2 years ago

Hello! An update: I've extracted all of the non-core forms out of the core and defined them as macros (as we discussed last time @michaelballantyne ).

In doing this, I found it expedient to promote some formerly non-core forms to core forms. This was partially because some of them were a bit tangled and I didn't immediately see a way to detangle them (e.g. sieve and pass and amp I think). But it was also because, since we need to use exclusively core forms in the core, it would have been less clear in some cases to use the core expansions, e.g. rewriting AND and OR to folds wherever they were used seemed like it would obscure the meaning. But, I dunno, maybe that's exactly what we should be doing here. We should definitely revisit these.

I've also refactored the code to more clearly differentiate the core language from the extensions. This is the new module layout:

├── flow
│   ├── aux-syntax.rkt
│   ├── core
│   │   ├── compiler.rkt
│   │   ├── impl.rkt
│   │   └── syntax.rkt
│   └── extended
│       ├── expander.rkt
│       ├── forms.rkt
│       ├── impl.rkt
│       └── syntax.rkt
├── flow.rkt
├── info.rkt
├── macro.rkt
├── main.rkt
├── on.rkt
├── private
│   └── util.rkt
├── switch.rkt
└── threading.rkt

(generated using this command alias in my bashrc: alias lstt='clear; tree -LC 4', i.e. the tree utility. I also have this other alias that's handy fwiw: alias lst='clear; tree -LC 2' -- both inherited from my friend @hunan-rostomyan)

Overview: the top level contains the core Racket-level interface macros and any supporting utilities at that level. The flow folder contains the Qi language (i.e. not concerned with its embeddings into Racket). Within the flow folder, core contains the core language, including the compiler, and extended includes all of the non-core forms extracted from the core (now defined as Qi macros in extended/forms.rkt) and the (empty) expander.

The syntax.rkt file in each of core and extended contains syntax classes that help in parsing core and non-core syntax, respectively.

The impl.rkt file in each of core and extended contains the actual implementations (including helper code) for core and non-core forms, respectively.

aux-syntax.rkt contains "auxiliary" syntax classes used in both core and extended forms (maybe this can be eliminated upon review down the line).

private/util.rkt contains non-implementation-related utilities used anywhere.

The tests pass, but the docs don't build at the moment, and that's why CI is failing. I don't yet know what the failure is and haven't investigated, but it seems that identifiers provided from extended/forms.rkt (i.e. the new Qi macros) aren't visible in the docs for some reason.

See y'all tomorrow 😄

countvajhula commented 2 years ago

I've started a new integration branch to hold the Qi compiler work and allow each PR to be bite-sized as much as possible. This one can't be merged into the main branch yet (since it degrades performance by a lot), but I think we can consider it complete (without considering it in any way "decided" as far as the core language goes) as a bounded component of building the compiler, so I will merge this into the integration branch.

drym-org / qi