Closed countvajhula closed 2 years ago
FYI @michaelballantyne @benknoble
Did some Wild West coding and reduced a number of forms to other forms. There are still a few that need to be reduced, and a few others that need to be reviewed (well, they all need to be reviewed of course!). This is mainly an initial stab and it doesn't have to be "right" yet -- just to have a provisionally identified Core Language for a meeting with @michaelballantyne on Friday to get a first sense of how, given such a core language, the binding spec DSL could help by automatically generating an expander from non-core forms to core forms, and to level set on compiler stuff.
An accounting of the current state of things (kept up to date):
28 + 6 = 34 Core forms 37 - 6 = 31 Non-core forms
~>
==
-<
gen
_
NOT
XOR
and
or
collect
sep
select
block
group
if
sieve
try
feedback
<<
>>
loop
loop2
apply
clos
esc
partial applications
fine-grained partial application templates
blanket partial application templates
--- below not "true" core ---
not
><
pass
⏚
AND
OR
count
one-of?
all
any
none
NOR
NAND
XNOR
and%
or%
any?
all?
none?
X
bundle
when
unless
switch
gate
input aliases (1> 2> ...)
live?
rectify
inverter
effect
literals
~>>
==*
partition
fanout
identifiers (which are assumed to just be functions - like sqr, add1, +, etc. - and are passed through for evaluation by Racket)
qi:-prefixed (backwards compatibility) macros
sieve
loop
loop2
><
should probably reduce to loop
rather than to ==
loop2
-- should this just be removed? It was originally added for possible better performance than loop
, but it actually performs worse than loop
.I checked the current (unoptimized) performance difference from the a priori performance. Here are the results:
'(("OR" . "12.85")
("any?" . "10.51")
("none?" . "5.86")
("NOR" . "5.33")
("live?" . "5")
("rectify" . "3.87")
("all" . "3.63")
("AND" . "3.07")
("all?" . "2.86")
("any" . "2.62")
("none" . "2.5")
("NAND" . "2.5")
("loop" . "2.17")
("partition" . "1.66")
("tee" . 1)
("XNOR" . 1)
("when" . 1)
("collect" . 1)
("clos" . 1)
("block" . 1)
("crossover" . 1)
("catchall-template" . 1)
("unless" . 1)
("try" . 1)
("and" . 1)
("pass" . 1)
("esc" . 1)
("XOR" . 1)
("group" . 1)
("feedback" . 1)
("template" . 1)
("ground" . 1)
("apply" . 1)
("relay*" . 1)
("NOT" . 1)
("not" . 1)
("<<" . 1)
(">>" . 1)
("switch" . 1)
("input aliases" . 1)
("select" . 1)
("fanout" . 1)
("thread" . 1)
("gen" . 1)
("and%" . 1)
("currying" . 1)
("or%" . 1)
("loop2" . 1)
("sieve" . 1)
("one-of?" . 1)
("or" . 1)
("if" . 1)
("effect" . 1)
("bundle" . 1)
("count" . 1)
("gate" . 1)
("thread-right" . 1)
("inverter" . 1)
("sep" . 1)
("relay" . 1)
("amp" . "0.48"))
The criteria for measurement are:
(~> (a b) / (if (< 0.5 _ 1.5)
1
(~r #:precision 2)))
That is, it prints "1" unless the ratio is outside the range half - 1.5X, to avoid noise.
Based on this data, it appears that a few forms are a lot slower, as expected, while amp
has unexpectedly gotten a 2X performance boost!
I've added a proof-of-concept optimization just to complete the cycle and prove (mostly to myself) that the compiler can improve performance. This is a "restorative" optimization for all
. I verified that, with this optimization, the performance of all
is the same as the original performance so it has been "restored" as intended 🙂
Note that the optimized expression is not equivalent to the original expression in the presence of side-effects, as you pointed out earlier @benknoble (for a different form).
But, the more I think about it, towards arriving at our "theory of optimization", the more convinced I seem to become that Qi should indeed downgrade side-effect guarantees to specific forms like effect
and esc
, and otherwise consider the behavior "undefined". I feel that this would give Qi significant performance edge, making it possible to exceed Racket performance in possibly many common cases. And without this assumption, it seems that Qi would be somewhat hamstrung even in restoring to pre-compiler levels (I don't know this for sure, but just a gut feeling - but I get the feeling that @michaelballantyne has had this inkling as well).
Now, I realize that this proposition is somewhat drastic, in that a basic (~> my-side-effecting-function)
would no longer receive first class treatment. But it would be easy enough in this case to do (~> (esc my-side-effecting-function))
, or, perhaps even better, make us think a little more about why our function is written this way, and whether it would perhaps be better to write it so it could be (~> my-better-function (effect displayln my-decoupled-second-stage-function))
-- that is, encouraging writing smaller functions where side effects could be isolated more cleanly could lead to better outcomes more broadly and not just within a Qi context.
I also wonder whether side-effecting functions would perhaps be an uncommon case in practice, so that the use of esc
in this way would not be common.
I'm not convinced of this yet, but that's what I'm thinking at the moment.
Hello! An update: I've extracted all of the non-core forms out of the core and defined them as macros (as we discussed last time @michaelballantyne ).
In doing this, I found it expedient to promote some formerly non-core forms to core forms. This was partially because some of them were a bit tangled and I didn't immediately see a way to detangle them (e.g. sieve
and pass
and amp
I think). But it was also because, since we need to use exclusively core forms in the core, it would have been less clear in some cases to use the core expansions, e.g. rewriting AND
and OR
to folds wherever they were used seemed like it would obscure the meaning. But, I dunno, maybe that's exactly what we should be doing here. We should definitely revisit these.
I've also refactored the code to more clearly differentiate the core language from the extensions. This is the new module layout:
├── flow
│ ├── aux-syntax.rkt
│ ├── core
│ │ ├── compiler.rkt
│ │ ├── impl.rkt
│ │ └── syntax.rkt
│ └── extended
│ ├── expander.rkt
│ ├── forms.rkt
│ ├── impl.rkt
│ └── syntax.rkt
├── flow.rkt
├── info.rkt
├── macro.rkt
├── main.rkt
├── on.rkt
├── private
│ └── util.rkt
├── switch.rkt
└── threading.rkt
(generated using this command alias in my bashrc: alias lstt='clear; tree -LC 4'
, i.e. the tree utility. I also have this other alias that's handy fwiw: alias lst='clear; tree -LC 2'
-- both inherited from my friend @hunan-rostomyan)
Overview: the top level contains the core Racket-level interface macros and any supporting utilities at that level. The flow
folder contains the Qi language (i.e. not concerned with its embeddings into Racket). Within the flow
folder, core
contains the core language, including the compiler, and extended
includes all of the non-core forms extracted from the core (now defined as Qi macros in extended/forms.rkt
) and the (empty) expander.
The syntax.rkt
file in each of core
and extended
contains syntax classes that help in parsing core and non-core syntax, respectively.
The impl.rkt
file in each of core
and extended
contains the actual implementations (including helper code) for core and non-core forms, respectively.
aux-syntax.rkt
contains "auxiliary" syntax classes used in both core
and extended
forms (maybe this can be eliminated upon review down the line).
private/util.rkt
contains non-implementation-related utilities used anywhere.
The tests pass, but the docs don't build at the moment, and that's why CI is failing. I don't yet know what the failure is and haven't investigated, but it seems that identifiers provided from extended/forms.rkt
(i.e. the new Qi macros) aren't visible in the docs for some reason.
See y'all tomorrow 😄
I've started a new integration branch to hold the Qi compiler work and allow each PR to be bite-sized as much as possible. This one can't be merged into the main branch yet (since it degrades performance by a lot), but I think we can consider it complete (without considering it in any way "decided" as far as the core language goes) as a bounded component of building the compiler, so I will merge this into the integration branch.
Summary of Changes
This is WIP. The plan is to distill a small core language that would be amenable to characterization using the binding spec DSL, and so that compiler optimizations would be tractable.
This is work towards https://github.com/countvajhula/qi/issues/22 .
The present PR will likely not include any novel compiler optimizations, but it would be ideal to include implicit, restorative optimizations designed to map specific Qi -> Qi0 expansion rules back to their former Qi -> Racket versions, so that there is no performance penalty from merging this PR. That is, assuming there are N forms that are rewritten from their original Qi->Racket rules to Qi->Qi0 instead, then there are likely to be N optimization rules in this PR that will rewrite those specific Qi0 expansions back to the a priori Racket versions.
Public Domain Dedication