Closed countvajhula closed 10 months ago
The problem occurs because of an interaction between the way that the tests are written and the fact that we made the nonterminal
syntax property non-preserved.
When applying a syntax property with syntax-property
, there's an optional third argument that indicates whether the property should be preserved in serialized syntax, i.e. when a compiled file is written out by raco make or racket -y.
If a syntax property is really only needed temporarily for compilation, making it non-preserved can avoid increasing the size of compiled files. So as of right now we made the nonterminal
property non-preserved.
In the tests as written, phase0-expand-flow
expands the initial syntax at expansion time of the test module, but the remainder of the test (eg. calling deforest-pass
and deforested?
) happens at run-time. That is, this test:
(deforested? (syntax->datum
(deforest-pass
(phase0-expand-flow
#'(>< (~>> (filter odd?) (map sqr)))))))
Expands to something like:
(deforested? (syntax->datum
(deforest-pass
#'(amp
(thread
(#%blanket-template
((#%host-expression filter)
(#%host-expression odd?)
__))
(#%blanket-template
((#%host-expression map)
(#%host-expression sqr)
__)))))))
and then that runs at runtime. The problem is that because the nonterminal
property is not preserved, when running with racket -y
the property is lost after expansion time and is thus missing when running deforest-pass
at runtime.
I suggest instead running the entire compiler pipeline at expansion time, and only returning a boolean to runtime. This makes sure that the compiler runs in the same way it will in a real program---at phase 1, and without any step of serialization between expansion and compilation. Here's a test that works file with both racket
and racket -y
:
#lang racket/base
(require
(for-syntax racket/base racket/string)
qi/flow/core/compiler
qi/flow/core/deforest
;; necessary to recognize and expand core forms correctly
qi/flow/extended/expander
;; necessary to correctly expand the right-threading form
qi/flow/extended/forms
(submod qi/flow/extended/expander invoke)
syntax/macro-testing
rackunit)
(begin-for-syntax
(define (deforested? exp)
(string-contains? (format "~a" exp) "cstream")))
(check-true
(phase1-eval
(deforested? (syntax->datum
(deforest-pass
(expand-flow
#'(>< (~>> (filter odd?) (map sqr)))))))))
@michaelballantyne Ok, thanks for the explanation! That definitely helps to understand what's going on. I'll have to think about how to refactor the tests to do the checks in phase 1, and I think if I can get this test to pass, that should help reveal how to address the original problem with the long-functional-pipeline
benchmark.
I've fixed that specific test to run in phase 1, verified that it is capable of detecting the failure to propagate nonterminal
across normalization → deforestation, and then added the fix, which was to simply attach the property to the top level flow expression that is being compiled.
I'm assuming that:
nonterminal
property after normalization, without first checking whether the expansion had that property attached to begin with.find-and-map
constructs transformed syntax using datum->syntax
, passing the input syntax for the srcloc
and prop
arguments (thanks @benknoble for reminding me of this!), that every component syntax object that is modified already propagates the nonterminal
property, so that it's only the toplevel expression that a priori doesn't propagate the property (which it now does).Does that sound right?
One general impression is that although the addition of the syntax property appeared a simple solution at first, it does make things a bit harder to reason about. We should revisit this at some point (after the release 😄)
Now I need to push the remaining tests into phase 1, and I think the PR should be ready at that point.
One "Interesting" thing @dzoep : long-functional-pipeline
now does get deforested (yay!), but it's only about 1.5x faster than Racket, when I think we are expecting about length-of-the-pipeline times faster, in this case 6-7, depending on whether values
is normalized away by Racket as we do it).
- Since
find-and-map
constructs transformed syntax usingdatum->syntax
, passing the input syntax for thesrcloc
andprop
arguments (thanks @benknoble for reminding me of this!), that every component syntax object that is modified already propagates thenonterminal
property, so that it's only the toplevel expression that a priori doesn't propagate the property (which it now does).
Isn't find-and-map pretty crucial to the compiler or expander? I'm actually very shocked it needs to use datum->syntax this way, then, and think that suggests a possible [architectural? design? implementation?] flaw (because I would expect Racket macros and other compile-time code to not need that kind of low-level tool in most cases).
I'll have to really study the code [again] to figure out why it's needed though, so I'm mostly saying this now as a way of "please publicly hold me accountable to investigate after the release."
I'm also not seeing any changes to find-and-map in this PR; was that in a previous PR or am I missing something?
Well, I think find-and-map
was a simple way to do a syntax tree traversal that happened to suffice for our needs (until we hit this particular issue!). My understanding is that the "right way" would be to do a proper core grammar aware tree-traversal, using something like syntax-parse
(which I believe this TODO notes, and which I think you're getting at), but if we did that, we would essentially be duplicating the entire core language grammar in the compiler, which we've already notated in the expander. I think we're hoping that Syntax Spec would be able to infer such traversal utilities that could be used down the line (correct me if I'm wrong Michael) from the core grammar specification, to avoid this kind of duplication. And in that case, it would be unnecessary effort to implement such a Qi-specific traversal now since we would get it "for free" later anyway in generic form.
And yes, I should have clarified that this PR doesn't modify find-and-map
, just normalize-pass
which uses it.
I'll gladly hold you accountable to implement the traversal if Syntax Spec doesn't get around to it quickly enough 😝
I think a generic syntax traversal using datum->syntax
is a totally reasonable thing to do here, but it should probably ultimately live in the syntax-spec library rather than in the Qi repo.
In case I happen to have time to pole around: does syntax-spec expose the grammar anywhere in a way that I can compute over it? Or where in its internals should I look?
In case I happen to have time to pole around: does syntax-spec expose the grammar anywhere in a way that I can compute over it? Or where in its internals should I look?
It does not expose it, and currently the only thing it generates from the grammar is the macro expander so it also does not have a super clean internal representation of the grammar either. The syntax classes for matching the grammar declarations are here:
And the generation of the expander from the grammar is here:
Yesterday we noticed that CI was failing, specifically on BC versions 8.5 through 8.8. We generally agreed that supporting BC is not a priority, so I've just updated the test matrix to reflect the actual current compatibility -- that is, BC/CS 8.9+, and CS all the way back to 8.5 (i.e. unchanged from before).
These are the specific errors, for reference, in case anyone thinks they might be cause for concern:
BC 8.5-8.7:
write: cannot marshal value that is embedded in compiled code
value: (srcloc #<path:/home/runner/work/qi/qi/qi-lib/flow/core/normalize.rkt> 74 7 2297 22)
compilation context...:
/home/runner/work/qi/qi/qi-test/tests/qi.rkt
context...:
/usr/share/racket/collects/compiler/private/cm-minimal.rkt:808:8
BC 8.8:
mask-accessor: contract violation
raco setup: 1 making: <pkgs>/rackunit-abbrevs/scribblings
expected: mask?
given: #f
compilation context...:
/home/runner/.local/share/racket/8.8-bc/pkgs/rackunit-abbrevs/private/test-typed-rackunit-abbrevs.rkt
context...:
/usr/share/racket/pkgs/typed-racket-lib/typed-racket/types/overlap.rkt:48:0: overlap?
I think this PR is ready to go! Any review appreciated. I'll aim to merge it soon if there are no further comments.
The first of those two errors seems like something we could potentially fix; it’s complaining that you are embedding a srcloc structure value in compiled code. We could probably avoid doing that easily.
The error message for the second suggests it’s a problem in the rackunit-abbrevs dependency, so less straightforward for us to change.
On Sat, Dec 30, 2023 at 2:33 PM Siddhartha Kasivajhula < @.***> wrote:
Yesterday we noticed that CI was failing, specifically on BC versions 8.5 through 8.8. We generally agreed that supporting BC is not a priority, so I've just updated the test matrix to reflect the actual current compatibility -- that is, BC/CS 8.9+, and CS all the way back to 8.5 (i.e. unchanged from before).
These are the specific errors, for reference, in case anyone thinks they might be cause for concern:
BC 8.5-8.7:
write: cannot marshal value that is embedded in compiled code value: (srcloc #<path:/home/runner/work/qi/qi/qi-lib/flow/core/normalize.rkt> 74 7 2297 22) compilation context...: /home/runner/work/qi/qi/qi-test/tests/qi.rkt context...: /usr/share/racket/collects/compiler/private/cm-minimal.rkt:808:8
BC 8.8:
mask-accessor: contract violation raco setup: 1 making:
/rackunit-abbrevs/scribblings expected: mask? given: #f compilation context...: /home/runner/.local/share/racket/8.8-bc/pkgs/rackunit-abbrevs/private/test-typed-rackunit-abbrevs.rkt context...: /usr/share/racket/pkgs/typed-racket-lib/typed-racket/types/overlap.rkt:48:0: overlap? I think this PR is ready to go! Any review appreciated. I'll aim to merge it soon if there are no further comments.
— Reply to this email directly, view it on GitHub https://github.com/drym-org/qi/pull/143#issuecomment-1872611088, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAK46U6EBUGMGXMQODFLL5TYMCCEJAVCNFSM6AAAAABBF6AQFWVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMYTQNZSGYYTCMBYHA . You are receiving this because you were mentioned.Message ID: @.***>
Yeah, if it's an easy fix we might as well do it. From googling the error, it sounds like it might be a case of "3d syntax" which I gather is best avoided as it loses the separate compilation guarantee, which I've come to really appreciate as the codebase gets larger!
I haven't looked into the srcloc issue yet, but so as not to hold up the other fixes in this PR, I've just modified the test workflow to run tests on BC 8.5 but not block other tests from running if it fails, so we see that the issue remains but it doesn't break CI. We can fix the remaining issue in a separate PR, so I'll merge this now!
Summary of Changes
This began as an investigation of the
long-functional-pipeline
benchmark, which it turned out wasn't getting deforested. That seems to have something to do with the fact that that benchmark is rewritten by both of our current optimization passes, normalization and deforestation, and most likely, the latter doesn't see thenonterminal
syntax property ~for reasons that are somewhat mysterious~ because we weren't explicitly propagating it, and the syntax property being "nonpreserved" meant that our tests were not correctly handling this property across phases (see comments below).In the course of investigating I ended up refactoring a lot of tests, and they actually revealed a lot of other smaller issues that are mostly fixed. ~But the main nonterminal property issue remains a mystery~. Summary below:
racket
, but fail when usingracket -y
. In the latter case, thenonterminal
property is not present after expansion.compiler/rules.rkt
that passes if we manually introduce thenonterminal
property between passes viatag-form-syntax
compiler.rkt
module itself, at the conclusion ofnormalize-pass
, does not make a difference to the test result. Adding the property at thefind-and-map/qi
level (after excluding non-syntax and expressions tagged#%host-expression
-- which it didn't like) doesn't make a difference to the test result either.Other changes:
nonterminal
propertyutil
module to (compiler)pass
. There are too many modules named "util" and obviously we should avoid such a generic name.Public Domain Dedication
(Why: The freely released, copyright-free work in this repository represents an investment in a better way of doing things called attribution-based economics. Attribution-based economics is based on the simple idea that we gain more by giving more, not by holding on to things that, truly, we could only create because we, in our turn, received from others. As it turns out, an economic system based on attribution -- where those who give more are more empowered -- is significantly more efficient than capitalism while also being stable and fair (unlike capitalism, on both counts), giving it transformative power to elevate the human condition and address the problems that face us today along with a host of others that have been intractable since the beginning. You can help make this a reality by releasing your work in the same way -- freely into the public domain in the simple hope of providing value. Learn more about attribution-based economics at drym.org, tell your friends, do your part.)