Closed countvajhula closed 2 months ago
PS I still get compilation errors in qi-sdk at the tip of this branch (or on the deforest-all-the-things branch).
The errors are different, though:
range
is already required from qi/list (line:column is 23:9)@benknoble I've fixed both build errors, thanks!
Interestingly, the move to using Qi syntax in the higher order position instead of a unary lambda comes with a significant performance cost in the long-functional-pipeline
benchmark. This makes our long-planned "arity inference" compiler pass even more compelling, since this case would be caught by building a table of known built-in function arities, as we've talked about @michaelballantyne and @dzoep
This is ready for review. I've updated the PR description with the list of changes.
One change to discuss is that since range
is now a form rather than a function, the partial application, fine template, blanket template behavior supported for ordinary functions doesn't implicitly translate. Do we feel that range
should continue to exhibit this behavior even though it's now syntax? I believe it would mean that we would need to maintain a distinct implementation of this behavior from the usual one used with functions. Currently, the range
form accepts no input arguments and expects the range to be specified syntactically using one, two, or three arguments. We can get equivalent behavior to using partial application or templates by using bindings instead, e.g.
(~> (3) (as v) (range v 10))
One change to discuss is that since
range
is now a form rather than a function, the partial application, fine template, blanket template behavior supported for ordinary functions doesn't implicitly translate.
Which behavior would be gone?
Currently, the
range
form accepts no input arguments and expects the range to be specified syntactically using one, two, or three arguments.
Maybe I'm misreading "currently"—does that refer to range
in Qi's main branch or on this branch? If the former, then I assume that would be the behavior that disappears without extra work. It still begs the question: what "forms" does the new range syntax support? In other words, what's the overlap and what's missing?
Sorry, I've been using "formerly" and "currently" in confusing ways :)
On the main branch, we support
(~> () (range 0 3)) ; => '(0 1 2)
(~> (0) (range 3)) ; => '(0 1 2)
(~> (0 3) range) ; => '(0 1 2)
(~> (0 3) (range _ _)) ; => '(0 1 2)
(~> (0 3) (range __)) ; => '(0 1 2)
(~> (0 3) (range _ _ 1)) ; => '(0 1 2)
(~> (0 3 1) (range _ _ _)) ; => '(0 1 2)
and so on. This is because range
is just an ordinary function, and we can use partial application, fine-grained templates, and blanket templates with it as with any function. But now range
is a form/macro. So currently in this branch, it accepts no arguments, and we only support:
(~> () (range 3)) ; => '(0 1 2)
(~> () (range 0 3)) ; => '(0 1 2)
(~> () (range 0 3 1)) ; => '(0 1 2)
We could use bindings to replace templates, like:
(~> (0 3 1) (as low high step) (range low high step)) ; => '(0 1 2)
How do we feel about this?
But now range is a form/macro. So currently in this branch, it accepts no arguments, and we only support:
I think it's a bit odd that range
expands to a delayed thunk, I guess, but as long as that only affects Qi space…
Does this change happen only when using qi/list
or always? If always, I would consider that a breaking change. (How to handle it is another matter.)
It's only with qi/list
. One goal for this PR is to uphold a clear separation between host language and Qi, so the intent is that we will not modify the behavior of Racket forms in the future (including Racket's range
, which can still be used as before without qi/list
, or, presumably, also by using (require (except-in qi/list range))
-- but in either case, Racket range
would not be included in deforestation).
I think it's a bit odd that range expands to a delayed thunk, I guess, but as long as that only affects Qi space…
Is there another behavior you'd prefer?
It's only with
qi/list
. One goal for this PR is to uphold a clear separation between host language and Qi, so the intent is that we will not modify the behavior of Racket forms in the future (including Racket'srange
, which can still be used as before withoutqi/list
, or, presumably, also by using(require (except-in qi/list range))
-- but in either case, Racketrange
would not be included in deforestation).
Hm. It might be annoying to have different enough semantics that qi/list
isn't drop-in, but that's already the case with map
et al., so I think this is fine and probably the right way to go.
I think it's a bit odd that range expands to a delayed thunk, I guess, but as long as that only affects Qi space…
Is there another behavior you'd prefer?
Sorry, I'm probably mixing concerns here. I think this touches on the semantics of range as a flow, though: I think I'd expect something like (using syntax-parse for convenience of notation)
(define-syntax-parse-rule (range args ...)
(lambda more-args (apply range args ... more-args))
That still (probably) doesn't get you to template support, but it does allow more variations, right? I'm probably getting mixed up. I could see things like (~> (n) range more ...)
being common enough that having to write (~> () (range n) more ...)
instead feels awkward (there's no "principal" or "subject" in the form, except there is).
In any case, I'm happy with the current version (since it's segregated to qi/list
), and I'd like to see future efforts try to make the syntaxed range
work a little more normally. Incremental improvement, as they say.
(A piece of the puzzle: how is range
connected to deforestation?)
Yes, we could add support for more arguments and the usual templates over time (maybe even in the near future) if we want. One thing to note is that the many permutations of range syntax was responsible for a lot of complexity in pattern matching, and in the runtime implementation too IIRC (some of which is still in the code and is probably unused at the moment). It would be worth seeing whether the change to #%deforestable
would end up simplifying that, or if we would still end up introducing that kind of complexity in order to support templates, etc.
re: how range
fits into deforestation, if you had a sequence like this:
(~> () (range 100000000) (filter odd?) (map sqr) car) ;=> 1
If range were not deforested, then we would generate the first list in full before the deforested portion begins at the filter
stage. But if it's included in deforestation, this entire sequence would only process a single element without building the entire list even at the range stage, so it allows us to start the "stream" earlier. Is that what you meant?
Ah, sorry: by "connected" I meant "where are the relevant pieces of implementation." The translation of range to a thunk doesn't help me see how a later pass can look for range to optimize it, which is one reason I'm confused.
Ah, well on the second point, we are now only optimizing Qi core language expressions, not host expressions as we do on the main branch. So, we would never optimize it further once it has already been compiled by the deforestation pass -- at that point, it's now the Racket compiler's responsibility. If on the other hand the deforestation pass does not optimize a range
form, then a subsequent pass would still see it as something like (#%deforestable range (...) (...))
, so it can still optimize it, e.g. if it turns out it is (range k k)
and is just an empty list it could be replaced with null
. Finally, the code generation pass would convert any remaining (#%deforestable range ...)
to a thunk (in the current implementation in this PR), and then it is handed off to Racket in any case.
On the former point, where are the pieces of the implementation, I think @dzoep is most familiar with that.
But since he is on vacation on the coast of France 🧑🎨 🚗 🏖️ , I'll take a stab at it! I'd look in core/.../deforest/syntax.rkt
for the syntax classes that match range
and other fusable components to decide if they can be deforested. And in cps.rkt
, fusion.rkt
for the compiled runtime provided for these fused streams. I'm sure I'm missing some spots but they should all be under flow/core/compiler/deforest/...
.
You may be amused by the latest two commits @benknoble and @dzoep . Now I'm wondering if I should squash the two commits and make it look like I did it right the first time, or retain the actual sequence of events revealing my confusion 😆
I'm still confused: isn't (flow (~> (filter odd?)))
a runtime error when invoked on a list?
@benknoble filter
is a form now rather than a function application, so it's invariant to threading direction (just like ><
, pass
, etc.)
To clarify, then: does that mean that with qi/list it's not possible to flow a function into a filter on a list, or both a function and a list into a filter?
@benknoble Yes, that's right. We could potentially support the kind of positional behavior that we have for regular functions with templates and so on, do you think we should? It might add a lot of complexity but I'm not sure. At that point, we should probably evaluate whether we should be supporting positional behavior across the board for all Qi forms (and not just functions). If we can find a simple, maintainable, way to do it, that could be interesting.
At that point, we should probably evaluate whether we should be supporting positional behavior across the board for all Qi forms (and not just functions). If we can find a simple, maintainable, way to do it, that could be interesting.
I'm not sure I'm going that far.
does that mean that with qi/list it's not possible to flow a function into a filter on a list, or both a function and a list into a filter?
Yes, that's right. We could potentially support the kind of positional behavior that we have for regular functions with templates and so on, do you think we should? It might add a lot of complexity but I'm not sure.
It just seems like this means requiring qi/list
requires (potentially) a lot of changes to use, rather than being close to drop-in, because of various compatibility concerns like this. That's fine, but it may discourage change. New modules could use it without concern, while modules not using it may hesitate to switch. I suppose a possible on-ramp for existing modules is to futz with imports so that I could do (for example) racket:filter
. This comment probably ties in with my concerns about range.
Again, from the perspective of "qi/list
is a separate library and can do what it wants," well, by all means, go be bold ;) I just want to point out I'm unlikely to immediately benefit due to the porting effort required, despite qi/list
being in the same family as qi
.
@benknoble I'd be open to supporting the full complement of template syntax, _
, __
, and everything else, if we are comfortable with the cost of that, viz. maintenance burden of a parallel implementation to the usual one for partial application of functions. It just would mean that we would need to spell out the various forms and expansions in the macro definitions, and also our pattern matching in the compiler to apply optimizations (e.g. all the "vindaloo" curries that are there right now, producer curry, etc. 😋 On the other hand, if we decide not to support templates (as with other Qi forms that aren't functions), we could remove all of these things. We could potentially support your specific example of identifier-only usage, like (~> (odd? '(1 2 3)) filter)
, without supporting templates, if we feel that's a useful mid-point).
But also, that doesn't need to be part of this PR necessarily. The present PR's main purpose is to restructure qi/list
to be based on the new #%deforestable
core form, so that the work on deforestation can then continue. I think this PR is already at that point, so it could make sense to merge this in soon, and we could continue the syntax of these forms as a longer-term discussion on the integration branch. Wdyt?
Any other comments on this PR?
@benknoble We discussed more today in the meeting and said that for some forms (like list-oriented forms), for values computed at runtime, it would be preferable to support templates rather than rely only on bindings, so we brainstormed a possible scheme for defining such forms. I didn't catch all the details but I've summarized it in the meeting notes.
We felt that it would be good to merge the present PR before venturing down these directions though, and then continue against the integration branch.
Further comments welcome, and thank you!
But also, that doesn't need to be part of this PR necessarily.
We discussed more today in the meeting and said that for some forms (like list-oriented forms), for values computed at runtime, it would be preferable to support templates rather than rely only on bindings […]
We felt that it would be good to merge the present PR before venturing down these directions though, and then continue against the integration branch.
Oh, yes: that's all perfectly fine RE: merging to continue work. I probably should have deferred the design discussion to a more appropriate place than a PR review :)
If on the other hand the deforestation pass does not optimize a range form, then a subsequent pass would still see it as something like (#%deforestable range (...) (...)), so it can still optimize it, e.g. if it turns out it is (range k k) and is just an empty list it could be replaced with null. Finally, the code generation pass would convert any remaining (#%deforestable range ...) to a thunk (in the current implementation in this PR), and then it is handed off to Racket in any case.
Re-reading, this seems to be the essence of what I needed for the thunk question: that translation is the "last fallback," if you will. I'm still surprised it's nullary and won't consume further arguments, but the idea that multiple passes first get a stab but you need a final, unoptimized implementation too makes sense.
It's hard to say what the right place for a design review is - I'm glad you raised your concerns somewhere @benknoble 😄
@benknoble @dzoep Any other feedback? If not, then I think we can probably merge this tomorrow.
@benknoble re: the runtime for range
being nullary, that will likely change when we support templates (discussed in last week's meeting).
Merging now, thank you!
We were formerly doing deforestation against host language bindings. That is, sequences of host-language functions that were deforestable from Qi's perspective were matched in the Qi compiler and provided deforested implementations. As these were host functions, they did not require us to specify a runtime in Qi.
This PR introduces an explicit core language form
#%deforestable
, intended to express all deforestable operations within the language. These deforestable operations are now Qi macros inqi/list
expanding to the#%deforestable
core form, and we then match sequences of#%deforestable
in the compiler in order to apply deforestation, keeping compilation formally contained within Qi. These macros also support Qi syntax in higher-order positions.Finally, we must provide a runtime for
#%deforestable
in the code generation step, to capture any such operations that were not deforested. Currently, we explicitly encode the runtimes for each operation, likemap
,filter
, etc., and this achieves the end-to-end behavior we are looking for. Going forward, we would like to define a more generic syntax for#%deforestable
, along with an API (a set of macros, maybe resemblingdefine-qi-deforestable-producer
,transformer
,consumer
) that would allow us to specify the runtime via this interface so that we don't need to encode the runtime directly in the compiler.Summary of Changes
#%deforestable
core formmap
,filter
, and other existing deforested operations inqi/list
range
syntax to accept only syntactic arguments (at least for now)qi/list
qi/list
Migrated from dzoep/qi#2 to use the new integration branch.
Public Domain Dedication
(Why: The freely released, copyright-free work in this repository represents an investment in a better way of doing things called attribution-based economics. Attribution-based economics is based on the simple idea that we gain more by giving more, not by holding on to things that, truly, we could only create because we, in our turn, received from others. As it turns out, an economic system based on attribution -- where those who give more are more empowered -- is significantly more efficient than capitalism while also being stable and fair (unlike capitalism, on both counts), giving it transformative power to elevate the human condition and address the problems that face us today along with a host of others that have been intractable since the beginning. You can help make this a reality by releasing your work in the same way -- freely into the public domain in the simple hope of providing value. Learn more about attribution-based economics at drym.org, tell your friends, do your part.)