masak / alma

ALgoloid with MAcros -- a language with Algol-family syntax where macros take center stage
Artistic License 2.0

Introduce reduce metaops #176

Open masak opened 8 years ago

masak commented 8 years ago

I think I just stumbled over our first slang.

I started by thinking up a way to handle reduction ops with is parsed:

macro term:reduce(op: Q::Infix, terms: Q::Expr[])
    is parsed(/
        "[" <infix> "]"
        <.ws>
        <EXPR>* %% ["," <.ws>]
    /) {

    # Ignoring the zero-op case for now -- exercise for the reader
    my ast: Q::Expr = terms[0];
    for terms[1..*] -> term: Q::Expr {  # assuming [1..*] syntax
        ast = quasi { {{{ast}}} {{{op @ Q::Infix}}} {{{term}}} };
    }
    return ast;
}
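For readers who want to see the fold outside 007, here is a minimal Python sketch of what the loop above does: it nests the terms left to right under the given operator. The `Infix` class and the `evaluate` helper are hypothetical stand-ins for Q nodes and the runtime, not part of 007.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass(frozen=True)
class Infix:
    """Hypothetical stand-in for an infix application node in the AST."""
    op: str
    lhs: "Node"
    rhs: "Node"


Node = Union[int, Infix]


def build_reduce_ast(op: str, terms: List[Node]) -> Node:
    """Fold terms left to right, like the for loop in the macro."""
    if not terms:
        # The macro above also leaves the zero-term case open.
        raise ValueError("zero-term case not handled")
    ast = terms[0]
    for term in terms[1:]:
        ast = Infix(op, ast, term)
    return ast


def evaluate(node: Node) -> int:
    """Tiny evaluator, only here to check the shape of the tree."""
    if isinstance(node, Infix):
        ops = {"+": lambda a, b: a + b, "*": lambda a, b: a * b}
        return ops[node.op](evaluate(node.lhs), evaluate(node.rhs))
    return node


tree = build_reduce_ast("+", [1, 2, 3])
print(tree)
print(evaluate(tree))
```

The resulting tree is `(1 + 2) + 3`, which is the left-leaning shape the quasi produces on each iteration.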

Wow, I love this "build-your-own-AST" (last seen in #134).

But then I thought about it a bit more. The above would handle hard-coded elements just fine:

say([+] 1, 2, 3);           # 6

But there would be no mechanism (short of quasi hackery at compile time) to inject an array in there. Which might even be a fairly common case. Perl 6 has prefix:<|> for that. I think 007 would go the JavaScript route and use ...:

my values = [1, 2, 3];
say([+] ...values);         # 6

So now we end up with something like this:

is parsed(/
    "[" <infix> "]"
    <.ws>
    ["..."? <EXPR>]* %% ["," <.ws>]
/)

But hold on. Unlike in #134, we just took a first step down a dangerous garden path: we did with is parsed what could very well be done with the expression parser itself!

Hence, slangs. For the purpose of this issue at least, a slang is a textually scoped, extended parsing environment. In this particular case, the slang is defining two operators:

infix:<,> (which might want list associativity, by the way)
prefix:<...>

I don't have any great idea about the actual underlying mechanisms, but it seems that the requirements involved are clear enough:

So... a slang is... a Q class? With an associated is parsed? I don't know if that will work, but I kind of like the sound of that.

Another thing about infix:<,>: note how we used %%, which means this should work:

my sum = [+] 1, 2, 3,;      # note extra comma
say(sum);                   # 6

If we introduce the comma as an infix, there's currently no way to do that. Maybe together with list associativity we should also introduce is lastable, or something like that.
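To make the difference concrete, here is a rough Python sketch of a separated list with and without a permitted trailing separator. It mimics the `%%` versus `%` distinction on plain strings; the function name `parse_list` and the `allow_trailing` flag are made up for this illustration and say nothing about how an `is lastable` trait would actually work.

```python
from typing import List


def parse_list(text: str, allow_trailing: bool) -> List[str]:
    """Split a comma-separated list, optionally accepting one trailing comma."""
    items = [part.strip() for part in text.split(",")]
    if items == [""]:
        return []  # zero items is fine in both modes
    if allow_trailing and len(items) > 1 and items[-1] == "":
        items.pop()  # the %%-style trailing separator
    if any(item == "" for item in items):
        raise ValueError("missing item next to a comma")
    return items


print(parse_list("1, 2, 3,", allow_trailing=True))
```

With `allow_trailing=False` the same input raises a `ValueError`, which is the situation an ordinary infix comma would be in.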

vendethiel commented 8 years ago

Do we not have an issue for is parsed? I was wondering if is parsed should always be installed at EXPR level.

masak commented 8 years ago

It gets installed in the category of your choice, for example term or infix.

masak commented 8 years ago

Oh hey, wait. <EXPR>* %% ["," <.ws>] — where have I seen that before?

By some remarkable coincidence, that's the argumentlist rule in the 007 grammar. (It's written a bit differently, but it comes down to the same thing.)

So, yes Virginia: we can get our commas for free!

is parsed(/
    "[" <infix> "]"
    <.ws>
    <argumentlist>
/)

Two immediate objections:

But no, it's re-using <argumentlist>. As seen in Perl 6, the reduce metaop is a callable thing, taking arguments:

$ perl6 -e 'say [~](1, 2, 3)'
123

And if we implement #178 separately, then we're back to the proposed syntax of this issue:

$ perl6 -e 'say [**] 2, 3, 4'
2417851639229258349412352

And prefix:<...> is by rights something that should be defined so that ordinary calls/arguments benefit from it too. Let's say the <argumentlist> rule was instead this:

<argument>* %% ["," <.ws>]

And the new argument category was defined by default as

rule argument:expr { <EXPR> }

Then we could get our prefix:<...> analogue like this:

rule argument:spread { "..." <EXPR> }

Which feels better anyway, because now we don't have the problem of someone using the prefix twice in a row, or somewhere non-toplevel.

masak commented 8 years ago

Of course, a real implementation of this macro would also take into account the associativity (or lack thereof) of the infix operator.
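A small Python sketch of why associativity matters here: folding the same values from the left and from the right gives different answers for exponentiation. The helper names `reduce_left` and `reduce_right` are illustrative only; this is not how the macro would pick a direction.

```python
from functools import reduce
import operator


def reduce_left(op, values):
    """((a op b) op c): suits left-associative operators."""
    return reduce(op, values)


def reduce_right(op, values):
    """(a op (b op c)): suits right-associative operators such as **."""
    return reduce(lambda acc, val: op(val, acc), reversed(values))


print(reduce_left(operator.pow, [2, 3, 4]))   # (2 ** 3) ** 4
print(reduce_right(operator.pow, [2, 3, 4]))  # 2 ** (3 ** 4)
```

The right fold gives 2 ** 81, which matches the `[**] 2, 3, 4` output shown earlier; the left fold gives 4096.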

masak commented 8 years ago

And, as it turns out, we can take the <argumentlist> idea one step further — by not parsing that bit at all.

All we really want is a term that resolves to something that you can call, either with parentheses or as a listop, and it Does The Right Thing:

use rest_parameters;
macro term:reduce(op: Q::Infix) is parsed(/"[" <infix> "]"/) {
    my name = ...;   # something nice and introspectable
    return quasi {
        (sub {{{name @ Q::Identifier}}}(*values) {
            return values.reduce(sub (accum, val) {
                return accum {{{op @ Q::Infix}}} val;
            });
        })
    };
}

...I believe that's our first case of two-layers-of-sub-between-quasi-and-unquote.

The parentheses around the outer sub are there to turn it from a Q::Statement::Sub into a Q::Term::Sub. Our goal isn't to define a function (and then hygienically throw it away); it's to generate a first-class function value that can then be called. Statements stubbornly don't have values in 007.
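The same idea in Python, as a sketch only: instead of consuming the arguments itself, the "macro" returns a function value that takes any number of arguments and reduces them. The `OPS` table and `make_reducer` are hypothetical names for this illustration, and the zero-argument case is left unhandled, as above.

```python
from functools import reduce
import operator

# Hypothetical operator table standing in for the parsed <infix>.
OPS = {
    "+": operator.add,
    "*": operator.mul,
    "~": lambda a, b: str(a) + str(b),  # string concatenation, as in Perl 6
}


def make_reducer(symbol):
    """Return a first-class function that reduces its arguments with one operator."""
    op = OPS[symbol]

    def reducer(*values):
        # Inner function between the "quasi" and the "unquote", as in the macro.
        return reduce(lambda accum, val: op(accum, val), values)

    return reducer


print(make_reducer("+")(1, 2, 3))
print(make_reducer("~")(1, 2, 3))
```

Because the result is an ordinary function value, calling it with parentheses comes for free from the normal call machinery.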

masak commented 8 years ago

prefix:<...>

Actually, I don't remember my own decisions/speculations, it seems: we've already decided in #112 to go with * for this (from Python and Perl 6, instead of from JavaScript). That's for parameters, but it makes sense to use it for argument spread too (like Python, unlike Perl 6).

Instead of going back and fixing this whole issue, I kindly ask the reader to please just hallucinate that any ... really says * above.
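For reference, this is the Python behavior being borrowed, where the same `*` marks both a rest parameter and an argument spread. The function name `total` is just an example.

```python
def total(*values):
    """The * in the parameter list collects all positional arguments."""
    return sum(values)


numbers = [1, 2, 3]
print(total(*numbers))         # the * at the call site spreads the list
print(total(0, *numbers, 4))   # spreads can be mixed with ordinary arguments
```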

masak commented 8 years ago

Here's a shorter version:

use rest_parameters;
macro term:reduce(op: Q::Infix) is parsed(/"[" <infix> "]"/) {
    my name = ...;   # something nice and introspectable
    return quasi {
        (sub {{{name @ Q::Identifier}}}(*values) {
            return values.reduce({{{op.identifier}}});
        })
    };
}

masak commented 5 years ago

The term:reduce rule has "[" as its declarative prefix, as does the built-in term:array. The place where they differ is in the next atom, where the former has <infix> and the latter has <expr>.

It's lucky that these two are unambiguous. In the core language. But then someone comes along and declares term:<*> (for their every Whatever and WhateverCode needs)... and then there's an ambiguity. It's a "latent ambiguity".

When and how is this ambiguity resolved? I don't know. I feel [*] should be interpreted as a term:reduce, even in the face of a term:<*>.

This section of S02 seems to indicate that this is a place where look-ahead more than one "longest token" is employed. I... that makes sense, I think. So maybe we need to do it too.
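One possible reading of that look-ahead, sketched in Python: peek past the opening bracket, and if everything up to the closing bracket is exactly a known infix, prefer term:reduce; otherwise fall back to term:array. The `INFIXES` set and `classify_bracket_term` are invented for this sketch and are not taken from S02 or from the 007 grammar.

```python
import re

# Hypothetical set of currently declared infix operators.
INFIXES = {"+", "-", "*", "**", "~"}


def classify_bracket_term(src: str) -> str:
    """Decide between term:reduce and term:array by looking ahead to the "]"."""
    match = re.match(r"\[([^\]\s]+)\]", src)
    if match and match.group(1) in INFIXES:
        return "term:reduce"
    return "term:array"


print(classify_bracket_term("[*] 1, 2, 3"))
print(classify_bracket_term("[1, 2, 3]"))
```

Under this rule `[*]` stays a reduction even if someone declares term:<*>, at the cost of making a one-element array holding that term impossible to write without extra whitespace or parentheses.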

masak commented 5 years ago

We haven't even talked about [\+] yet. I wouldn't mind it. Dunno if it should be in a separate module? Probably not.
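For anyone unfamiliar with it, `[\+]` is the "triangular" reduce that keeps every intermediate result. Python's `itertools.accumulate` does the same thing for left-to-right operators, so a sketch is short. The wrapper name `triangle_reduce` is made up here, and right-associative operators are not covered.

```python
from itertools import accumulate
import operator


def triangle_reduce(op, values):
    """Return every intermediate result of a left-to-right reduction."""
    return list(accumulate(values, op))


print(triangle_reduce(operator.add, [1, 2, 3]))
```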