masak / alma

ALgoloid with MAcros -- a language with Algol-family syntax where macros take center stage
Artistic License 2.0
137 stars 14 forks source link

Make it possible to define the `amb` declarator #13

Open masak opened 9 years ago

masak commented 9 years ago

This is a tricky one. But awesome.

If we can override statement forms, then we can declare a new statement that starts with the amb keyword.

amb x <- [1, 2, 3, 4];
amb y <- [2, 3, 4];
assert x + y == 6;
assert x < y;
say(x);  # 2
say(y);  # 4

The code would translate internally to something very much like this:

for [1, 2, 3, 4] -> x {
    for [2, 3, 4] -> y {
        if !( x + y == 6 ) {
            next;
        }
        if !( x < y ) {
            next;
        }
        say(x);
        say(y);
    }
}

This one is tricky because amb is more global than local: it changes the code from where it's used to the end of the block. (Where do those for loops end? When the block the amb was used in ends.) Which kind of implies also hijacking/overriding the behavior of block/codeunit compilation.

Alternatively, maybe amb should be restricted to only be used in a certain block, call it ambable:

ambable {
    # put all your ambs here
}

Then we'd be in nested macros territory. I bet the book that vendethiel++ has recommended me will have more answers on this.

Similarly (though less radically), the assert keyword (which we don't have yet, but which could be defined using die (which we don't have yet) in a macro) would need to be hijacked too, since it gets converted to nexts in the generated code. Again, this might be easier to pull off inside an ambable block. Otherwise, the check has to be "did we already see an amb in this block?"

It may turn out that we as a (Perl 6) community decide that something like amb is too tricky to be a macro. (Especially as it does control flow.) And that it should instead be a slang. Note, though, that this is just a distinction; it doesn't provide any further insights into how it should work implementation-wise.

Mouq commented 9 years ago

As a forward, this response is written using a slurry of Perl 6, 007, and psuedocode:

It seems kind of like amb is a macro that takes a statementlist, which might simplify things... however, let's assume it isn't to keep things interesting ;)

One of the things that makes this really interesting is the declaration of the variable, and how that interacts with macro hygiene, which has been a continuing, and, AFAIU, far-from-closed discussion.

For instance, provide that we could access the outer ast somehow (speaking of nested macros...). Perhaps something like some $*OUTER.wrap_ast functionality could be provided that would be given a callback for after the outer statementlist ast is finished being made by the actions and earlier .wrap_asts.

Even with that super-hypothetical functionality, it isn't enough:

macro amb(var-q, list-q) {
    $*OUTER.wrap_ast: -> outer { quasi {
        for $list-q -> $var-q {
            $outer # or {*} or something...
        }
    }}
}
my x = 42;
{
    amb x <- [1, 2, 3, 4];
    assert x == 4;
    say(x); # 42
}

The compiler doesn't know about var-q until too late, and it's already been bound to the outer x. In fact, even the possibility that outer can use var-q here would break quasi hygiene, as far as I understand.

This means that x has to be added at parse time. Okay, no problem if we have some way of poking code into the parser (just making one up now; the semantics aren't important):

macrorule amb {
    $<0>=<.new-var <ident>> '<-'
    $<1>=<.EXPR>
}
slang token new-var ($rule) {
    <var=$rule>
    { $*runtime.declare_var($<var>) }
}

Great, now x will be declared as being part of the inner scope! (or in other cases, a redeclaration might be thrown by .declare_var). Let's try our psuedocode again:

my x = 42;
{
    amb x <- [1, 2, 3, 4];
    assert x == 4;
    say(x); # undefined
}

Oh no! Now we have yet another complication; the var x was already declared by the body of the for loop that the macro wrapped the statementlist with. There's probably some hacky way to work around this; luckily, we have the Q:: classes at our disposal:

macro amb(list-q) {
    $*OUTER.wrap_ast: -> outer {
         Q::Statement::For.new(list-q, outer)
    }
}
macrorule amb {
    <.new-var <ident>> '<-'
    $<1>=<.EXPR>
}
my x = 42;
{
    amb x <- [1, 2, 3, 4];
    assert x == 4;
    say(x); # 4
}

Aaand, look, it's actually simpler! But wait... outer is a statement-list. Now the question becomes whether it's okay to blur the differences between a block and statement-list. But there's a lot of ideas here already and I think I've gone on enough for now :)

masak commented 9 years ago

Note that this blog post introduces "AST transformers". It doesn't really explain them or give a concrete example, but they way I envision them, they would address the issue Mouq points out above.

It would be really fruitful, I think, to brainstorm for an hour or so how AST transformers might look in order to both introduce a new variable declaration of x in the block and also introduce a new inner block which can then be backtracked out of.

masak commented 8 years ago

Coming back to this one, I think a combination of syntax macros and visitor macros could get us there nowadays.

Would be interesting to write up a complete solution to this, and then trying to support it from within 007.

masak commented 7 years ago

Having just re-read the SEND+MORE=MONEY blog post, I think we should steal back the syntax for that one.

That is, amb turns from a statement into a function macro:

amb x <- [1, 2, 3, 4];       # old form
my x = amb([1, 2, 3, 4]);    # new form

I could be persuaded to drop the parentheses, but then it'd also be slightly harder to parse.

Two or more amb calls can now be in the same expression. They'll just be processed in order. If we feel uneasy about amb being able to show up everywhere, not just in expression statements, then we can outlaw them in other places.

And I do like guard much better than assert, which feels like it has connotations from other languages, and is connected with stopping the program, not with killing off a branch of non-deterministic computation.

eritain commented 7 years ago

The more recent notion of the Iterable ( #148 ) would combine powerfully with amb.

masak commented 5 years ago

Now that my is a term, I wonder if <- shouldn't simply be our syntactic way to specify amb. That is, <- is the "amb cousin" of =.

That is, the syntax would be

amb x <- [1, 2, 3, 4];       # old form
my x = amb([1, 2, 3, 4]);    # new form
my x <- 1, 2, 3, 4;          # new new form

I know I'm going back and forth a lot with syntax on this one, but I believe it's for the better.