masak / alma

ALgoloid with MAcros -- a language with Algol-family syntax where macros take center stage
Artistic License 2.0

Implement ES6-style arrow functions #215

Open masak opened 7 years ago

masak commented 7 years ago

Preferably as a language extension.

Even I'm getting tired of writing

my squares = values.map(sub (v) { return v * v });

when I could instead write

my squares = values.map(v => v * v);

(Edit: But see comment below about ->.)

Note that I'm not proposing any new semantics; just a short form for sub terms.

I don't see why we shouldn't allow all four of ES6's combinations: with and without parens (in the one-parameter case); with and without curly braces (in the one-return-expression case).
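To make the four combinations concrete, here is a rough sketch in Python (not in 007 itself, and not how the language extension would actually be implemented). `desugar_arrow` is a hypothetical helper that rewrites a single `=>` arrow function into the long `sub` form using a regex; it assumes no nested arrows, parens, or braces.

```python
import re

# Hypothetical helper, for illustration only: rewrites one arrow function
# (spelled with =>) into the long `sub` term it is meant to abbreviate.
ARROW = re.compile(
    r"""^\s*
        (?: \( (?P<params>[^()]*) \)   # parenthesized parameter list
          | (?P<single>\w+) )          # or one bare parameter
        \s*=>\s*
        (?: \{ (?P<block>.*) \}        # braced body, kept as-is
          | (?P<expr>.+?) )            # or one expression to return
        \s*$""",
    re.VERBOSE | re.DOTALL,
)

def desugar_arrow(source: str) -> str:
    m = ARROW.match(source)
    if m is None:
        raise ValueError(f"not an arrow function: {source!r}")
    raw = m.group("single") if m.group("single") is not None else m.group("params")
    params = ", ".join(p.strip() for p in raw.split(",") if p.strip())
    if m.group("block") is not None:
        body = m.group("block").strip()
    else:
        # A brace-less body is a single expression that gets returned.
        body = f"return {m.group('expr').strip()}"
    return f"sub ({params}) {{ {body} }}"

# The four combinations in the one-parameter, one-expression case
# all desugar to the same sub term.
for form in ["v => v * v", "(v) => v * v",
             "v => { return v * v }", "(v) => { return v * v }"]:
    print(desugar_arrow(form))
```

All four forms come out as `sub (v) { return v * v }`, which is the point: no new semantics, only a shorter spelling.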

vendethiel commented 7 years ago

what about multi-arg subs? Add parens like ES6?

masak commented 7 years ago

Yes, the parens are required for everything except the one-parameter case.

masak commented 7 years ago

Know what the most exciting part of this issue is? That the macro would itself need to declare its arrow function parameters, bounded by the arrow body. (Regardless of whether there are braces around the body or not.)

My first instinct is that this should be part of the is parsed regex somehow. Possibly as a <{ }>-style code body. Doing it later than that is too late, since the parser will already have parsefailed.

The most exciting thing is that this would also start to answer the questions in #159.

Note to self: we need a "declaration protocol" to handle this.

masak commented 7 years ago

We'd need one <{ }> thing for "opening" a scope with declared things, and another <{ }> thing for closing it.

Assiduous followers of Edward A. Murphy Jr. will know that having these as two separate components is asking for trouble; someone will forget the closer, and sooner rather than later.

This suggests something like this instead:

<{ declare(identifiers, inner_regex) }>

That is, do an "inferior runloop" kind of thing with the declare() call wrapping the inner regex.

"But what about backtracking?" you say. Good point. Luckily, by the time we're declaring new identifiers in a scope, we need to already be past the point where the parser can still backtrack out of this construct. This should probably be a rule somewhere.
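A minimal sketch of why the wrapping form is safer, written in Python rather than in the is parsed regex language. Everything here is an assumption made for illustration: `Scope`, `declare`, and `parse_identifier` are hypothetical names, and the inner regex is modeled as a plain callback. Because the scope is opened and closed inside one call, there is no separate closer to forget, even when the inner parse fails.

```python
from contextlib import contextmanager

class Scope:
    """Hypothetical stack of declaration frames kept by the parser."""

    def __init__(self):
        self.frames = [set()]

    def is_declared(self, name):
        return any(name in frame for frame in self.frames)

    @contextmanager
    def opened(self, identifiers):
        # Opening and closing live in one place, so they cannot drift apart.
        self.frames.append(set(identifiers))
        try:
            yield
        finally:
            self.frames.pop()

def declare(scope, identifiers, inner_parse):
    """Run inner_parse with the identifiers declared, then undeclare them."""
    with scope.opened(identifiers):
        return inner_parse()

def parse_identifier(scope, name):
    """Stand-in for the inner regex: accept only declared names."""
    if not scope.is_declared(name):
        raise SyntaxError(f"undeclared variable: {name}")
    return name

scope = Scope()
print(declare(scope, ["x"], lambda: parse_identifier(scope, "x")))
print(scope.is_declared("x"))  # the declaration does not outlive the body
```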

masak commented 7 years ago

Not sure => is the best choice. I think -> has two advantages. No, three:

  1. It's less confusable with >= and <=, two fairly normal comparison operators. The visual similarity between <= and => when they do such different things in the language (to the point of being in two completely different grammatical categories) is frankly offensive.
  2. Quoting TimToady loosely: "The -> is meant to be looked at with your head tilted, so it looks kind of like a λ."
  3. (I just realized:) 007 already has -> for blocks, and the connection between parameterized blocks and arrow functions is not frivolous.

(JavaScript and C# opt for =>, whereas Perl 6 and Java spell it ->, Perl 5 spells it sub, and Python spells it lambda.)

masak commented 7 years ago

> Know what the most exciting part of this issue is? That the macro would itself need to declare its arrow function parameters, bounded by the arrow body. (Regardless of whether there are braces around the body or not.)
>
> My first instinct is that this should be part of the is parsed regex somehow. Possibly as a <{ }>-style code body. Doing it later than that is too late, since the parser will already have parsefailed.

But... it's actually worse than this. We're running into a parsing problem that JavaScript already has, perhaps the most interesting one JavaScript has, in fact:

Consider these two statements:

say((x, y) -> x);
say((x, y));

The first one prints the anonymous function value generated by the arrow function. The second one doesn't parse, because comma is not an operator and (we assume) x and y weren't declared.

The heart of the matter is this: we need to declare x and y (and accept the comma in that position) for the arrow function, but the signal that tells us it is an arrow function is the -> which comes later.

(Oh! So that's why Perl 6 spells it -> x, y { x } with the arrow coming first, because the -> coming first really helps with parsing. Also — see TimToady quote above — that's where the λ'd go. Worth considering, perhaps.)

So we need to defer declaration a little bit, because it can't really happen until we've seen the ->.

Or, you know, we just give up and copy Perl 6 and front the -> and avoid the need for any deferral mechanism. Food for thought.
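One possible deferral mechanism is plain lookahead: before parsing a parenthesized group as an ordinary expression, scan to its matching close paren and check whether a -> follows. The Python sketch below illustrates that idea only; it is not how 007 parses, and `tokenize` and `arrow_params` are hypothetical helpers that know about identifiers, parens, commas, and -> and nothing else.

```python
import re

TOKEN = re.compile(r"\s*(->|[A-Za-z_]\w*|[(),])")

def tokenize(source):
    """Split source into identifiers, parens, commas, and arrows."""
    tokens, pos = [], 0
    source = source.rstrip()
    while pos < len(source):
        m = TOKEN.match(source, pos)
        if m is None:
            raise SyntaxError(f"unexpected character at offset {pos}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens

def arrow_params(tokens, start):
    """Return the parameter names if an arrow function starts at tokens[start],
    or None if what starts there is just an ordinary term."""
    if tokens[start] != "(":
        # One-parameter case without parens: `x -> ...`
        if (tokens[start].isidentifier()
                and start + 1 < len(tokens) and tokens[start + 1] == "->"):
            return [tokens[start]]
        return None
    # Scan ahead to the matching close paren.
    depth = 0
    for i in range(start, len(tokens)):
        if tokens[i] == "(":
            depth += 1
        elif tokens[i] == ")":
            depth -= 1
            if depth == 0:
                break
    else:
        return None  # unbalanced parens
    # Only now can we tell: is the group followed by an arrow?
    if i + 1 >= len(tokens) or tokens[i + 1] != "->":
        return None
    inner = tokens[start + 1:i]
    names, commas = inner[0::2], inner[1::2]
    well_formed = len(inner) == 0 or len(inner) % 2 == 1
    if (well_formed and all(n.isidentifier() for n in names)
            and all(c == "," for c in commas)):
        return names  # safe to declare these before parsing the body
    return None

print(arrow_params(tokenize("say((x, y) -> x)"), 2))
print(arrow_params(tokenize("say((x, y))"), 2))
```

The first call reports `['x', 'y']` as parameters to declare; the second returns `None`, so `(x, y)` is left to the ordinary expression parser, which rejects it. The cost is that the paren group may be scanned twice.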

vendethiel commented 7 years ago

Although: Perl 6 has its own {} is a hash/block issue.

masak commented 7 years ago

If we went with the -> x { x } form, I think we'd have to require the block, just like Perl 6 does. Not because -> x x would be unparseable (just detect the TTIAR, i.e. two terms in a row, and consider the lambda body to start there) but because it looks ugly and strange.

Heh, or we take a leaf from actual λ calculus syntax, and spell it -> x. x 😛
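For comparison, a sketch of the arrow-first form in Python, under the same caveat that this is an illustration and not 007's parser. `parse_pointy_block` is a hypothetical function working on a pre-split token list; body tokens are assumed to be names and operators only. Since the -> comes first, each parameter can be declared the moment it is read, and no lookahead or re-parse is needed.

```python
def parse_pointy_block(tokens):
    """Parse `-> x, y { body }` from a list of string tokens.

    Returns (params, body_tokens). The block is required, so `-> x x`
    is rejected as two terms in a row.
    """
    if not tokens or tokens[0] != "->":
        raise SyntaxError("expected '->'")
    params, pos = [], 1
    while pos < len(tokens) and tokens[pos] != "{":
        if not tokens[pos].isidentifier():
            raise SyntaxError(f"expected a parameter name, got {tokens[pos]!r}")
        params.append(tokens[pos])  # declared as soon as it is seen
        pos += 1
        if pos < len(tokens) and tokens[pos] == ",":
            pos += 1
        elif pos < len(tokens) and tokens[pos] != "{":
            raise SyntaxError("two terms in a row: a block is required")
    if pos >= len(tokens) or tokens[-1] != "}":
        raise SyntaxError("expected a '{ ... }' block")
    body = tokens[pos + 1:-1]
    # By the time the body is checked, every parameter is already known.
    for token in body:
        if token.isidentifier() and token not in params:
            raise SyntaxError(f"undeclared variable: {token}")
    return params, body

print(parse_pointy_block(["->", "x", ",", "y", "{", "x", "}"]))
```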

masak commented 7 years ago

This is a rare language design case where my heart tells me one thing and my head another.

  1. I like the way x -> x looks. It's easy to take in at a glance, and the arrow is in a place that emphasizes the "functional mapping" from parameters to the resulting expression.

  2. It breaks the rule of "Always know what language you're in". Suddenly we'd have to give every undeclared variable or parenthetical expression the benefit of the doubt because it might be the prelude to a ->. (This is exactly the situation V8's JavaScript parser finds itself in, sometimes with ensuing hilarity.) If/when we do find a ->, we effectively have to go back and re-parse for the side effects. I think that even if we don't find a ->, we have to go back and re-parse for the side effects. Planting that kind of doubt into the parser feels very un-Perl 6-like.

On the gripping hand, we are talking about a language extension, so it's not like we're doing damage to 007 proper.

Hm. 😄

masak commented 7 years ago

I think this is an instance of "why not have both"? We should be able to support both the nice, Perl 6-ish way and the intuitive but rule-breaking way.

[insert "por que no los dos?" image macro here]