tc39 / proposal-generator-arrow-functions

ECMAScript proposal: Generator Arrow Functions

Syntax #2

Open chicoxyzzy opened 4 years ago

chicoxyzzy commented 4 years ago

Possible solutions

Arrow function syntax

// Irregular
() =*> ...

// not the same order as in regular generator functions
() =>* ...

// also wrong order
() *=> ...

// ASI hazard
*() => ...

Introduce new generator keyword for both function and arrow function

generator function() {}
const foo = async generator function() {};

class Foo {
  x = 1
  // No more ASI hazard!
  generator foo() {}
}

Previous discussions https://github.com/tc39/proposals/issues/216

jridgewell commented 4 years ago

@littledan and I had some hallway-track conversations about this. My thoughts were to add a gen keyword (much like the async keyword), and treat the * sigil as the ugly wart it is:

gen function foo() {} // same as function* foo() {}
foo = gen () => {};

class Foo {
  x = 1
  // No more ASI hazard!
  gen foo() {}
}

When creating an async generator, the order would be async gen.
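For reference, the forms that a gen keyword would alias already exist with the * sigil, and the class-field hazard that the "No more ASI hazard!" comments refer to can be checked by parsing. This is only a sketch: canParse is a hypothetical helper introduced here for illustration, and it assumes an engine that supports class fields.

```javascript
// Hypothetical helper: reports whether `source` parses as a function body.
function canParse(source) {
  try {
    new Function(source);
    return true;
  } catch (err) {
    if (err instanceof SyntaxError) return false;
    throw err;
  }
}

// Today's spellings of what `gen` and `async gen` would express.
function* syncGen() { yield 1; }
async function* asyncGen() { yield 1; }

// The class-field hazard: without a semicolon, `x = 1` continues onto the
// next line, so `*foo() {}` is read as a multiplication and fails to parse.
const withoutSemicolon = canParse("class Foo {\n  x = 1\n  *foo() {}\n}");
const withSemicolon = canParse("class Foo {\n  x = 1;\n  *foo() {}\n}");

console.log(withoutSemicolon, withSemicolon); // false true
```

A keyword-prefixed method would presumably avoid this because the next line would no longer start with an operator.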

chicoxyzzy commented 4 years ago

There is an ASI hazard here

foo = gen ()
  => {};

jridgewell commented 4 years ago

That's a syntax error currently, right? I don't understand how it would cause a new ASI hazard.
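The claim that the split form is rejected today can be checked with a quick parse test. A minimal sketch, where canParse is a hypothetical helper and gen is just an ordinary identifier as far as current engines are concerned:

```javascript
// Hypothetical helper: reports whether `source` parses as a function body.
function canParse(source) {
  try {
    new Function(source);
    return true;
  } catch (err) {
    if (err instanceof SyntaxError) return false;
    throw err;
  }
}

// A line break before `=>` is not allowed, so this does not parse today.
const splitArrow = canParse("foo = gen ()\n  => {};");

// The existing `async` arrow form has the same restriction.
const splitAsyncArrow = canParse("foo = async ()\n  => {};");

// On its own, the first line is just a call to a function named `gen`.
const callOnly = canParse("foo = gen ()");

console.log(splitArrow, splitAsyncArrow, callOnly); // false false true
```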

MaxGraey commented 4 years ago

I can imagine only two variants:

1.

foo = (*)(arg) => {};

2.

foo = (*(arg)) => {};

The second could be simplified for one or zero arguments as:

foo1 = (*arg) => {};
foo0 = (*) => {};

ljharb commented 4 years ago

Adding a new gen is interesting, but there's no value imo in abbreviating - why not generator? (it's not func or async fn etc)

MaxGraey commented 4 years ago

A more radical solution is to use different brackets, like [] instead of ():

foo1 = [] => {}
foo2 = [[first, second]] => {}  // with array destructuring
foo3 = [a, b] => {}

chicoxyzzy commented 4 years ago

TBH, any of *=>, =>* or =*> look totally fine and clear to me.

bathos commented 4 years ago

*=> seems least surprising to me, with =*> seeming least obvious (it helps to still have ‘the arrow’ for recognition).

jridgewell commented 4 years ago

Adding a new gen is interesting, but there's no value imo in abbreviating - why not generator?

Either is fine with me.

*=> seems least surprising to me, with =*> seeming least obvious (it helps to still have ‘the arrow’ for recognition).

Least surprising, but still super ugly. I (as a relatively experienced JS dev) still struggle with the order for function * foo() {} ("is it before function, in between, or after the identifier?"). I imagine new devs are just as confused ("what does the star mean?").

A keyword would be less esoteric, and much easier to Google for. So extending generator support to arrow functions gives us a chance to tackle both with one proposal. 😃

rumkin commented 4 years ago

I vote for keyword gen and *=>.

rumkin commented 4 years ago

@ljharb

Adding a new gen is interesting, but there's no value imo in abbreviating - why not generator? (it's not func or async fn etc)

The answer is in your question: just as async is short for asynchronous, gen could stand for generator.

janwirth commented 4 years ago

I vote for

const myGen = async gen () => yield await sth()
// or
async gen function myGen () { yield await sth() }

I think it is more writable and readable than *.

Thoughts on other programming languages

ljharb commented 4 years ago

@rumkin async is a well-known abbreviation for asynchronous, and is effectively a word on its own; gen is not.

janwirth commented 4 years ago

@ljharb - I agree. I also think it is easier to type and thus less useful to abbreviate.

Researchers found that "Shorter identifier names take longer to comprehend" https://link.springer.com/article/10.1007%2Fs10664-018-9621-x

bathos commented 4 years ago

If a keyword is introduced for the arrow case, does it necessarily mean also introducing it for the ‘longhand’ case?

ljharb commented 4 years ago

For consistency, I would hope so.

janwirth commented 4 years ago

If we look to Python for inspiration, we notice that it has no keyword to declare a generator; it is implied by yield.

rumkin commented 4 years ago

https://link.springer.com/article/10.1007%2Fs10664-018-9621-x

@FranzSkuffka, I think this work isn't relevant for well-known language syntax. It's about new identifiers. Such identifiers are always held in the brain's short-term memory, and meaningful names help our brain to build an abstract model faster.

chicoxyzzy commented 4 years ago

Added generator keyword as a possible solution to the README.md and to the first message of this issue

inoyakaigor commented 4 years ago

How about const *asd = () => {}?

bathos commented 4 years ago

That’s between a const keyword and a const binding identifier. The arrow function is the part in the initializer, after the equals sign. These expressions can appear in many places, not just in const declarations.

Jamesernator commented 4 years ago

Is there any ASI hazard to *() => {} other than a lonely *() => {} expression statement? If not I don't really think it's compelling to discount that.

constb commented 4 years ago

hi! why not *>? like (x, y, z) *> { … } and async (x, y, z) *> { … }

jashsayani commented 4 years ago

Functions have gone from function foo(bar) {} to const foo = (bar) => {}.

Hence it feels like function* generator(i) {} would be const* generator = (i) => {}, but this clearly opens up a can of worms (like people typing const* a = 1) and would annoy people from other languages who are familiar with pointers. So ignoring that, *() => {} makes the most sense intuitively.

bathos commented 4 years ago

@Jamesernator

IIUC that’s the only case, yeah. The following would parse as a MultiplicativeExpression with an invalid right hand side:

foo
* () => {};

The asterisk would match MultiplicativeOperator and the parentheses would match CoverParenthesizedExpressionAndArrowParameterList. If the covered part refined successfully to ParenthesizedExpression (e.g. * (a)), then failure would occur at =>; otherwise it would occur at the parens themselves. Either way, no arrow function.

That said, I wouldn’t have called this a hazard: it would throw a SyntaxError up front. It’s not a ‘trap’ like

foo
[bar]

The extent of the ‘hazard’ is just that a semicolon is necessary. The absence of one poses no risk of producing code that evaluates at all, much less with a different meaning from what was intended. (AFAICT)
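The difference between the two cases can be observed directly. A minimal sketch, assuming a hypothetical canParse helper; the variable names are illustrative only:

```javascript
// Hypothetical helper: reports whether `source` parses as a function body.
function canParse(source) {
  try {
    new Function(source);
    return true;
  } catch (err) {
    if (err instanceof SyntaxError) return false;
    throw err;
  }
}

// Leading `*` after a line break: rejected up front, nothing ever runs.
const leadingStarParses = canParse("foo\n* () => {};");

// Leading `[` after a line break: accepted, but read as `foo[bar]`.
const trap = new Function(
  "const foo = ['a', 'b'];\n" +
  "const bar = 1;\n" +
  "const result = foo\n" +
  "[bar]\n" +
  "return result;"
);
const trapResult = trap();

console.log(leadingStarParses, trapResult); // false b
```

The first case fails loudly at parse time, while the second silently evaluates to something the author may not have intended.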

jamiebuilds commented 4 years ago

Does the keyword option have to be a variant on the word generator?

When teaching them, the word "Generators" has caused a lot of confusion in how they relate to "Iterators". So I started referring to them as "Iterator Functions" and that seemed to help people understand their relationship.

So I think iterator could be a good alternative choice for a keyword.

But even better than that: "Iter" is already a well-established abbreviation for "Iterator" so the keyword iter could work:

iter function fn() {...}
async iter function fn() {...}

bathos commented 4 years ago

I would think that making the keyword for defining generator functions iter would increase, not decrease, confusion regarding the subset-superset relationship between generators and iterators, no?

jamiebuilds commented 4 years ago

Maybe, but as it stands right now the relationship isn't seen by many. Anecdotally, I've talked to developers who describe generators as "pausable functions" (not sure where this comes from) or "how async functions are implemented under the hood" (which I think comes from transpilers), and I've seen them use the iterator protocol directly with while loops and calling .next() instead of using for..of. I suspect generators are underutilized (not that they are something every developer would use daily) for this reason.
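To make the comparison concrete, here is a small sketch of the two consumption styles mentioned above. countTo is a made-up generator used only for illustration:

```javascript
// Illustrative generator: yields 1..limit.
function* countTo(limit) {
  for (let i = 1; i <= limit; i++) yield i;
}

// Driving the iterator protocol by hand with a while loop and .next().
const manual = [];
const iterator = countTo(3);
let step = iterator.next();
while (!step.done) {
  manual.push(step.value);
  step = iterator.next();
}

// The same consumption expressed with for...of.
const viaForOf = [];
for (const value of countTo(3)) viaForOf.push(value);

console.log(manual, viaForOf); // [ 1, 2, 3 ] [ 1, 2, 3 ]
```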

ljharb commented 4 years ago

A generator is a pauseable (synchronous) function. It also produces an iterator (but isn’t one). They are (sadly) the dominant implementation detail for transpiler output of async/await; I think that’s indeed where that one comes from.

Given that the syntax is async function but it returns a Promise, the keyword seems to suggest what it is and not what it produces. Similarly, I’d expect a generator keyword not to be about the iterator it produces, but about what it is (a value generator).

bathos commented 4 years ago

I think I agree with the gist of the previous comment, but would point out that a ‘generator function’ produces a generator and that a generator is a specific kind of iterator. The function itself isn’t an iterator or a generator, at least not as the spec defines those terms.

ljharb commented 4 years ago

@bathos is it really a specific kind of iterator? despite the existence of observable distinguishing factors (method toStrings, prototype chain, etc), if you solely consume the iterator, how is it possible to observe a difference?

jamiebuilds commented 4 years ago

 A generator is a pauseable (synchronous) function.

Neither description is technically incorrect, but they miss the important detail that they always produce an iterator/iterable, and using the iterator protocol is the only way that you can use them (even if you are calling .next() directly). Also, I think that you'd find that "value generator" is somewhat meaningless when trying to explain generators to developers (most functions "generate" "values").

bathos commented 4 years ago

@ljharb Reviewing the defs, I’m probably incorrect, at least as far as more formal terminology goes. The iterator interface is defined in terms of required properties and optional properties; I’d have considered the ‘optional properties,’ which concern control flow, to be the things which qualify an iterator as ‘a generator.’ But the spec does not use the term generator in this section.

That said, the generator-function vs generator distinction is consistent, and is reflected in naming and inherited toStringTag values.
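The naming distinction can be observed from the inherited toStringTag values. A minimal sketch; numbers is an illustrative generator function, not something from the proposal:

```javascript
function* numbers() { yield 1; }
const generatorObject = numbers();

// The function and the object it returns carry different tags.
const functionTag = Object.prototype.toString.call(numbers);
const objectTag = Object.prototype.toString.call(generatorObject);

// The generator object is an iterator (and its own iterable);
// the generator function is neither.
const objectIsOwnIterator = generatorObject[Symbol.iterator]() === generatorObject;
const functionIsIterable = typeof numbers[Symbol.iterator] === "function";

console.log(functionTag); // [object GeneratorFunction]
console.log(objectTag);   // [object Generator]
console.log(objectIsOwnIterator, functionIsIterable); // true false
```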

jamiebuilds commented 4 years ago

given that the syntax is async function but it returns a Promise

I'd also throw out there that async functions can be used without directly using the promise it creates, but generators cannot be used without using iterators:

async function main() { ... }
main() // will execute the entire function as it is written and many node programs do just this

function *iter() { ... }
iter() // never executes the function body, and there's no reason to do this.

bathos commented 4 years ago

By ‘without using iterators’, I take it you mean that there’d be no point in iter() if the result object is ignored, right? (Ignoring weird cases where evaluation of the arguments has its own side effects.)

I think what still confuses me with this name is that I would not have referred to a generator object which is being used for control flow, as opposed to iteration, as an ‘iterator’, even though it technically also is one. Such generators may satisfy the definition of iterator, but they usually won’t actually work as iterators, because they typically expect values to be received through next(), unlike generators intending to be used for iteration, which ignore these arguments.
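A sketch of the kind of generator described here, one that expects values through next(). runningTotal is a made-up example; the point is only that for...of resumes the generator without an argument, so the value received from yield is undefined:

```javascript
// Illustrative generator that relies on values sent in through next().
function* runningTotal() {
  let total = 0;
  for (;;) {
    const amount = yield total; // receives the argument of the next() call
    total += amount;
  }
}

// Used as intended: the caller sends values in.
const accumulator = runningTotal();
accumulator.next(); // run to the first yield
const afterFive = accumulator.next(5).value;
const afterSeven = accumulator.next(7).value;

// Used as a plain iterator: `amount` is undefined, so the total becomes NaN.
const seen = [];
for (const value of runningTotal()) {
  seen.push(value);
  if (seen.length === 2) break;
}

console.log(afterFive, afterSeven, seen); // 5 12 [ 0, NaN ]
```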

jamiebuilds commented 4 years ago

Yes, although fair point that calling a generator function is not completely unobservable without calling .next() (but that's well within the realm of edge cases).

Such generators may satisfy the definition of iterator, but they usually won’t actually work as iterators, because they typically expect values to be received through next(), unlike generators intending to be used for iteration, which ignore these arguments.

I'm not sure what you mean. Even if you are calling .next() directly without an argument or using the return value, you're still using the iterator interface, and I'm not sure I've seen any usage of generators that doesn't involve a loop (while (!done), recursion, etc) around .next() so any real-world use case will still resemble an "iterator" in the abstract.

bathos commented 4 years ago

Yeah. I think I understand your angle now, thanks. It seems I just must have a different working definition of what constitutes ‘iteration’ at a high level, because e.g. when using generators to implement recursive descent parsing I would find it confusing to call that control-flow usage ‘iteration’ (mainly because of bidirectional communication). But I don’t think you’re incorrect to call it that, either.

jamiebuilds commented 4 years ago

I'm sure you understand this already, but here's an example of using an iterator for recursive descent with a generator function: https://gist.github.com/jamiebuilds/7322021dd6584ef7d2cbd08cd637d683

bathos commented 4 years ago

Yep — I probably should have chosen a less broad example, I was referring to recursive descent parsing rather than visitation, specifically where yield is used to get the next terminal and yield FOO is used to return feedback (‘consume it’, ‘switch lexical mode’, etc) and return values are parse nodes. Coroutine-ish?

jamiebuilds commented 4 years ago

Even if you're returning some sort of op codes via yield you'd still have a loop at the top-level somewhere which receives the next op code and resumes. Here's another quickly thrown together example: https://gist.github.com/jamiebuilds/86eb0b64bd4a18950307211cf8297fb0
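The shape being discussed can be sketched without reference to the linked gists (whose contents are not reproduced here). Everything below is hypothetical: the "NEXT_TOKEN" op code, parseSum, and drive are invented for illustration. The generator yields a request, and a top-level loop answers it and resumes:

```javascript
// Hypothetical coroutine-style parser: yields a request for the next token
// and returns a parse node when the input is exhausted.
function* parseSum() {
  let total = 0;
  for (;;) {
    const token = yield "NEXT_TOKEN";
    if (token === undefined) return { type: "Sum", value: total };
    total += Number(token);
  }
}

// The top-level loop: receives each op code, responds, and resumes.
function drive(parser, tokens) {
  let index = 0;
  let step = parser.next(); // run to the first request
  while (!step.done) {
    if (step.value !== "NEXT_TOKEN") {
      throw new Error("unknown op code: " + step.value);
    }
    step = parser.next(tokens[index++]);
  }
  return step.value; // the generator's return value
}

const node = drive(parseSum(), ["1", "2", "3"]);
console.log(node); // { type: 'Sum', value: 6 }
```

Whether one calls the driver loop "iteration" is exactly the terminology question in this exchange; the code is the same either way.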

rumkin commented 4 years ago

I think @jamiebuilds is right and this comment is a good point in favor of removing function and replacing it with generator (or iter) for generators.

weswigham commented 4 years ago

I think advocating for the full word generator here is missing (at least part of) the point. I use => over function in expressions because it is concise (the lexical this is a nice side-benefit, but literally never comes up when writing code in a functional style). I recently rewrote a large chunk of code from a normal recursive-calling form to yield*ing generators, and I would have loved being able to write () =*> yield* expr instead of function* () { return yield* expr; }. Literally any of the arrow-with-star patterns are fine (because every sequence of tokens is gobbledygook until it acquires meaning through exposure anyway). There's no intuition for any of them for anybody other than diehard language fans, and even then... there's no real intuition for where the * goes on generator functions already (and even less consensus on where the possible whitespace near the * should go!).
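For readers who have not written this style, here is a sketch of recursion through yield* using only syntax that exists today. The tree shape and function names are illustrative assumptions:

```javascript
// Illustrative data: a small tree of { value, children } nodes.
const tree = {
  value: 1,
  children: [
    { value: 2, children: [] },
    { value: 3, children: [{ value: 4, children: [] }] },
  ],
};

// Recursive traversal: each level delegates to the next with yield*.
function* walk(node) {
  yield node.value;
  for (const child of node.children) yield* walk(child);
}

// In expression position, the closest available form is still a full
// `function*` expression; there is no arrow equivalent.
const walkAll = function* (roots) {
  for (const root of roots) yield* walk(root);
};

const single = [...walk(tree)];
const combined = [...walkAll([tree, { value: 5, children: [] }])];

console.log(single, combined); // [ 1, 2, 3, 4 ] [ 1, 2, 3, 4, 5 ]
```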

Sure, you can think that that means that using * to mark generators was a mistake; but unfortunately, it already does (and that's not going anywhere). Rather than adding another question to the language ("should I use * or generator"), I think we should be endeavoring to add an answer that roughly aligns closer with existing expectations. The real, unfortunate travesty is in not having delivered some form of this alongside the initial iteration of either generators or arrows; because of that, nobody was able to start building an intuition on the correct syntax through exposure - instead we're all left in the lurch (rediscovering this stackoverflow post every time we forget the unsatisfying answer and search for it again).

monotykamary commented 4 years ago

Since we use yield* to delegate to another generator, I was wondering if we could use gen* () => {}, as it's concise enough to identify it as a generator and doesn't drop the asterisk, for consistency.

hax commented 4 years ago

Another possible solution: allow do generator expression.

const iter = do *{
  for (;;) yield 42
}

So we do not need special syntax for generator arrow function any more, just

x => do *{
  for (;;) yield x
}

bathos commented 4 years ago

A generator function is one which returns a generator — I don’t understand what you have in mind there @hax. It seems like it would be a pretty chaotic refactoring hazard if a function’s signature changed based on whether it includes or doesn’t include a particular expression somewhere in its body.

hax commented 4 years ago

@bathos I don't understand what you mean by "if a function’s signature changed based on whether it includes or doesn’t include a particular expression". For any arrow function like x => expression, the signature will of course change based on expression.

bathos commented 4 years ago

@hax I probably misunderstood what you intended that syntax to mean. I imagined you wanted to somehow kick into generator control flow at will (which wouldn’t have really been coherent), but I now see that you intend the expression to evaluate to a generator instance, being equivalent to (function * () {})().
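The equivalence mentioned here can be written with today's syntax. A minimal sketch; repeat is an illustrative name, and the do *{ } form itself is only a suggestion from this thread, not something engines accept:

```javascript
// Today's spelling of `x => do *{ for (;;) yield x }`:
// an arrow that returns the result of an immediately invoked generator function.
const repeat = (x) => (function* () {
  for (;;) yield x;
})();

const generatorObject = repeat(42);
const firstTwo = [generatorObject.next().value, generatorObject.next().value];

console.log(firstTwo); // [ 42, 42 ]
```

One difference from a true generator arrow: inside the inner function*, this and arguments are the inner function's own, not the enclosing scope's.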

hax commented 4 years ago

Yeah, that's what I mean. Actually, if do expressions supported async do { await x } and returned a promise, we would also have x => async do { await... }, which would make async (x) => { await... } redundant 😉

b-strauss commented 4 years ago

If there were a way to redefine everything, I would do it like this, which is much clearer. Probably too late for that...

// single sync
function() {}
() => {}

// multi sync (generator)
sync* function() {}
sync* () => {}

// single async (promise)
async function() {}
async () => {}

// multi async (async generator)
async* function() {}
async* () => {}

funkjunky commented 4 years ago

I think advocating for the full word generator here is missing (at least part of) the point. I use => over function in expressions because it is concise (the lexical this is a nice side-benefit, but literally never comes up when writing code in a functional style). I recently rewrote a large chunk of code from a normal recursive-calling form to yield*ing generators, and I would have loved being able to write () =*> yield* expr instead of function* () { return yield* expr; }. Literally any of the arrow-with-star patterns are fine (because every sequence of tokens is gobbledygook until it acquires meaning through exposure anyway). There's no intuition for any of them for anybody other than diehard language fans, and even then... there's no real intuition for where the * goes on generator functions already (and even less consensus on where the possible whitespace near the * should go!).

Sure, you can think that that means that using * to mark generators was a mistake; but unfortunately, it already does (and that's not going anywhere). Rather than adding another question to the language ("should I use * or generator"), I think we should be endeavoring to add an answer that roughly aligns closer with existing expectations. The real, unfortunate travesty is in not having delivered some form of this alongside the initial iteration of either generators or arrows; because of that, nobody was able to start building an intuition on the correct syntax through exposure - instead we're all left in the lurch (rediscovering this stackoverflow post every time we forget the unsatisfying answer and search for it again).

I really feel if ANY one implemented ANY of the syntaxes, we'd be able to at least get a feel for things.

This 4-year-old debate is simply hurting the development of the language. We need a more concise form that provides the benefits of arrow functions; choose any syntax you want that is short! [I think this disqualifies using the full keyword generator, just like asynchronous; shorten it to gen]