tc39 / proposal-class-fields

Orthogonally-informed combination of public and private fields proposals
https://arai-a.github.io/ecma262-compare/?pr=1668
1.72k stars 113 forks source link

Issues with ASI #7

Closed leobalter closed 6 years ago

leobalter commented 7 years ago

The current grammar for FieldDefinitionLists requires a closing semicolon:

https://tc39.github.io/proposal-class-fields/#sec-updated-syntax

ClassElement [Yield, Await] :
  ...
  FieldDefinitionList [?Yield, ?Await] ;
  static FieldDefinitionList [?Yield, ?Await] ;

I feel like it should optional, but maybe it's intentional and I'm here to request clarification.

Thanks

bakkot commented 7 years ago

It's optional because of ASI. It shouldn't be optional in general - class A { x y z(){} } is not what we want, I think.

leobalter commented 7 years ago

🔥 🔥 🔥 🔥 🔥 🔥 🔥 ASI 🔥 🔥 🔥 🔥 🔥 🔥 🔥

leobalter commented 7 years ago

btw, ASI has some weird behavior here:

class C {
  foo = 'TC39';
  [1]

  bar = 'TC39'
  [2]

  prop() { console.log( this.foo, this.bar ) }
}

new C().prop() // 'TC39', '3'
bakkot commented 7 years ago

Yeah, and a whole bunch of other cases. I think this snippet covers most of the major surprises:

class A {
  a = 0
  [b](){} // SyntaxError

  a = 0
  *b(){} // SyntaxError

  async // property named `async`
  a(){} // method named `a`

  get
  a(){} // getter for `a`

  get; // property named `get`
  a(){} // method named `a`

  a = 0; // property named `a`, initialized to `0`
  in // property named `in`
  b; // property named `b`

  a = 0 
  in
  b; // property named `a`, initialized to `0 in b`
}
borela commented 7 years ago

It's not wierd, ASI rules are simple, MPJ did a really good video on the topic: https://www.youtube.com/watch?v=Qlr-FGbhKaI

leobalter commented 7 years ago

I want a declaration list of fields, I'm not really interested in any of the cases @bakkot or I presented here. Saying that ASI is simple puts you in a privileged PoV we can't share with other regular developers.

My examples are not common cases I expect, but I would be pretty annoyed in a case like this:

class C {
  a = 0
  *b(){} // SyntaxError
}

I still need time to fill myself with context from the notes and issues as I believe the champions might have had thoughtful discussions on this. I don't want to assume it's going to be easy to simply get rid of semicolon and ASI here.

borela commented 7 years ago

I understand the concern, but I believe that the traps of not knowing the ASI rules are far more problematic specially during the development phase where the code might not have the proper format yet, for example:

function a() {
    return [
        'a',
        'b',
        'c'
    ];
}

function b() {
    return
    [
        'a',
        'b',
        'c'
    ];
}

There are a lot of examples of beginners having "strange" bugs due to not knowing the ASI rules and I always recommend that video to them; I got bitten a couple of times by ASI even in projects that required semicolons before I decided to finally learn how it works.

littledan commented 7 years ago

@leobalter I agree; that case is pretty annoying. It's hard for me to think of what a better alternative would be, though. We've gotten pretty strong feedback from developers that they don't want to have to start putting semicolons after their declarations. Other languages use an explicit \ where JS relies on ASI logic. Another thing would be to require parens around a multiline expression in an initializer. The downside is that these would be a little incongruous to do just in class bodies. If you have any further ideas, I'd be interested.

borela commented 7 years ago

@littledan Why not make ASI a bit more intelligent by adding a rule for class methods.

littledan commented 7 years ago

@borela Sure, what would the intelligence be exactly?

borela commented 7 years ago

@littledan The problem there is that ASI is treating the * as a multiplication operator instead of generator modifier, all it has to do is to look at the right hand side of the * for method declaration, then, if a method is found you can safely assume that the * was used to modify that method and a semicolon can be added before it.

bakkot commented 7 years ago

@borela That would arbitrary lookahead, which is not something we generally allow in the grammar. In particular, you can't distinguish between multiplication and a method until you've gotten to the brace:

class A {
  x = 0
  * y (a1, a2, ...a3) // multiplication
}
class A {
  x = 0;
  * y (a1, a2, ...a3) {} // method
}

Similarly for distinguishing between a computed name and a computed property access.

borela commented 7 years ago

Yes, the idea would be to look for [...](...){ or something(...){ after the * just to give it the correct meaning inside classes.

bakkot commented 7 years ago

So the proposal would be that

class A {
  x = 0
  * y (a1, a2, ...a3)
}

would parse as a class field followed by a generator method, and then be a syntax error due to the lack of a function body? That would be something very different from ASI as it currently stands; I don't think that would fly.

borela commented 7 years ago

No, if a { is not found after the ), you could assume it is still part of the expression, like you said, it's a look ahead to check the meaning of the *.

bakkot commented 7 years ago

Right, so distinguishing between a method definition and an expression requires reading arbitrarily far ahead. As I say, I doubt that would fly.

(Technically we did do something similar with arrow parameters, actually, but it's caused a lot of pain in implementations, and I'm not sure how analogous the situations are.)

And it still doesn't solve the problem, anyway: is the following two fields, or one field whose initializer is an array access? Either would be syntactically valid.

class A {
  x = 0
  [y]
}
borela commented 7 years ago

I would consider it two fields, I don't see why I would ever put [] after a numeric constant. But I know it wouldn't be too simple to code.

Another option is to drop ASI completely, @leobalter was right that the rules are not that simple, after talking to other programmers I realized ASI is a bit unnatural to many people.

bakkot commented 7 years ago

@borela Sorry, a better example would have been x = z.

Dropping ASI is an option, but we've heard a lot of pushback on that idea from people who are accustomed to the no-semi style, who are after all the only people who'd have to deal with the complexity. And, well, making new exceptions to ASI is also kind of confusing.

(Personally I'm approximately neutral on that question, but I don't ever use the no-semi style, so I'm not a good person to ask.)

borela commented 7 years ago

I used no semicolons in my sources because of ASI, I had too many gotchas after years of C++; by not using semicolons I force myself to read my code more carefully and prevent stuff like returning undefined when I meant to return something; I would be ok if there was an option to at least disable ASI.

leobalter commented 7 years ago

I didn't have time to explore the previous discussions yet or the proposal's history, but as this is already being discussed, I have had a slight idea of removing the ; rather than the ASI.

The problem is that ; can already be a ClassElement, regardless this proposal, so removing the ; from FieldDefinitionList is not an option here.

I agree the lookahead issue would be a potential blocker to move this proposal towards to Stage 3.

One thing we might try is allowing a trailing comma in the FieldDefinitionList:

FieldDefinitionList[Yield, Await]:
  FieldDefinition [?Yield, ?Await]
  FieldDefinitionList [?Yield, ?Await] , FieldDefinition [?Yield, ?Await]
  FieldDefinitionList [?Yield, ?Await] ,

This way we can separate a class field from a method in a better visual way:

class C {
  x = 42,
  *gen() {}
}

This should work just fine as the FieldDefinition expects an AssignmentExpression as the Initializer, not Expressions.

I'm reopening this issue to allow more feedback from other people. Feel free to close it if this is noisy.

littledan commented 7 years ago

@bakkot Arrow functions already have a lot of this complexity, but it's not tied into ASI, so I agree, this would not be great.

@leobalter Thanks for the thread name change. Well, this sounds like it's getting back into the ES2015-era discussion, should we use , or ; between class elements. To me, it seems pretty weird to allow a comma after a field declaration, but not between two method declarations. Why is trailing comma easier to write than a semicolon?

leobalter commented 7 years ago

My suggestion only adds the trailing comma, it already exists between fields in this proposal.

, feels better in my PoV the class will have a declaration list, rather than ;. I assume this preference is subjective.

littledan commented 7 years ago

Declaration lists don't generally permit trailing commas. Should we allow them for var, let, etc declarations as well?

littledan commented 7 years ago

Ping, any further ideas on this thread?

leobalter commented 7 years ago

I don't believe this should force a trailing comma for var, etc declarations, considering this declaration list also allows the uncommon class method definitions.

zenparsing commented 7 years ago

For me, @littledan has the important point:

Well, this sounds like it's getting back into the ES2015-era discussion, should we use , or ; between class elements. To me, it seems pretty weird to allow a comma after a field declaration, but not between two method declarations.

We've already decided on semicolons between class elements.

Although, I think it would be best to remove the comma-list option from class fields. It feels bizarre to me for some reason.

leobalter commented 7 years ago

Although, I think it would be best to remove the comma-list option from class fields. It feels bizarre to me for some reason.

This is an alternative I support as well. It feels like we can have this whole feature and discuss using comma later if it's necessary.

caitp commented 7 years ago
class C {
  foo = 'TC39';
  [1]

  bar = 'TC39'
  [2]

  prop() { console.log( this.foo, this.bar ) }
}

new C().prop() // 'TC39', '3'

I don't think the spec should be adding more ways for developers to shoot themselves in the foot. New features should avoid this bad behaviour, even if it's inconsistent with the rest of the language.

There should always be a clear beginning and end to class member definitions, which is not confused by the inclusion or exclusion of line terminators.

I would personally be unwilling to implement the ASI for this in v8, and would happily remove the existing implementation before shipping the feature.

littledan commented 7 years ago

Is it reasonable to expect developers to use semicolons in class bodies? I think ASI was a mistake in the language design, but a lot of developers seem to believe it's important to be able to use ASI, even though it's clear that there are plenty of significant downsides to taking advantage of it.

littledan commented 7 years ago

Separately from this design decision, I hope we can stick with the trend of working out the semantics in the standards committee, so thanks for bringing up your concerns here, @caitp

claudepache commented 7 years ago

We've gotten pretty strong feedback from developers that they don't want to have to start putting semicolons after their declarations.

Question: Are the aforementioned developers aware of the new hazards (those mentioned in https://github.com/tc39/proposal-class-fields/issues/7#issuecomment-309879686 above) caused by ASI in class body?

Personally, although I really like ASI, I am willing to abandon it inside class bodies, because the list of special cases I have to pay attention there is quite ... convoluted. Besides the case of generators already raised in another comment, I am concerned with:

class A {
    foo // property named `foo`
    b(){} // method named `b`

    get
    a(){} // getter for `a
}

It is really new that I am now sometimes forced to put a semicolon before a statement/declaration (here, a(){}) beginning with an identifier...

littledan commented 7 years ago

@claudepache I think there are two classes of ASI hazards:

So, I hope these both "not that bad". I'm not sure how to consult further with users here; any more comments on this thread would be welcome.

Although I'm not a fan of ASI generally, we've gotten a lot of feedback that developers are not willing to give up ASI for class fields in other bugs. For example, https://github.com/tc39/proposal-class-public-fields/issues/26 .

bakkot commented 7 years ago

@littledan For completeness, also computed property names after fields with an initializer. Still analogous to what happens outside of class bodies, though.

claudepache commented 7 years ago

Here is my detailed analysis, why I think that, while relying on ASI with a simple rule (but not a simplistic (and generally incorrect) rule found somewhere) is reasonable and safe (until now), that becomes much more cumbersome inside a class body.


Outside class declarations, if you want to rely on ASI in a responsible manner,

  1. you must watch out statements starting with some punctuation marks that may be used either as prefix or infix (namely ( [ / + - and more recently `).
  2. you ought to know that all other potential ASI-not-working hazards are resolved by a [no LineTerminator here] grammatical rule. (This is an important feature that needs a somewhat extensive study of the grammar to be formally proven.)

As a result, if you rely on ASI, it suffices to add a semicolon at the beginning of standalone statements starting with one the aforementioned punctuation marks and you are safe.

In fact, with this rule, you are probably safer than if you add a semicolon at the end of each statement, because it is easier to look for relatively rare (in that position) punctuation marks that needs to be semicolon-prepended at the beginning of a line, than to spot a potential missing semicolon at the end of each statement. My goal here is not make publicity for a coding style, but to show that, with just one easy-to-follow rule, it is (uh, was) unproblematic.


Now, let’s look what happens inside class body. In increasing order of worrysomeness, the cases of ASI-not-working are:

Also, I don’t think that the fact they look like keywords will be a strong deterrent to use them. Just look on how the methods of the Map builtin are named.


Another point to take into consideration: if we keep ASI in ClassBody now, future added word-like token will require to not be followed by line terminator, whereas some already existing tokens don’t have that restriction, introducing some inconsistency. Suppose, for the sake of the argument, that we add an optional internal modifier to declarations. For the sake of BC:

class C {
    internal // semicolon automatically added
    static foo() { }
}

whereas, for historical reason:

class C {
    internal static  // no semicolon added
    foo() { }
}

(*) A line terminator restriction is needed after async in async function declarations and expressions. That restriction was apparently just replicated elsewhere, e.g., in extended object literals, where there is neither ASI interference, nor function keyword.

bounceme commented 7 years ago

you can at least differentiate *|in|instanceof when they come after a method (or are the first "statement thing"), using lookbehind. asi would apply then, right?

class a {
b()
{}
in(a)
{}
}

// vs the completely different meaning of:
function a() {
b()
{}
i(a) // cant use `in`
{}
}

edit: it would, my issue is i still think of ~regular function bodies~ javascript instead of class.

bounceme commented 7 years ago

how about just using something besides =

then disallowing a; as a definition

littledan commented 7 years ago

We discussed ASI for field declarations at the September 2017 TC39 meeting. We didn't come to an agreement one way or another. I'll bring up the issue in the November meeting for further discussion.

thepeoplesbourgeois commented 7 years ago
random suggestion gremlin:

I realized this morning that this idea might ruin dot-chaining, and a significant portion of the rest of ECMAscript … and I have no idea how hard it might actually be to get implemented… but what if infix operators/keywords were forbidden at the beginnings of lines within class bodies? That way you could enforce the unary meanings of overloaded operators (*) for fields, and wouldn't require a lookahead to do so.

claudepache commented 7 years ago

@thepeoplesbourgeois The case of * is probably one of the least concerning, because a missing semicolon at the end of the declaration preceding the generator definition will almost surely trigger a syntax error at the level of the next {. So,

Also, your suggestion does not resolve the issue:

(1)

class C {
    foo  // semicolon inserted here
    bar() { }
}

(2)

class C {
    get  // no semicolon inserted here
    bar() { }
}
littledan commented 7 years ago

I see your point @claudepache , and it seems pretty inherent to anything we do here with ASI. If you want to be able to omit semicolons anywhere, you should be able to do it after a field declaration which has no initializer.

But I'm not sure how much the "field named get/set/static" issue will come up. I think get and set make more sense semantically as methods, rather than uninitialized fields, and static is more of a weird name for a field. We could add future keywords within classes similarly to async, with a no newline restriction.

thepeoplesbourgeois commented 7 years ago

@littledan captured my line of reasoning pretty succinctly. I had actually believed get, set, and static were keywords, like async, all of which I would expect to take effect on the next declaration when left unterminated by a semicolon.

I guess the potential issues this would open up center around forward extensibility; what becomes a keyword in future specs would need immeasurably more consideration in the event that it might already have gained common use in a framework. It's not difficult to imagine the usefulness of something like render to denote a function intended for canvas-like interaction, for instance.

... then again, this would already have needed to be a consideration for const, let, async, await, etc., so maybe it's a solved problem

littledan commented 7 years ago

The extensibility point is really huge, and @waldemarhorwat emphasized it to me after I presented on ASI previously. If we allow ASI, it has cross-cutting effects for any changes we want to make in the class grammar; the most obvious one is no more get-style keywords in classes which don't prohibit a newline afterwards. Aside from that, it just has to be thought about for any sort of change in the class grammar--how does this interact with ASI? It's a little worse than outside classes since we don't make as heavy use of keywords.

thepeoplesbourgeois commented 7 years ago

Experimenting with different newline combinations in the Babel REPL shows that async actually follows a different set of ASI rules than get, set, static, and *. The latter keywords will take effect on the declaration on the next line, with static being chainable with get and set, but not being chainable with async and *. * pleasantly surprised me by working as the infix multiplication operator and as the generator mark as necessary, though.


// PRESETS: es2015, es2017, react, stage-0

class C {
  get() {}                           // method

  something() {}                     // method

  async somethingElse() {}           // async method

  async 
  somethingYetEvenMoreDifferent() {} // method

  get 
  anotherThing() {}                  // getter

  static 
  set 
  anotherThing(param) {}             // class setter

  static() {}                        // method

  * 
  yetAnotherThing() {}               // generator

  foo = 5 *
    2                                // == 10
}
littledan commented 7 years ago

@thepeoplesbourgeois That's right. We will have to go 'async style' with all future keywords within classes if we do this sort of ASI.

jridgewell commented 7 years ago

We will have to go 'async style' with all future keywords within classes if we do this sort of ASI.

I'm in favor of that. I think the current ASI rules for get/set/static/* are a bit silly.

littledan commented 6 years ago

We discussed ASI in the November 2017 TC39 meeting. The committee's conclusion was that we should maintain ASI in class fields, but be more explicit in the specification with notes about what using ASI means and the risks of doing so. I'm preparing a patch which does this.

littledan commented 6 years ago

Closing per resolution in https://github.com/tc39/proposal-class-fields/issues/7#issuecomment-352550833

zenparsing commented 6 years ago

@littledan The comment above at https://github.com/tc39/proposal-class-fields/issues/7#issuecomment-335939597 seems to indicate that ASI in the context of class fields is future hostile in the sense that we won't be able to add any member modifiers without peppering newline restrictions all over the place.

Was there consensus that such newline restrictions are acceptable?

ljharb commented 6 years ago

@zenparsing yes, there was (specifically, consensus that future language additions would no longer be held back by asi concerns alone)

zenparsing commented 6 years ago

There are a couple of places where ASI is not applied, for good reasons (e.g. for heads). At some point I'd like to discuss further whether a class field declaration (with no initializer) should warrant such an exception.