zenparsing / js-classes-1.1

https://zenparsing.github.io/js-classes-1.1/

Consider making left operand of -> optional #17

Closed allenwb closed 6 years ago

allenwb commented 6 years ago

@BrendanEich said: "The this-> is tedious and cries out for shorthand."

We could make this on the left of -> optional by adding a production like the following to the grammar:

PrimaryExpression : -> IdentifierName

However, that by itself would introduce an ASI hazard:

class C {
  var x, y
  constructor(a, b) {
    ->x = a   /* ASI hazard */
    ->y = b
  }
}

This can be eliminated by adding a line terminator restriction to:

MemberExpression : MemberExpression [no LineTerminator here] -> IdentifierName

CallExpression : CallExpression [no LineTerminator here] -> IdentifierName

This is consistent with how similar ASI hazards are already handled.
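For comparison, here is a small sketch in today's JavaScript (not the proposed -> syntax) of the two behaviours involved: a punctuator that is both infix and prefix pulls the next line into the previous statement, while a restricted production such as postfix ++ stops at the line break. The function names are made up for illustration.

```javascript
// Hazard: `-` is both binary and unary, so the second line continues the first.
function binaryMinusContinues() {
  let a = 5
  let b = 3
  let c = a
  -b            // parsed as: let c = a - b
  return c
}

// Restricted production: postfix `++` may not follow a line terminator,
// so ASI ends the statement `a` and `++` becomes a prefix operator on `b`.
function postfixIsRestricted() {
  let a = 1
  let b = 1
  a
  ++b           // parsed as: a; ++b
  return [a, b] // a stays 1, b becomes 2
}

console.log(binaryMinusContinues()) // 2
console.log(postfixIsRestricted())
```

The restriction proposed above would put binary -> in the second category, which is what frees a line-leading ->x to be read as the unary form.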

littledan commented 6 years ago

We split off the shorthand into a separate proposal for private fields because it seemed like a lot of people developed misleading mental models of the syntax ("great, it's lexical") and because we were concerned that another case where you have to think about this could be confusing, e.g., for debugging. Do those concerns not apply due to the arrow syntax?

allenwb commented 6 years ago

I don't think we should "kick things down the field" in this effort. Either decide we want to include a certain feature or not. Part of the goal here is to finish the work on classes.

Personally, I don't think I'd use the PrimaryExpression abbreviation for ->. I generally choose readability over writability. But I don't have serious concerns with this abbreviation.

I'm not sure exactly what your "great, it's lexical" concern was (and it may be a more valid concern for #18). I would always teach about hidden names by starting with the binary form and stressing that the left operand supplies the object context for the right operand. Only then (and after practice using the binary form) would I introduce the unary form and strongly emphasize that it is just an abbreviation of the binary form for when the left operand is literally this.

I also think the difference between -> as an operator and # as a prefix makes a difference. The symbology of an arrow suggests something on the left going towards the thing on the right. I expect this might cause novice readers to think a bit more about what belongs on the left when nothing is there. # on the other hand doesn't naturally suggest the need for anything on the left.

littledan commented 6 years ago

I don't think we should "kick things down the field" in this effort. Either decide we want to include a certain feature or not. Part of the goal here is to finish the work on classes.

I'd like to reach that goal too. Several times in the development of these proposals (including on this particular question about shorthand), I tried to make statements like, "we won't do this in the future", including this feature. Sometimes, I was able to get consensus on these things, but other times, I wasn't able to, even if I personally might judge that we're better without the feature. The only way I saw to get consensus while specifying a reasonable feature was to make design decisions but also make sure that space is available to expand things in various other directions. I'm not sure how to do otherwise with a group of highly opinionated people and a process based on absolute consensus.

I'm not sure exactly what your "great, it's lexical" concern was (and it may be a more valid concern for #18). I would always teach about hidden names by starting with the binary form and stressing that the left operand supplies the object context for the right operand. Only then (and after practice using the binary form) would I introduce the unary form and strongly emphasize that it is just an abbreviation of the binary form for when the left operand is literally this.

This is really the core of the question. The pedagogical idea that several people had was the opposite of what you describe: that #x was the "normal" form that you'd learn first. Then, occasionally, for the advanced edge cases, you'd learn that you can actually do obj.#x for the edge cases. Frequently, in the description of the teaching model, I heard people (including multiple long-time TC39 members) say that it seemed like #x would be like a lexically scoped variable, and that's a positive quality, since people already know how to manipulate lexically scoped variables. I didn't get the sense that it would be easy to convince people to see the implicit this in #x. Anyway, maybe ->x is different because it actually has a visibly binary operator in it, so maybe it implies this more strongly.

About the ASI hazard mitigation: I find this syntax a little surprising, given how . is often idiomatically put on the next line. For example, what if people used hidden methods in a chaining pattern, e.g., (apologies for being a bit silly) the following example, which would just silently operate on the wrong things. You might run into this issue either because of a long chain of several hidden methods or variables, or because the arguments for a hidden method are syntactically heavy.

  class BSTNode {
    var parent, leftChild, rightChild;
    hidden operate() {
      this->leftChild
          ->rotateLeft()
          ->rightChild
          ->rotateRight();
    }
    hidden rotateLeft() {
      /* ... */
      return this;
    }
    hidden rotateRight() {
      /* ... */
      return this;
    }
  }

zenparsing commented 6 years ago

@littledan Great example!

I think that, with the hidden access operator ->, we should strive to support the same newline patterns that are common for the dot operator.
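As a point of reference, this is roughly how the dot-operator newline pattern behaves today. It is only a sketch: ordinary public methods stand in for the hidden ones in the example above, and the log array is a hypothetical addition used to make the call order visible.

```javascript
// Public methods standing in for the hidden methods of the earlier example.
class BSTNode {
  constructor() {
    this.log = [] // hypothetical: records which rotations ran
  }
  rotateLeft() {
    this.log.push('left')
    return this
  }
  rotateRight() {
    this.log.push('right')
    return this
  }
  operate() {
    // A line that starts with `.` can only continue the previous expression,
    // so the whole chain applies to `this`, with or without semicolons.
    this
      .rotateLeft()
      .rotateRight()
    return this.log
  }
}

console.log(new BSTNode().operate().join(',')) // left,right
```

Because . has no prefix form, a leading . is never ambiguous; a -> that is both binary and unary would lose that property.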

allenwb commented 6 years ago

Note that the same problem, in theory, exists for unary + and - but in practice I guess it isn't a problem because of how they are used.

I find this a pretty compelling example that suggests we can't have the primary expression form, and that in general we can't introduce any new operator that can be used as both a binary and a unary operator.

allenwb commented 6 years ago

The pedagogical idea that several people had was the opposite of what you describe: that #x was the "normal" form that you'd learn first. Then, occasionally, for the advanced edge cases, you'd learn that you can actually do obj.#x

Not how I'd teach it. I'd start with this->x as the normal form, then talk about cases where you might say obj->x and only after all that introduce ->x as a short hand for this->x. First teach the form that builds the correct conceptual model, then move on to short-cuts.

BrendanEich commented 6 years ago

Standard.js and other such semicolon-lite style checkers flag missing ; before \n before + or - or / or ( or [. Maximal munch means -> is one token not two, so such checkers would have to be extended to complain about it along with the other punctuators that are both infix and prefix forms.
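To make the existing hazard concrete, here is a minimal sketch of the kind of code such checkers flag, using [ as the line-leading punctuator. The function names are invented for illustration; the leading semicolon in the second function is the usual semicolon-lite guard.

```javascript
// Without a guard, the line starting with `[` is parsed as an index
// into the expression on the previous line.
function withoutGuard() {
  const list = [10, 20, 30]
  const alias = list
  [0].toString()          // parsed as: const alias = list[0].toString()
  return alias
}

// With a leading `;`, the previous statement ends where the author intended.
function withGuard() {
  const list = [10, 20, 30]
  const alias = list
  ;[0].toString()         // a separate expression statement
  return alias
}

console.log(withoutGuard())             // 10 (the string "10", not the array)
console.log(Array.isArray(withGuard())) // true
```

A prefix -> would join this list of punctuators, so a line beginning with ->x would need the same guard or the same lint rule.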

littledan commented 6 years ago

Not how I'd teach it.

Me neither. From here, the question is, how do we coordinate among teachers that they explain the more accurate mental model? And, by their judgement, how teachable/learnable is it?

allenwb commented 6 years ago

@BrendanEich so, the argument is that this hazard already exists for other operators and so really won't be a problem for people who use a tool-supported semi-free style. And it's not a problem for people who use semis. It's still a problem for new people who don't know the pitfalls and don't have the tools.

I guess that's just the JS world we've all helped to create.

allenwb commented 6 years ago

@littledan I'm not sure that's our responsibility, except when we're giving conference talks and such. Whatever we do there will be good teachers and bad teachers, good books and bad, ...

BrendanEich commented 6 years ago

@allenwb Yes, I made this point at the last TC39 meeting. JS has two semicolon styles, both require parsing linters for any serious project. Old news, people can and do deal.

@littledan We can't pick teachers but we can analyze Law of Least Effort and other human-factors laws, whether by simulating usage or doing real user research. If teachers have to gradgrind their students away from attractive nuisances and least-energy paths through a design space, we've done something wrong already.

zenparsing commented 6 years ago

I think that in this particular case, a prefix -> (assuming no newline restriction for binary ->) represents a little more than an ASI hazard. I would say that it's mildly hostile to the semicolon-light coding style, owing to the fact that this is used at the beginning of statements very frequently.

To expand on the example in the OP, I don't think developers coding with "standard.js" would want to write this:

class C {
  var x, y, z
  constructor(a, b) {
    ->x = a
    ;->y = b
    ;->z = 0
  }
}

allenwb commented 6 years ago

Historical trivia: the Self language is called "Self" because its major syntactic deviation from Smalltalk is that expressions like

self display.
self doSomethingWith: anObj.

could be abbreviated like:

display.
doSomethingWith: anObj.

BrendanEich commented 6 years ago

@zenparsing Touché, although it looks better to put semi at end in this case -- which of course totally undermines Standard style!

In DM'ing with @littledan, we both noted the recurring desire (if not "hard requirement", Dan said that phrase, I think) for a shorthand expressed by TC39, which was used to argue against @. So we (re-)considered all of this-> (shorthand crippled at start of statement as you say), # (honestly I think the aesthetic objections are non-trivial, but also the pseudo-lexical criticism hits its target), and @ as both decorator introducer in context, and private name shorthand prefix in other contexts.

In interest of widening that DM discussion, let me copy one thing I wrote there: "Good point about @ which is more aesthetic and has Ruby => CoffeeScript precedent". DM conversation continued:

""" @BrendanEich Shorthand idea: unify grammar harder so @x = and @x in middle of expression are this.@x, decorator only at front of larger forms.

@littledan Oh that's a matter of giving ExpressionStatement a no-shorthand lookahead and that's it

@littledan Almost any accident will be an early error

@BrendanEich yes

@littledan The disadvantage is having an edge case to think about

@littledan Anyway I still suspect people would just get over # if we force it on them

@BrendanEich I think aesthetics matter, also arguments from nearby languages """

The "giving ExpressionStatement a no-shorthand lookahead" solution does require this.@x = y in an expression statement, not @x = y. I could see a more complex grammar that unifies decorators and @ for private shorthand, but it's work and it may break consensus via grammatical complexity. Pausing here. Feedback welcome.

Update: Apologies for issue-jacking a bit, we can certainly close this as wont-fix. I'm interested in helping in London in two weeks' time, so want to consider all options and constraints to put best foot forward. Can we abuse this issue for a little while to do that?

BrendanEich commented 6 years ago

@allenwb yes, Self's name rationale is another bit of quasi-evidence that this->, like self. before it, is too long for the common case. If TC39 will insist on a shorthand, then we should add one to this proposal, somehow, to avoid bouncing.

littledan commented 6 years ago

In these threads, we're getting at something that we found out in the development of the .# proposal: This shorthand avoids ASI hazards specifically by being two tokens, that you break down the middle. With no token, you get the problems of complicating lexical scope. With the whole token, you have a very prominent ASI hazard, and even if we recommend the use of linting tools, it'll still look ugly one way or another if you're in no-semicolon mode (either incongruously requiring this or semicolons). These rules apply regardless of what we use in between the object and the instance variable name.

With half a token, if you have a good story for why that's happening (e.g., "you include the # since it's part of the name"), you win on both counts. This was a big reason why we settled on obj.#x rather than obj#x (or, the same dot when we were considering @).
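A rough illustration of the full, unabbreviated obj.#x form, written with the # private-name syntax as engines now implement it (assuming an engine with private fields and private methods). The Counter class and its members are hypothetical, and the shorthand itself is not shown because it is not part of the language.

```javascript
class Counter {
  #count = 0

  #bump() {
    // Statements begin with `this`, so they cannot attach to the line before.
    this.#count += 1
    return this
  }

  bumpTwice() {
    // A continuation line begins with `.`, exactly as in public chaining,
    // so no semicolon guard is needed in a no-semicolon style.
    this
      .#bump()
      .#bump()
    return this.#count
  }
}

console.log(new Counter().bumpTwice()) // 2
```

The leading . is what marks each continuation line; the # travels with the name.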

I'm not sure if we should go against many people's strong intuition that we shouldn't use #, but ultimately, whatever we choose, we're talking about two punctuation symbols, one after the other, where one of the two requires you to press shift on a US keyboard.

allenwb commented 6 years ago

@BrendanEich of course the vast majority of Smalltalk programmers have been perfectly fine with the explicit self. And, having to explicitly this. qualify property accesses doesn't seem to have impeded the adoption of JavaScript.

the recurring desire (if not "hard requirement", Dan said that phrase, I think) for a shorthand expressed by TC39,

It seems like we're edging back towards the situation TC39 is already in with the current extended class definitions proposals and the same situation we found ourselves in starting in 2009 until we were able to form consensus around the max-min class proposal. As soon as we move beyond the most minimal set of functionality, we open the door for a multitude of sometimes conflicting features whose inclusion is based upon personal preferences rather than functional necessity. Preferences get expressed as requirements and ultimately consensus-threatening "hard requirements". At that point TC39 either deadlocks or starts playing the piecemeal, kick the ball down the field game with proposal proponents trying to get the pieces approved that will set them up for achieving their ultimate preference.

There are many additional features that various TC39 members would prefer to be part of this proposal or part of any set of extensions to JS class definitions. To have any chance of success with this proposal we will have to resist them. Arguably, the only unique thing in this proposal is its focus on a very small set of use-cases that require engine level support. If we can't sell and stick to this minimalism I think this effort will fail.

BrendanEich commented 6 years ago

@allenwb JS has had no alternative to this.foo so it is hard to prove much. However, adding private names causes people who naively assume static types-with-member-names to ask "why can't I just use the name"? (Also on right of dot in obj.foo where they assume obj has private foo). So it seems to me we cannot avoid the shorthand demand, but is it a requirement from TC39? I don't know. We will have to discuss in London.

In any event, I expect this-> will get push-back both for being even longer, and for introducing -> (which may be coveted for known or unknown future uses, and which has distinct meaning in C). I'm not saying anything new here, just that we should avoid trying to float a lead balloon. @littledan's last comment does shed some important light, I think: shorthand requires two tokens, one shifted.

allenwb commented 6 years ago

@BrendanEich I'm sure that @zenparsing's idea in #18 for supporting unqualified use of hidden names when their object context is this would be very popular among the community. But it would still require a qualifier other than . because of the well understood ambiguities of obj.foo where we don't know anything about obj and foo might be either a hidden name or a property key.

I had originally suggested in #5 using .. as the hidden name qualifier (e.g., obj..foo is a hidden name access, obj.foo is a property access). I still think it may be a reasonable fallback alternative to ->.

I don't think the fact that -> means something different in C/C++ is a concern. After all, the meaning of . in those languages is also not exactly the same as in JS.

BrendanEich commented 6 years ago

Again I'm not saying anything new or trying to trigger rehashing :-D. Just pleading to get ahead on the two big objections we'll face: why -> is any better than .#; and "Dude, where's my shorthand?"

Sorry I missed " @zenparsing 's idea for supporting unqualified use of hidden names when their object context is this" -- link?

zenparsing commented 6 years ago

@BrendanEich here: #18

zenparsing commented 6 years ago

Closing in preparation for public review. Feel free to continue discussion here; we may choose to re-open at a later time.

wycats commented 6 years ago

@allenwb yes, Self's name rationale is another bit of quasi-evidence that this->, like self. before it, is too long for the common case. If TC39 will insist on a shorthand, then we should add one to this proposal, somehow, to avoid bouncing.

Notably, Ruby, the main other Smalltalk descendant, also made self optional in self.foo(), and also has an instance variable shorthand of the form @foo and @foo = expr.

In the status quo proposal, I agreed to separate the shorthand into its own proposal because I didn't think it was in conflict with the rest of the proposal, and didn't think the rest of the proposal significantly foreclosed the shorthand.

I still feel strongly that we should eventually consider the shorthand again, and am disappointed that this proposal seems less of a good fit for a shorthand (for the standard.js hostility reason Kevin raised in this thread and the confusion with lexical scoping raised in #17).

In addition to the issues already raised in this thread, I find ->x to be a less suitable shorthand than Ruby's @foo or the previously proposed #foo.