Make :: operator a better version of . operator tailored for methods

tc39 / proposal-bind-operator

This-Binding Syntax for ECMAScript


Make :: operator a better version of . operator tailored for methods #42

Closed InvictusMB closed 7 years ago

InvictusMB commented 7 years ago

The idea is to extend the semantics of the binary :: operator so that it performs a bind operation along with scope resolution for methods, replacing the . operator for method access. The resolution algorithm will differ from regular property resolution in order to cover the use cases of both the binary and unary forms of :: in the current proposal, while also adding value on top of that. There are technical challenges and gotchas in certain corner cases of this approach, but the reduction in cognitive load compared to the current proposal should outweigh them.

The term method here stands for a function, determined by the RHS, to be executed in the context of the LHS object. The following kinds of methods can be outlined:

Own method: a function stored in a property of the LHS object, intended to be executed in the context of the LHS object.
Extension method: a function from lexical scope to be executed in the context of the LHS object.
Adopted method: an extension method which is stored in a property of another object.

In short, the :: operator can be described as a method access operator that is also able to capture extension methods from lexical scope.
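The three kinds of methods can be approximated in today's JavaScript with explicit bind calls. This is only a sketch of the intended meaning, not part of the proposal: counter, reset and helpers are hypothetical names invented for the illustration.

```javascript
// Hypothetical objects and functions, used only for illustration.
const counter = {
  count: 0,
  // Own method: stored in a property of the LHS object itself.
  increment() { this.count += 1; return this.count; }
};

// Extension method: a function that lives in lexical scope.
function reset() { this.count = 0; return this.count; }

// Adopted method: an extension method stored on another object.
const helpers = {
  double() { this.count *= 2; return this.count; }
};

// counter::increment       would roughly mean:
const boundIncrement = counter.increment.bind(counter);
// counter::reset           would roughly mean:
const boundReset = reset.bind(counter);
// counter::helpers.double  would roughly mean:
const boundDouble = helpers.double.bind(counter);

boundIncrement(); // counter.count is now 1
boundDouble();    // counter.count is now 2
boundReset();     // counter.count is now 0
```

In all three cases the LHS supplies this and only the place where the function is found differs.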

The rationale

:: operator will be a "this-safe" version of scope resolution operator, unlike the . operator.
:: operator will always show the intent of having a bound method.
Using :: instead of . will visually separate method access with intent of invocation from "method as property" access.
Simple semantics and easy to read code. LHS always defines context. RHS always means a method.
Better static analysis support. As RHS of :: will always mean method it would be possible to verify that RHS is invokable before the actual execution of bound function occurs.

Use cases

The following use cases can be defined

1. Method extraction
```javascript
const log = console::log
```

2. Method binding
```javascript
function foo() {}
let bar
let boundFoo = bar::foo //create a bound method for later reuse
```

3. Extension method
```javascript
function foo() {}
let bar
bar::foo() //execute foo as if it were bar's own method
```

4. Method adoption
```javascript
arrayLike.sort = arrayLike::Array.prototype.sort
//Array.prototype.sort is adopted by arrayLike object and becomes arrayLike's own method
```
RHS resolution algorithm

RHS resolves to variable in lexical scope if RHS is a valid identifier and is defined.
- Hoisted variables are excluded from resolution.
- Inside module the resolution is limited to module scope
- In global scope the resolution is limited to own properties of global object
- Inside assignment expression LHS of = is excluded from resolution
RHS falls back to a property accessor of LHS if RHS is an identifier that is not found in lexical scope
If RHS is an expression, it resolves to that expression excluding the last ()
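The two main rules above can be emulated at runtime to get a feel for them. This is a rough sketch under stated assumptions: resolveBound is a hypothetical helper, the scope argument is a plain object standing in for the lexical scope, and the sub-rules about hoisting, assignment targets and reserved words are not modelled.

```javascript
// Hypothetical helper: `scope` is a plain object that models the lexical scope.
function resolveBound(lhs, name, scope = {}) {
  // Simplified identifier check; reserved words are not filtered out here.
  const isIdentifier = /^[A-Za-z_$][\w$]*$/.test(name);

  // Rule 1: a defined identifier in the (modelled) lexical scope wins.
  if (isIdentifier && Object.prototype.hasOwnProperty.call(scope, name)) {
    return scope[name].bind(lhs);
  }

  // Rule 2: otherwise fall back to a property of the LHS.
  const method = lhs[name];
  if (typeof method !== 'function') {
    throw new TypeError(`${name} is not a method of the left-hand side`);
  }
  return method.bind(lhs);
}

const account = {
  owner: 'Ada',
  describe() { return `own:${this.owner}`; }
};

// No `describe` in scope: the own method is used.
const viaProperty = resolveBound(account, 'describe');

// A `describe` in scope shadows the own method.
const viaScope = resolveBound(account, 'describe', {
  describe() { return `ext:${this.owner}`; }
});

console.log(viaProperty()); // own:Ada
console.log(viaScope());    // ext:Ada
```

The same call site yields two different functions depending only on what the scope contains, which is the behaviour the Concerns section is about.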

Concerns

Refactoring hazard when newly introduced variable in scope shadows object method. In module scope this can be statically analyzed and produce a warning. In global scope it gets unpredictable.
When in global scope RHS identifier resolution requires runtime checks. In module scope it can be resolved statically.

Examples

Method binding

arrayLike::Array.prototype.sort(comparer)

interprets as

Array.prototype.sort.bind(arrayLike)(comparer)

Method adoption

arrayLike.sort = arrayLike::Array.prototype.sort

interprets as

arrayLike.sort = Array.prototype.sort.bind(arrayLike)

LHS of = excluded from resolution

let log
log = decorate(console::log)

interprets as

let log
log = decorate(console.log.bind(console))

Same with destructuring

let [log, error] = [console::log, console::error]

interprets as

//both log and error resolve to properties
let [log, error] = [console.log.bind(console), console.error.bind(console)]

Nested assignments

let log
let decorated = decorate(log = console::log)

interprets as

let log
let decorated = decorate(log = console.log.bind(console))

Ignoring hoisted variables

let foo = {
 bar() {}
}
let bound = foo::bar
let bar

interprets as

let foo = {
 bar() {}
}
let bound = foo.bar.bind(foo) //hoisted local bar is ignored
let bar //might also be a warning from static analyzer

Property shadowing

let error = function(){}
let [log, printError] = [
  console::log, 
  console::error
]

interprets as

let error = function(){}
let [log, printError] = [
  console.log.bind(console), 
  error.bind(console) //extension method takes precedence over own
  //should cause a warning from static analyzer
]

andyearnshaw commented 7 years ago

The rule of thumb is that local variables and functions always shadow the properties of an object on LHS.

Variables and functions are resolved by scope, so this can't work well without being confusing. For a real world example, take a look at this:

let toString = someObj::toString;

There are several problems with your proposal that come to light here. The first is the temporal dead zone rule for let/const vars: with toString defined using let, the RHS technically resolves to the local binding and will throw a ReferenceError. Definitely confusing. Let's say we allow this proposal to continue and the ReferenceError to throw, so we have to change that code to look like this:

let toStringBound = someObj::toString;

Now we've got a bound version of someObj's toString method, right? Wrong. Variables are resolved by scope, so toString will resolve to the global object's toString method, which may not be the same as the one that belongs to someObj.
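Both problems can be reproduced today by writing out by hand what the RHS would desugar to if it resolved through scope. This is a sketch for Node.js and browsers, where the global object inherits toString from Object.prototype; someObj and its toString are made up for the illustration.

```javascript
const someObj = { toString() { return 'custom'; } };

// If the RHS resolved to the variable being declared, the desugared code would
// read `toString` inside its own initializer, i.e. in the temporal dead zone.
function selfReference() {
  try {
    let toString = toString.bind(someObj);
    return typeof toString;
  } catch (err) {
    return err.name;
  }
}

// If the RHS resolved through the global object instead, a different function
// would be bound: the one inherited from Object.prototype.
const fromGlobal = globalThis.toString.bind(someObj);

console.log(selfReference());    // ReferenceError
console.log(fromGlobal());       // [object Object]
console.log(someObj.toString()); // custom
```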

I agree that the binary form is more intuitive to read, but it can't do the job of the unary form as well as its own job.

bathos commented 7 years ago

I’m curious about what this example is demonstrating:

let foo = {
  bar() {
    return this
  },
  baz() {
    return this
  }
}

foo
  ::bar()
  ::baz()

... since this yields the same result:

foo
  .bar()
  .baz()

Altogether this seems to involve a lot of complex rules. We’re already in fuzzy territory with a proposal that reuses one token :: to mean two things, but it does so in a clearly discernable way (position based). I don’t think the idea of foo::bar() being (1) a valid expression regardless of whether bar is defined and (2) changing its meaning if bar does become defined is tenable. Right now, usage of the binary :: is statically analyzable (it is clearly an error if bar is undefined), and I think there are very good reasons to prefer that to remain the case.

InvictusMB commented 7 years ago

@andyearnshaw Temporal dead zone for let/const vars doesn't play a part here. Please don't split the statements that come together.

The rule of thumb is that local variables and functions always shadow the properties of an object on LHS. With one notable exception for being a RHS of an assignment.

Assignment alters the name resolution of RHS for :: in a way that LHS of assignment is always excluded from resolution. Doesn't matter at all if let/const is there.

It is done with the intent to fallback to property and is covered with detailed examples by point 4. let toString = someObj::toString; will be interpreted without any exceptions raised as let toString = someObj.toString.bind(someObj);

Global object is also excluded from name resolution as expected in strict mode. So let toStringBound = someObj::toString is interpreted as let toStringBound = someObj.toString.bind(someObj) as expected.

I do agree that there might be a concern with variable hoisting, as I don't yet know how technically feasible it is to discard variable hoisting for name resolution on the RHS of the :: operator. If it is not feasible, then

let boundToString = obj::toString

let toString = function() {}

might be interpreted as

let boundToString = toString.bind(obj)

let toString = function() {}

But the idea is to ignore hoisted variables. Hoisting is not the most intuitive thing either and ReferenceError for let/const is there to actually prevent using hoisted variables. So the expected behavior of

let boundToString = obj::toString

let toString = function() {}

interpreting as

let boundToString = obj.toString.bind(obj)

let toString = function() {}

is quite intuitive despite hoisting.
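For reference, this is how today's engines treat the first interpretation (binding the local toString before its declaration) when it is written out by hand. It is only a sketch; obj and the function names are invented for the illustration.

```javascript
const obj = { toString() { return 'own'; } };

// Local declared with let: the binding exists but is uninitialized
// (temporal dead zone) when the bind expression runs.
function bindBeforeLet() {
  try {
    return toString.bind(obj)();
  } catch (err) {
    return err.name;
  }
  let toString = function () { return 'local'; };
}

// Local declared with var: the binding is hoisted and holds undefined
// when the bind expression runs.
function bindBeforeVar() {
  try {
    return toString.bind(obj)();
  } catch (err) {
    return err.name;
  }
  var toString = function () { return 'local'; };
}

console.log(bindBeforeLet()); // ReferenceError
console.log(bindBeforeVar()); // TypeError
```

Neither variant silently binds the later function, so ignoring hoisted declarations during :: resolution would be a new rule rather than a reuse of existing behaviour.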

InvictusMB commented 7 years ago

@bathos Although

let foo = {
  bar() {
    return this
  },
  baz() {
    return this
  }
}

foo
  ::bar()
  ::baz()

might seem equal to

let foo = {
  bar() {
    return this
  },
  baz() {
    return this
  }
}

foo
  .bar()
  .baz()

it actually means

let tmp = (foo.bar.bind(foo))();
(tmp.baz.bind(tmp))()

or, equivalently,

let tmp = (foo.bar.call(foo))
tmp.baz.call(tmp)

Could you elaborate on why this can be not analyzable statically? An interpreter could probably even do a static analysis and optimize this if possible to

foo
  .bar()
  .baz()

to prevent extra call or bind invocations and creating temporary bound functions.
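A small runnable check of that equivalence, written with today's syntax. The calls counter is an assumption added only so the three variants can be compared.

```javascript
const foo = {
  calls: 0,
  bar() { this.calls += 1; return this; },
  baz() { this.calls += 1; return this; }
};

// Desugaring with bind: a temporary bound function is created per step.
const viaBind = (() => {
  const tmp = foo.bar.bind(foo)();
  return tmp.baz.bind(tmp)();
})();

// Desugaring with call: same receiver, no temporary bound function.
const viaCall = (() => {
  const tmp = foo.bar.call(foo);
  return tmp.baz.call(tmp);
})();

// Plain method chaining.
const viaDot = foo.bar().baz();

console.log(viaBind === foo, viaCall === foo, viaDot === foo); // true true true
console.log(foo.calls); // 6
```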

andyearnshaw commented 7 years ago

I skim-read part 4 of your initial post, I apologise for that.

Global object is also excluded from name resolution as expected in strict mode.

I'm not sure what you mean by this. Strict mode doesn't affect global object from name resolution, it only prevents implicitly creating global variables by omitting var/const/let.

despite hoisting is quite intuitive.

I disagree.

Your proposed changes are markedly hindering the binary form of the bind operator by introducing odd changes to semantics. Consider:

let foo = {
        bar() { return this; }
    },
    barBound = foo::bar;

Let's say, for argument's sake, that this piece of code is inside the scope of another function on line 2421 of foo.js. Later, we decide to import something from another file and we've forgotten all about the other piece of code:

import bar from 'bar.js';

Your proposal doesn't make it completely clear what happens here. Does the imported bar from the outer scope shadow the method on foo? If so, that's a big refactoring hazard. If not, it's still a big refactoring hazard because imported functions for use with :: are at risk of being shadowed by object methods.

The current proposal doesn't make any changes to identifier resolution, it just takes advantage of it in a really nice way. The way the binary form works right now is pretty much perfect IMO, and we should leave it as it is.

InvictusMB commented 7 years ago

@andyearnshaw I do agree that the binary form in the current proposal is good. What I do not agree with is that the unary form in the current proposal is good. Apart from my opinion that the unary form is ugly and looks like Polish notation, it also requires significant mental capacity to read and process, because you have two versions of the same operator with significantly different behaviors. Binary foo::bar simply binds RHS to LHS. But to tell what unary ::foo.baz.bar does you have to

recognize it's a unary form and it's different from binary
split RHS into parts by separating by dots
recognize the last part as function
conclude that said function is bound to the rest of expression
keep in mind that now you have function bar bound to foo.baz

And now imagine you have 10 of those unary statements in one block of code.
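Written out with bind, the reading steps above come down to the following. This is a sketch: the nested foo object is hypothetical, and the unary desugaring follows the rule that ::obj.func means obj.func.bind(obj).

```javascript
// Hypothetical nested object for illustration.
const foo = {
  name: 'foo',
  baz: {
    name: 'baz',
    bar() { return this.name; }
  }
};

// Unary form:  ::foo.baz.bar
// The last segment is the function, everything before it is the receiver.
const unaryEquivalent = foo.baz.bar.bind(foo.baz);

// Binary form under this suggestion:  foo.baz::bar
// The receiver is spelled out on the left of the operator.
const binaryEquivalent = foo.baz.bar.bind(foo.baz);

console.log(unaryEquivalent());  // baz
console.log(binaryEquivalent()); // baz
```

Both spellings bind to foo.baz, not to foo. The difference is only in how much of the expression has to be parsed mentally to see that.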

This change significantly reduces that mental burden. You always know that RHS is a function which is executed with LHS as this. LHS is obvious and the only effort you might need to do is figure out what RHS means if it's not obvious by name. This works even better if we consider :: not a bind/call/apply operator but a scope resolution operator which it essentially becomes.

Always using foo::bar doesn't require much brainpower. You know that it's almost the same as foo.bar but bar can be an own method of foo or an extension method. Really simple and easy to reason about.

And I do agree that shadowing is a refactoring hazard. But it's rather a corner case than intended use case. And static analyzers could mitigate that by indicating a smell and/or a warning just as they do with variable shadowing currently.

Your proposed changes are markedly hindering the binary form of the bind operator by introducing odd changes to semantics.

I wouldn't call the resulting semantics odd. Consider the extension methods in C#. The . operator there works almost the same way as the :: operator would here. It resolves RHS to either member methods or extension methods. The only difference is that when shadowing happens, member methods are known at compile time and they take precedence over extension methods. In JS that's not possible, therefore the extension method, which is statically inferrable, takes precedence. But in both cases shadowing is a code smell.

Global object is also excluded from name resolution as expected in strict mode.

I'm not sure what you mean by this. Strict mode doesn't affect global object from name resolution, it only prevents implicitly creating global variables by omitting var/const/let.

Sorry, my bad. I would still take global scope out of name resolution and limit it to module scope for the sake of static analysis and predictability. Unless the code is executing in global scope in which case the binding and resolution can be done only at run time.

despite hoisting is quite intuitive. I disagree.

despite hoisting ~~is quite intuitive~~.

~~despite hoisting~~ is quite intuitive.

is quite intuitive despite hoisting.

Fixed. Of course hoisting itself is counter intuitive. It was designed for the convenience of interpreter developers and not for mental sanity of application developers. Sorry again for poor wording. Introducing new features to language shouldn't be limited by initial poor design.

Btw, if with current proposal we would have the following code

foo()
  ::bar.baz

Would you read it as a unary or a binary form? How long does it take to figure out?

bathos commented 7 years ago

it actually means

 let tmp = (foo.bar.bind(foo))();
 (tmp.baz.bind(tmp))()

Yes. This distinction is entirely ‘internal’ to the expression though. The :: form would have slightly higher overhead, but the result is guaranteed to be identical by definition (they are just two different ways to do .call(context), but the :: form adds the extra step of creating a new bound function, which is immediately discarded after invocation, defeating the purpose of bind). In fact, were this to be implemented, I imagine engines would optimize it by not binding, since there would be no way to introspect the difference and it would be faster. (Edit: you actually mentioned this yourself, which makes me further confused about what this example illustrated.)

Could you elaborate on why this can be not analyzable statically?

Yeah. Consider the following module:

const foo = 2;
const bar = function() { return this * 5; };
foo::bar(); // 10

With the bind operator defined as it is in the current proposal, bar must be a declared binding for this to be valid. If we change the module to the following, it is easy to know there is an error:

const foo = 2;
foo::bar(); // reference error

But in your proposal, that cannot be detected as a reference error. By removing bar from the scope, we merely changed the meaning of bar in that expression to mean property access rather than binding reference. As with any property access, we cannot know if bar is a property of foo until runtime.

Static analysis aside, that’s clearly a footgun. By removing the declaration of bar, I changed the fundamental meaning of another expression later on without actually making it invalid. There is no precedent for anything like this in JS.
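The difference can be reproduced by hand-desugaring both readings. A sketch only: undeclaredBar is a deliberately undeclared hypothetical name, and fallback models the suggested property fallback rather than any real syntax.

```javascript
const foo = 2;
const bar = function () { return this * 5; };

// Current proposal: foo::bar() is roughly bar.call(foo).
const withDeclaration = bar.call(foo);

// Current proposal with the declaration removed: the free identifier fails loudly.
function withoutDeclaration() {
  try {
    return undeclaredBar.call(foo); // hypothetical name, declared nowhere
  } catch (err) {
    return err.name;
  }
}

// Suggested fallback with the declaration removed: foo::bar() would become
// roughly foo.bar.call(foo), so the outcome depends on what the receiver
// happens to expose at runtime.
function fallback(target) {
  return target.bar.call(target);
}

console.log(withDeclaration);      // 10
console.log(withoutDeclaration()); // ReferenceError
console.log(fallback({ bar() { return 'a different function'; } })); // a different function
```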

Also worth noting: the distinction between property literals and binding identifiers is done at the parsing level. Your proposal would require these two productions to be conflated and resolved at runtime. However, they actually have different rules, because property literals may be reserved words and binding identifiers may not. What happens here?

const foo = { default: function() {} };
foo::default();

InvictusMB commented 7 years ago

@bathos I agree with each of your points separately but disagree with the overall opinion.

Yes, if we only had a binary form as in the current proposal, it would be more straightforward than this suggestion. But we also have a unary form in the proposal. And together they are a mess.

With the bind operator defined as it is in the current proposal, bar must be a declared binding for this to be valid.

No, that's not completely true. In your simple example yes. But consider

//bar.js
const bar = {
  baz() {  return this * 5; }
};
//otherfile.js
delete bar.baz;
//myfile.js
const foo = 2;
foo::bar.baz();

Reference error? Yes. Involves property lookup? Yes. Statically analyzable? No. Runtime error. And it's not uncommon to have this kind of RHS

let mapping = arrayLike::Array.prototype.map(mapper)

I do agree that implicit fallback to properties is a footgun. But it's not a bigger footgun than the unary form. ::foo.bar is always a runtime error if bar is not defined or not a function. Only foo part can be verified statically. foo::bar can be a runtime error but at least it clearly states the intent to make bar bound to foo. And you may have foo.bar::baz which is much clearer than ::foo.bar.baz.
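Both kinds of RHS can be tried with today's syntax. In this sketch arrayLike, mapper and the deleted bar.baz are illustrative; note that in current engines the deleted-property case surfaces as a TypeError at runtime.

```javascript
// arrayLike::Array.prototype.map(mapper) is roughly:
const arrayLike = { 0: 'a', 1: 'b', length: 2 };
const mapper = (value, index) => `${index}:${value}`;
const mapping = Array.prototype.map.call(arrayLike, mapper);

// foo::bar.baz() after bar.baz has been deleted elsewhere is roughly:
const bar = { baz() { return this * 5; } };
delete bar.baz;

function callDeleted() {
  try {
    return bar.baz.call(2);
  } catch (err) {
    return err.name;
  }
}

console.log(mapping);       // [ '0:a', '1:b' ]
console.log(callDeleted()); // TypeError
```

The identifier bar is still statically known here. Only the property lookup on it fails, and it fails at runtime.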

If you speak in OOP terms you could say that RHS is always a method to be executed in LHS context, :: would create a bound method for you and a method could be own, borrowed or extension method depending on name resolution. Only in the case when bound method resolves to own method of a context object and it is executed immediately it could be simplified to . notation.

Also worth noting: the distinction between property literals and binding identifiers is done at the parsing level. Your proposal would require these two productions to be conflated and resolved at runtime.

No. In module scope it is totally possible to resolve it statically. Runtime resolution would only be needed when code belongs to global scope.

However, they actually have different rules, because property literals may be reserved words and binding identifiers may not. What happens here?

const foo = { default: function() {} };
foo::default();

As per my version it's simple. default is a reserved word and it can not be resolved as an identifier therefore it's a property of foo.
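The asymmetry between property names and binding identifiers is easy to check. In this sketch canDeclare is a hypothetical helper that uses new Function only to test whether a name is accepted as a binding identifier.

```javascript
// A reserved word is a perfectly valid property name.
const foo = { default() { return 'property'; } };
const viaProperty = foo.default.call(foo);

// Hypothetical helper: can `name` be used as a binding identifier?
function canDeclare(name) {
  try {
    new Function(`const ${name} = 1;`);
    return true;
  } catch (err) {
    if (err instanceof SyntaxError) return false;
    throw err;
  }
}

console.log(viaProperty);           // property
console.log(canDeclare('bar'));     // true
console.log(canDeclare('default')); // false
```

So foo::default() could only ever mean the property under the suggested rules, but the parser would still have to accept a reserved word in a position that may otherwise hold a binding identifier.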

I will rephrase the initial post to shift the emphasis to intended use cases and outline gotchas found during discussion.

bathos commented 7 years ago

No. In module scope it is totally possible to resolve it statically. Runtime resolution would only be needed when code belongs to global scope.

If it’s needed somewhere, is it not still needed? It isn’t the resolution of the reference that I was talking about there, but the determination of whether the token is an IdentifierReference or an IdentifierName. That seems to imply changes at the lexing level. My understanding is that the bar for such changes is pretty high.

it's not a bigger footgun than the unary form.

Sure it is. In the unary form, the property being absent is just our good old friend, undefined is not a function. That may not be the world’s greatest error message, but everybody knows what it means and it’s clear what went wrong. In this proposal, the reference might not become undefined; it might be a different function!

Altogether, while you’ve proposed solutions for lots of the hairy aspects of this, I’m afraid it ends up sounding like a cluster of too many exceptional cases (it does foo unless bar except when baz...). I would suggest that a viable proposal:

makes a minimum of changes to effect the simplest initial form of the feature, which can then always be extended later according to needs revealed by real world experience
makes no changes to existing behaviors
maintains consistency with existing patterns in the language, even in cases where this may be less than ideal. predictability is really important (cf. PHP :))

I would point to this comment, where it was explained that the pushback from TC39 to date has partly been due to concern about making property access more confusing to people. Since this adds a new concept of "conditional" property access — almost like the deprecated with statement, really — I can’t see it gaining traction in this form.

andyearnshaw commented 7 years ago

@InvictusMB

What I do not agree is that unary form in current proposal is good.

Then, actually, we do agree. I agree with your reasons for why the unary form isn't great. I just don't agree that butchering the binary form is the correct way to fix the problem. In all honesty, I'd rather see the unary form held back and just keep the binary form.

InvictusMB commented 7 years ago

@andyearnshaw I don't like the unary form but I believe the use case for it has more value. Fixing console.log.bind(console) to me seems more important than making bar.bind(foo) more fancy. I would rather have method access with binding than just a shortcut for binding. So if this version appears to be too complex and having alternating behavior for :: is too weird, I would then suggest having :: for method extraction and something like ::@ for method extension. And then the following code

function baz(){}

//extension
foo
  ::bar()
  ::@baz()
  ::bang()

//extraction
const boundBar = foo::bar
//binding
const boundBaz = foo::@baz

is still quite expressive and clearly states all the intents.
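A possible reading of that split in today's syntax, where :: always looks up a property of the LHS and ::@ always looks up an identifier in scope. This is a sketch under that assumption; the log array is added only so the call order can be observed.

```javascript
// Extension function from lexical scope.
function baz() { this.log.push('baz'); return this; }

const foo = {
  log: [],
  bar() { this.log.push('bar'); return this; },
  bang() { this.log.push('bang'); return this; }
};

// foo::bar()::@baz()::bang() would read roughly as:
const afterBar = foo.bar.call(foo);            // :: property of the receiver
const afterBaz = baz.call(afterBar);           // ::@ identifier from scope
const result = afterBaz.bang.call(afterBaz);   // :: property of the receiver

// Extraction and binding.
const boundBar = foo.bar.bind(foo);            // foo::bar
const boundBaz = baz.bind(foo);                // foo::@baz

console.log(foo.log); // [ 'bar', 'baz', 'bang' ]
```

With two distinct tokens, adding or removing a declaration of bar in scope can no longer change what foo::bar means.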

dead-claudia commented 7 years ago

Highly informative past discussion on this: #26. Note that it's a year old, but it should be a great read.

InvictusMB commented 7 years ago

I will close this in favor of a more fine grained opinion. The outcome of this thread in context of this suggestion I would summarize as

there is a need for context pipelining operator
I consider :: to be suitable candidate because of scope resolution semantics
RHS of :: should resolve to instance methods of LHS
:: should focus only on context pipelining i.e. passing this to RHS
method extension should be a separate concern and a separate operator

Examples of interpreting the pipelining

foo::bar should be interpreted as foo.bar.bind(foo)
foo::bar(a, b, c) should be interpreted as foo.bar.bind(foo)(a, b, c) which can be optimized to foo.bar.call(foo, a, b, c)

foo::bar(a, b)::baz(c, d)::bang(e, f) should be interpreted as

(piped => piped.bang.call(piped, e, f))(
  (piped => piped.baz.call(piped, c, d))(
    foo.bar.call(foo, a, b)
  )
)

or with little help from lodash

_.flow(
  piped => piped.bar.call(piped, a, b),
  piped => piped.baz.call(piped, c, d),
  piped => piped.bang.call(piped, e, f)
)(foo)
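A self-contained version of the same pipeline that runs without lodash. In this sketch flow is a minimal local stand-in for _.flow, and foo with its trace array is made up for the illustration.

```javascript
// Minimal stand-in for lodash's _.flow: left-to-right function composition.
const flow = (...fns) => (input) => fns.reduce((acc, fn) => fn(acc), input);

// Hypothetical receiver whose methods record their arguments and return this.
const foo = {
  trace: [],
  bar(a, b) { this.trace.push(`bar(${a},${b})`); return this; },
  baz(c, d) { this.trace.push(`baz(${c},${d})`); return this; },
  bang(e, f) { this.trace.push(`bang(${e},${f})`); return this; }
};

// foo::bar(1, 2)::baz(3, 4)::bang(5, 6) would read roughly as:
const piped = flow(
  (ctx) => ctx.bar.call(ctx, 1, 2),
  (ctx) => ctx.baz.call(ctx, 3, 4),
  (ctx) => ctx.bang.call(ctx, 5, 6)
)(foo);

console.log(piped === foo);        // true
console.log(foo.trace.join(' ')); // bar(1,2) baz(3,4) bang(5,6)
```

Each step receives the result of the previous call as its context, which is the context pipelining described in the summary.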