tc39 / proposal-bind-operator

This-Binding Syntax for ECMAScript
1.74k stars 30 forks source link

Make binding (mostly) idempotent #46

Open dead-claudia opened 7 years ago

dead-claudia commented 7 years ago

As mentioned previously in multiple places here previously, the following behavior is highly unintuitive:

const obj = {foo() { return this }}
function bar() { return this }

assert(::obj.foo !== ::obj.foo)
assert(obj::bar !== obj::bar)

Instead of trying to ditch it entirely, here's my proposed fix: create a global per-realm 2D weak map to memoize bound functions. It's harder to transpile, but engine support won't have to make significant changes to accommodate this. Why am I suggesting this?

This retains some of the existing benefits of the existing proposal with its unity, while removing one of the key gotchas people have complained about. In addition, it allows better engine optimization, because in practice, the weak map is merely an implementation detail:

So, what do you all think about making it idempotent per-realm?

bathos commented 7 years ago

This comes to mind as something I think is pretty weird, if I’m understanding the proposal correctly...

module-a

export const foo = { bar() { /*...*/ } };
(::foo.bar).qux = 4;

module-b

import { foo } from './module-a';

assert(foo.bar.qux === undefined);

const boundBar = ::foo.bar;

assert(boundBar.qux === 4);

I don’t think there’s any precedent for being able to pull objects out of the ether without having a prior reference like this, is there?

(Also, if they’re GCable, does that mean they are GCable when both the bound and the original function are no longer reffed, or in general? Because in the latter case, boundBar.qux might be 4 or it might be undefined — nondetermistically.)

andyearnshaw commented 7 years ago

@bathos that was my first thought too and I think that would be an important blocker for this idea. I don't see how it can be reliably transpiled either, without introducing a common global variable that all transpilers agreed upon (or accept that it's not really per-realm when you transpile it).

I'd rather see it implemented per scope instead of global/per-realm, e.g.:

module-a

import foo from './module-c';
import qux from './module-b';

qux(::foo.bar);
assert(::foo.bar !== ::foo.bar); // false

module-b

import foo from './module-c';

export default function qux (fn) {
    assert(::foo.bar !== fn); // true
};

Essentially, module-a would transpile/desugar to:

import foo from './module-c';
import qux from './module-b';

const __fooBarBound = foo.bar.bind(foo);
qux(__fooBarBound);
assert(__fooBarBound !== __fooBarBound); // false

I think this could work because it covers the biggest footgun and major use case for memoization, unsubscribe functions. It also gets the same GC benefits and is super easy to transpile to engines that don't support WeakMap.

bathos commented 7 years ago

That makes more sense to me, I think that’s a lot safer and simpler, and realm-level globalness isn’t actually solving anything I think; 99% of the time, the event handler patterns in question would occur within a single function scope.

But my gut still says this is solving a footgun related to obscure semantics with more obscure semantics. I don’t think this is a good idea mainly because it creates such a unique case, a category of behavior which doesn’t fit into any prior pattern in the language. I probably can’t articulate a better position than that, though ... maybe I’m just thinking too conservative.

Volune commented 7 years ago

Two cases I think may not work with the per-scope implementation:

const obj = { a() {}, b() {} );
const a = ::obj.a;
obj.a = obj.b;
const b = ::obj.a;
assert(a !== b);

const obj = { a() {}, b() {} }, boundObj = {};
Object.keys(obj).forEach((key) => {
  if (typeof obj[key] === 'function') {
    boundObj[key] = ::obj[key];
  }
});
Object.keys(obj).forEach((key) => {
  if (typeof obj[key] === 'function') {
    assert(boundObj[key] === ::obj[key]);
  }
});
Alxandr commented 7 years ago

@Volune that depends on how you do the per-scope implementations though. If you do a weakmap of some-kind, the name won't matter. Something like this maybe (not tested)?

// input:
const foo = ::console.log;
// result
const __boundFn = (() => {
  // helper, maybe even in babel-polyfill or whatever
  const cache = new WeakMap();
  return (obj, fn) => {
    let fns;
    if (cache.has(obj)) {
      fns = cache.get(obj);
    } else {
      fns = new WeakMap();
      cache.set(obj, fns);
    }

    if (fns.contains(fn)) {
      return fns.get(fn);
    } else {
      bound = (...args) => fn.call(obj, args);
      fns.set(fn, bound);
      return bound;
    }
  };
})();

const foo = __boundFn(console, console.log);
andyearnshaw commented 7 years ago

Indeed, the variable-style desugaring was an oversimplification. The implementation details would be left up to the vendor. The runtime implications do make it more difficult to transpile without a WeakMap though.

I'm partially in agreement with https://github.com/tc39/proposal-bind-operator/issues/46#issuecomment-282018521, though. I'm not sure if these semantics are worth the extra effort and complexity.

dead-claudia commented 7 years ago

Note that the per-scope thing is purely an optimization. The spec would be in terms of a per-realm 2D weak map, with entries added based on the realm each bound function is created. So basically, these would all work as expected:

Same module:

const foo = {bar() {}}
assert.equal(::foo.bar, ::foo.bar)

Different modules, same realm:

// ./a.js
export const foo = {bar() {}}
export const bound = ::foo.bar

// ./b.js
import {foo, bound} from "./a"
assert.equal(::foo.bar, bound)

Same module, different realm:

const vm = require("vm")

const foo = {bar() {}}
const bound = vm.runInNewContext("::foo.bar", foo)
assert.notEqual(::foo.bar, bound)

Different modules, different realm:

// ./a.js
const vm = require("vm")

export const foo = {bar() {}}
export const bound = vm.runInNewContext("::foo.bar", foo)

// ./b.js
import {foo, bound} from "./a"
assert.notEqual(::foo.bar, bound)

As for what Babel and friends will do, they'll need this helper function. (If only one version of this helper is used, then you've got nearly 100% spec compliance.)

// Instead of calling `func.bind(inst)`, it'll call `bind(func, inst)` from here
var wm = new WeakMap()
function bind(func, inst) {
    // Don't memoize for non-primitive `this`
    if (inst == null || typeof inst !== "object" && typeof inst !== "function") {
        return func.bind(inst)
    }
    var child = wm.get(func)
    if (child == null) wm.set(func, child = new WeakMap())
    var ref = child.get(inst)
    if (ref == null) child.set(inst, ref = func.bind(inst))
    return ref
}

Hopefully that better clarifies what I mean.

dead-claudia commented 7 years ago

In spec language, it'd look like this:

6.1.7.4 Well Known Intrinsic Objects

Add the following values to the list of well known intrinsic objects:

8.2 Realms

Add a [[BoundFunctions]] field, a %WeakMap% instance, to the list of realm record fields.

Note that the [[BoundFunctions]] field, as well as its contents, is never exposed to user code, so an implementation might avoid creating it.

8.2.1 CreateRealm()

Add the following right before the last step:

Note that the [[BoundFunctions]] field, as well as its contents, is never exposed to user code, so an implementation might avoid creating it.

Abstract Operation: InitializeBoundFunction(target, thisValue)

Note that the [[BoundFunctions]] field, as well as its contents, is never exposed to user code, so an implementation might avoid creating it.

InvictusMB commented 7 years ago

@isiahmeadows Why not store a reference to a bound instance of function on the original instance of function itself? Why introduce a global shared state for this?

dead-claudia commented 7 years ago

@InvictusMB You could, but it'd still require a single weak map. I used a global WeakMap for the first layer of indirection as a lazy way to mock up the semantics, but you could add a per-function field.

Either way is transparent to the user, so engines could even use a std::unordered_map on the object if they wanted.

wearhere commented 7 years ago

I'd like to vote for indexing at the top level by object rather than by function, i.e. adopt an implementation closer to https://github.com/tc39/proposal-bind-operator/issues/46#issuecomment-282051644 than the snippet at the bottom of https://github.com/tc39/proposal-bind-operator/issues/46#issuecomment-282509203; or, if folks didn't want to introduce global state, using per-object fields rather than per-function fields.

I suspect this will reduce the number of maps, because—at least when binding functions for use with React—it's common to bind multiple functions to the same object, but less common to bind the same function to multiple objects.

wearhere commented 7 years ago

Hi folks, I have a WIP Babel plugin that forks https://github.com/babel/babel/tree/master/packages/babel-plugin-transform-function-bind to roughly implement https://github.com/tc39/proposal-bind-operator/issues/46#issuecomment-282051644, you can see here. More commentary on the Gist, feedback welcome!

dead-claudia commented 7 years ago

I like that, and it generally looks what I had intended with my suggestion.

acutmore commented 6 years ago

Perhaps the cached function should be frozen (Object.freeze) so the caller can not observe/introduce side-effects made permissible by the idempotence.

I think idempotence is crucial for this operator to work intuitively. Especially for the add/remove eventListener example:

addEventListener('foo', obj::bar);
removeEventListener('foo', obj::bar); // would leak listeners if not idempotence
// ref: https://github.com/tc39/proposal-bind-operator/issues/44#issue-203206592

On the other hand, the points about this being observable to the caller when the function is modified are also very good.

zenparsing commented 6 years ago

It would definitely have to be frozen (kind of like template literal call site objects) in order to satisfy some object-security requirements.

On the other hand, the points about this being observable to the caller when the function is modified are also very good.

You would have to use a WeakMap to store data related to the bound function.

dead-claudia commented 6 years ago

@zenparsing The weak map requirement was stated even in the initial comment of this issue. Just thought I'd point that out.

@acutmore I didn't initially think of idempotence, but that sounds like a good idea. If we also allow this, it also allows engines to create flyweights that are ===, but don't share the same memory address under the hood, so they don't incur the map overhead. (This has precedent in the form of strings - they could just return exotic auto-frozen callable objects.)

const bound1 = obj::bound
obj.bound = () => ...
const bound2 = obj::bound
assert(bound1 !== bound2)