tc39 / ecma262

Status, process, and documents for ECMA-262
https://tc39.es/ecma262/
Other
15.03k stars 1.28k forks source link

Typed array constructors can't be proxied (because they are unnecessarily weird) #163

Closed rossberg closed 8 years ago

rossberg commented 8 years ago

Consider

let P = new Proxy(Int8Array)
let a = new P

You might not have guessed, but this will throw a type error. The reason is the, um, creative way in which the %TypedArray% base constructor tries to figure out what typed array instance to create, namely, by re-traversing the prototype chain from newTarget up to itself, looking for the closest [[TypedArrayConstructorName]] internal slot -- but that will never hit the Int8Array constructor in the example above, because the algorithm (22.2.1.2.1 AllocateTypedArray) skips from the proxy to its target's prototype directly.

More generally, I argue that this semantics is fundamentally bogus. It doesn't just break proxying typed array constructors, the prototype walk also is observable through proxies in other places of the prototype chain -- which might change the very prototype chain through side effects while you go.

More importantly, it breaks a highly desirable invariant for object construction: preferably, constructing an object should only be keyed on two inputs: (1) the base constructor (for internal fields), and (2) new target (for [[Prototype]]). With the current walk, it is keyed on an arbitrarily large set of objects, plus potentially random computations. That makes optimisation much harder: even to merely decide whether we can take a fast path for construction, we already have to look at the entire prototype chain. Under this semantics, instantiating (subclassed) typed arrays is inherently penalised.

So, as we all know, prototype chain walks are a bad and utterly unreliable tool for figuring out instance relations in JavaScript. The spec should better avoid relying on it for its innards. :)

Fortunately, we could avoid it easily for typed arrays. I propose that instead of having the base constructor re-walk the prototype chain, it should take the derived typed array constructor as an argument and read the [[TypedArrayConstructorName]] slot from there. Then proxying would work just fine, and the aforementioned invariants would be maintained. It’s also simpler. While this technically is a breaking change, it is unlikely that code in the wild currently invokes the %TypedArray% base constructor directly (in part because not all browsers implement it yet), so the compatibility hazard seems neglectable.

allenwb commented 8 years ago

First off, it shouldn't be surprising that creating a Proxy with default handlers on any object breaks. By now, we should all understand that default proxies are not transparent forwarders and in particular object that internally depends upon private state or identify are going to be broken by such proxies. If you want transparent forwarding you need to define appropriate handlers to turn the Proxy into a membrane.

I also don't agree with your assertions about "goodness" and "badness" of walking the [[Prototype]] chain. If you are doing structural subclassing to share private implementation details (and that is what we are doing here) you are indeed dependent upon the integrity of the [[Prototype]] chain. Heck, you had to do super constructor calls up the [[Prototype]] chain in order to access the base constructor. While is is possible for the [[GetPrototypeOf]] call in the second walk to follow a different path, why do you care. As specified it is an error if it doesn't lead to one of the built-in typed array constructors (any object with a [[TypedArrayConstructorName]] internal slot. I would also expect you to special case the most common cases (the newTarget is one of the built-ins or a non-proxy immediate subclass of one of the built-ins).

Also note that what is happening here is essentially the same as a this down call in an inherited method -- ia very common OO practice. The "inherited" constructor is essentially doing a this.getElementTypeName() (where this is the original constructor) except that constructor invocation requires the use of newTarget instead of this and we are using a private internal slots to preclude external tampering.

That said, your alternative design isn't bad. But the overloading of typed array constructors does lead to some complications. (The primary reason, for putting the typed array constructor logic on %TypedArray% was so that the overload processing could be specified (and implemented) in one place, rather than being replicated in each concrete typed array constructor.) So. oif we add an extra arguments to the %TypedArray% constructor each of the concrete subclasses would need to look something like this ES level equivalent:

class Int8Array extends $TypedArray$ {
   constructor(...args) {return super(Int8Array, ...args)} //over-load resolution applied to ...args
}

(or the first argument to %TypedArray% could be any token (eg, a string) that it can interpret as a element type selector).

I guess, my main dislike of this, is that it means that the signature of %TypedArray% can't used to directly used to describe the signatures of all of its concrete built-in subclass constructors.

B0ttom line, while there are other was the behavior could have been specified I do see any problem that requires making a change at this point. If this was 6+ months prior to ES6 publication then maybe I would be more easily convinced.

Allen

rossberg commented 8 years ago

@allenwb, regarding your first point, you cannot even provide (just) a custom handler to fix this. You are actually forced to create an auxiliary intermediate object to use as target, just for the purpose of satisfying this very specific proto walk.

I would also respectfully disagree that this kind of proto walk is normal OO practice. Unlike usual dispatch, this looks for the last, not the first occurrence (more specifically, the last in a certain range of the inheritance chain, which is even more odd).

In my mind, the fact that you already had to walk the prototype chain for the super calls is very much an argument not for but against doing it again. In particular, since the chain might have changed in the meantime, leading to real odd corner cases that I'd very much like to avoid (not least because it makes optimising this unnecessarily difficult -- I do care because the VM has no choice but to care; sure, you can introduce half a dozen more special cases to cover the common case, but please.).

As for this being a late change, I agree. But we all know that the ES6 process was broken and we had to buy into a lot of completely untested semantics whose implications nobody could fully oversee. As long as something is not web reality (which in this case it isn't), I think we should allow ourselves the liberty to tweak certain design decisions. Especially in a very technical case like this.

I see your point regarding the signature of %TypedArray%, but I think that is a fairly minor disadvantage. It's not like many people will ever program against %TypedArray% directly anyways.

allenwb commented 8 years ago

regarding your first point, you cannot even provide (just) a custom handler to fix this...

Something like this should work

function makeConstructorProxy(target, handlers) {
   let p;
   if (!handlers.construct) {
      handers.construct = function (htarget, args, newTarget) {
         if (newTarget === p) newTarget = target;
         return Reflect.construct(htarget, args, newTarget);
      }
   }
   return p=new Proxy(target, handlers);
}

Unlike usual dispatch, this looks for the last, not the first occurrence

I guess I don't understand what you are saying here. In constructors (which don't take a this argument) newTarget serves the role of this. The second walk you are talking about is the moral equivalent of doing:

constructorName = this.typedArrayConstructorName;  //actually newTarget instead of this

which is a completely normal OO thing to do. The only difference is that in the spec. we do an explicit lookup for a tamper-proof internal slot. Dynamically changing the [[Prototype]] chain of objects that have inflight method calls is a bad thing to do. But that possibility isn't unique to the %TypedArray% constructor.

Allen

rossberg commented 8 years ago

regarding your first point, you cannot even provide (just) a custom handler to fix this...

Something like this should work

function makeConstructorProxy(target, handlers) { let p; if (!handlers.construct) { handers.construct = function (htarget, args, newTarget) { if (newTarget === p) newTarget = target; return Reflect.construct(htarget, args, newTarget); } } return p=new Proxy(target, handlers); }

That clearly breaks inheritance:

const PInt8Array = makeConstructorProxy(Int8Array, {})
class MyInt8Array extends PInt8Array { m() {} }
let a = new MyInt8Array(10)
a.m()  // TypeError

I don't think there is any solution that does not involve creating an auxiliary target object.

Unlike usual dispatch, this looks for the last, not the first occurrence

I guess I don't understand what you are saying here. In constructors (which don't take a this argument) newTarget serves the role of this. The second walk you are talking about is the moral equivalent of doing:

constructorName = this.typedArrayConstructorName; //actually newTarget instead of this

which is a completely normal OO thing to do. The only difference is that in the spec. we do an explicit lookup for a tamper-proof internal slot.

Searching for internal slots on the prototype chain is unlike any other use of internal slots in ES -- no others are "inherited", AFAICT. And this sort of (class-side) inheritance of internal slots is exactly what's breaking proxies, because proxies and private inheritance don't interact well.

Also, IMHO, this semantics does not model tamper-proof subclassing of %TypedArray% properly. If I did

let a = Reflect.construct(Int16Array, [10], Int32Array)

then a would be an Int32Array instance, although I was invoking the Int16Array constructor. IOW, I can break the inheritance between %TypedArray% and their immediate subclasses -- even if both objects were frozen! (You can get similar effects without Reflect by tampering with a(nother) typed array's [[Prototype]] slot.)

Dynamically changing the [[Prototype]] chain of objects that have inflight method calls is a bad thing to do. But that possibility isn't unique to the %TypedArray% constructor.

I agree. However, it is made worse by a semantics that walks the chain twice, because that introduces a whole new degree of possible incoherence that the semantics and the VM has to worry about.

littledan commented 8 years ago

Resolution at TC39 (suggestion of @allenwb): Don't walk the chain, instead make an internal algorithm to do the construction, and call it from each TypedArray constructor directly, rather than having a super constructor and proto chain walk. Pull requests welcome.

bterlson commented 8 years ago

This seems good to me. Hopefully @rossberg-chromium agrees :)

rossberg commented 8 years ago

Sounds good! Does that mean we eliminate the %TypedArray% constructor entirely? If so, sounds good, too.

littledan commented 8 years ago

We didn't discuss that detail in the meeting. Would it have anything left to do if it stayed?

No one in the meeting signed up to actually write the pull request in the spec. @rossberg-chromium if you write it, that'll give a good starting point for the final shape on questions like this that we all could refine from there.

allenwb commented 8 years ago

%TypedArray% still needs to be there as it is the holder of the static methods that are common to all the concrete Typed Array classes.

%TypedArray% is essentially an abstract class. Strictly speaking it doesn't have to be a function object, but I would leave it as such so that a self hosted implementation can still use a class declaration to define it.

Note that future private-state support may require subclasses to super() through such abstract constructors in order to allocate subclass instances. Also we still don't want people directly calling or newing %TypedArray%. So, given all that, I would define the %TypedArray% function body very similarly to the definition current given for the individual TypedArray constructors. The only difference is I would add an extra step after step 2 that prevents creating direct instances of %TypedArray%. That step would be:

2.5 If SameValue(NewTarget, here) is true, throw a TypeError exception.

Finally, there is going to have to be quite a bit of editorial restructuring if the distinct over-loads currently defined for %TypedArray% are moved into a single abstract operation. I'm concerned that it could make it harder for a spec. reader to see the nature of the overloads. We should probably get Brian involved to think about how to best present the new structure.

littledan commented 8 years ago

@allenwb How should the %TypedArray% constructor resolve the original issue related to the prototype chain walk that started the thread, if we keep nontrivial behavior in it?

littledan commented 8 years ago

Or, are you saying, it should be a trivial constructor that does nothing, basically what happens in a class with no defined constructor

allenwb commented 8 years ago

The latter, a trivial constructor. Except that it doesn't support direct new invocation (which is what the 2.5 step would filter out).

But note that the actual concrete TypedArray constructors (Int32Array, etc.) won't super call it. Instead they will call out to a new abstract operation that does all the work of the current %TypedArray% constructor specification

littledan commented 8 years ago

Sounds good to me.

rossberg commented 8 years ago

SGTM too.

littledan commented 8 years ago

Unless someone else wants to do it, I'll sign up to draft the new spec text.