Closed ljharb closed 1 year ago
Something I noticed is with the (existing ECMA262) spec text that says:
A reference such as %name.a.b% means, as if the "b" property of the "a" property of the intrinsic object %name% was accessed prior to any ECMAScript code being evaluated.
One thing that is not clear here is what to do about getters/setters. The spec currently specifies a number of getters, such as Symbol.prototype.description or RegExp.prototype.flags. There are currently no setters, but it would be best to be future-proof in case any are added.
There's also __proto__, but assuming we handle getters/setters appropriately (e.g. by just returning an object with get/set functions as properties or something like that), then getIntrinsic("Object.prototype.__proto__") should behave like any other getter/setter (in environments that implement __proto__ at all).
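To make the "object with get/set functions" idea concrete, here is a minimal sketch. It assumes the result would mirror the accessor half of a property descriptor; `getAccessorPair` is a hypothetical helper, not anything the proposal defines.

```javascript
// Hypothetical helper (an assumption for illustration): return the accessor
// pair for an own property without ever invoking the getter.
function getAccessorPair(object, key) {
  const desc = Object.getOwnPropertyDescriptor(object, key);
  // Missing property or plain data property: there is no accessor pair.
  if (desc === undefined || "value" in desc) return undefined;
  return { get: desc.get, set: desc.set };
}

const description = getAccessorPair(Symbol.prototype, "description");
const flags = getAccessorPair(RegExp.prototype, "flags");
// In environments without __proto__ this is simply undefined.
const dunderProto = getAccessorPair(Object.prototype, "__proto__");

console.log(description.set);                   // undefined (getter only)
console.log(description.get.call(Symbol("x"))); // "x"
console.log(flags.get.call(/a/gi));             // "gi"
```

The extracted getter only produces a value when it is called with a suitable receiver, which is why handing back the function itself is safer than evaluating the property.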
Oh, we also need to consider what happens with values that are in a prototype chain, e.g. does getIntrinsic("Array.hasOwnProperty") work? The current spec text is too vague.
By allowing it, we potentially allow many ways to access a given intrinsic; this may or may not be desirable.
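The prototype-chain question can be illustrated with plain reflection. This is only a sketch of the distinction between own and inherited properties, not of how getIntrinsic would behave.

```javascript
const hasOwn = Object.prototype.hasOwnProperty;

// Array does not define hasOwnProperty itself...
const ownOnArray = hasOwn.call(Array, "hasOwnProperty");
// ...it inherits the one that lives on Object.prototype.
const sameFunction = Array.hasOwnProperty === Object.prototype.hasOwnProperty;
const ownOnObjectPrototype = hasOwn.call(Object.prototype, "hasOwnProperty");

console.log(ownOnArray, sameFunction, ownOnObjectPrototype); // false true true
```

If inherited lookups were allowed, "Array.hasOwnProperty" and "Object.prototype.hasOwnProperty" would be two addresses for the same function.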
This is particularly tricky with intrinsics that may be exposed in new ways over time. For example, currently there is %IteratorPrototype%, but the iterator helpers proposal will add %Iterator% directly, which means we'd expect getIntrinsic("Iterator.prototype") === getIntrinsic("IteratorPrototype").
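The expected identity can be checked today without any new API. The sketch below reaches %IteratorPrototype% by walking up from an array iterator, then compares it with Iterator.prototype only if the host already ships an Iterator global.

```javascript
// %ArrayIteratorPrototype% inherits directly from %IteratorPrototype%.
const ArrayIteratorPrototype = Object.getPrototypeOf([][Symbol.iterator]());
const IteratorPrototype = Object.getPrototypeOf(ArrayIteratorPrototype);

// `typeof` is safe even when the Iterator global does not exist.
const viaGlobal = typeof Iterator === "function" ? Iterator.prototype : undefined;
const consistent = viaGlobal === undefined || viaGlobal === IteratorPrototype;

console.log(consistent);
```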
Because of this sort of thing, we might want to clean up the intrinsic table a bit to be more consistent. I'm not sure what would be best to do here, but perhaps something like introducing %ConstructorPrototype% for each %Constructor% (the converse would not necessarily exist, e.g. there's still no %StringIterator% even though there is %StringIteratorPrototype%).
On Symbol methods, the intention is definitely to allow something like [Symbol.iterator]; this will also have to be thought through, since the only convention in the spec for it is e.g. %String.prototype[@@iterator]%, and the double-at isn’t something that should be normative.
For getters, indeed we’ll need to pull out the getter function automatically, thanks for calling that out.
Values in a prototype chain do not work, because Array.hasOwnProperty is not an intrinsic.
The “FooPrototype” intrinsic notation is legacy and will not be supported by this API.
This sounds fine to me, but what should be done with APIs that have no corresponding global "constructor"?
For example, while %String.prototype[@@iterator]% can be used to refer to %StringIteratorPrototype%, how would we refer to something like %IteratorPrototype% today? Would we just allow %Iterator.prototype% even though the Iterator global doesn't exist (yet)?
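As a small illustration of why these objects are awkward to address: in user code, %StringIteratorPrototype% is not reachable by a property path alone. The sketch below has to call the iterator method and then ask for the result's prototype.

```javascript
// The method itself is addressable as a property...
const stringIteratorMethod = String.prototype[Symbol.iterator];
// ...but its prototype object is only reachable by running it.
const StringIteratorPrototype = Object.getPrototypeOf(stringIteratorMethod.call("abc"));

console.log(StringIteratorPrototype[Symbol.toStringTag]); // "String Iterator"
console.log(typeof StringIterator);                       // "undefined": no such global
```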
Although Iterator/AsyncIterator are currently the only public cases, the iterator helper proposal actually exposes AsyncFromSyncIteratorPrototype objects properly to user code (e.g. AsyncIterator.from can return an object inheriting from AsyncFromSyncIteratorPrototype), so the hazard would still exist.
That's a good point. Prior to Iterator being a global, indeed we'd need to support IteratorPrototype.
Looks like I'll need to audit the entire list of intrinsics and figure out what makes sense to support or not.
All the lengths and names, we don’t need to worry about.
That makes sense, I figured they wouldn’t be included. The list is showing every distinct path except for those which, if followed further, would cycle infinitely (in which case it only expands the shortest+alphabetically earliest). I didn’t know if primitive-value data properties would be considered addressable or if they would be but only selectively:
// Many of the properties are not references to objects, e.g. every function can
// reach "name" and "length". Perhaps these would be omitted, but note that if
// only objects are addressable, values of properties like Math.LOG2E and
// Int8Array.BYTES_PER_ELEMENT wouldn’t be included. A primitive value cannot be
// “intrinsic” per se, but the _property_ is, so this might be surprising.
Properties of top-level namespaces are all included, but otherwise I don’t think any intrinsics are non-objects.
We’d also not traverse through any .constructor or .__proto__ properties (and probably wouldn’t support dunder proto at all).
Cool. I assumed no __proto__ traversal, but more broadly I assumed there’d be no “accessing accessors” with Get() at all (i.e. you can get the getter or setter, but cannot execute the getter). Most of those would throw TypeErrors anyway.
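The TypeError point is easy to reproduce. In this sketch, `tryGet` is a hypothetical helper that performs an ordinary property read and captures any exception; Map.prototype.size is used as the example getter.

```javascript
// Hypothetical helper: do a normal property read and capture any exception.
function tryGet(object, key) {
  try {
    return { ok: true, value: object[key] };
  } catch (error) {
    return { ok: false, error };
  }
}

// A plain read runs the getter with Map.prototype as receiver, which is not a Map.
const viaGet = tryGet(Map.prototype, "size");
// Reading the descriptor returns the getter function without running it.
const sizeGetter = Object.getOwnPropertyDescriptor(Map.prototype, "size").get;

console.log(viaGet.ok, viaGet.error instanceof TypeError); // false true
console.log(sizeGetter.call(new Map([[1, 2]])));           // 1
```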
Is it possible/desirable for there to just be a finite set of “addresses”? It seems there are currently around 600 unique intrinsic objects. With traversing constructor (and __proto__) out, so are the great majority of possible graph reentries. Even if objects/values don’t have a single “canonical” name (because of e.g. values vs Symbol.iterator), it seems you can still end up with a static “collection” that doesn’t require actual parsing because there are only so many possible keys (with most being 1:1).
I'd prefer to find an algorithmic way of describing the list, rather than hardcoding it.
That makes sense — but afaict, an algorithmic explanation for what addresses exist can still produce a finite set of possible keys, obviating the need for (an implementation) to really parse & interpret the input as a DSL (even though it would act like one).
So the rules perhaps could be:
Traversal is:
Any object not accessed through "constructor" can be traversed into.
Thoughts?
Sounds great! Later I’ll try adjusting the traversal script I used to verify that it still reaches all of the same objects as it did with the more naive pass (modulo non-standard members) when those rules are (effectively) in play — but I’m pretty sure what you described does the trick. The resulting set of address keys should be both intuitive & “incidentally” finite.
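For readers who want to see how such rules yield a finite set, here is a rough sketch of a traversal. It is not the script mentioned above: `collectAddresses`, the dotted/bracketed path format, and the choice to record only the first path to each object are all assumptions made for illustration. It skips "constructor" and "__proto__", never invokes accessors, and only descends into objects and functions.

```javascript
// Hypothetical sketch: collect "addresses" for objects reachable from a root.
function collectAddresses(rootName, root) {
  const seen = new Set([root]);  // each object is recorded once, so the walk terminates
  const addresses = new Map();   // path string -> object
  const queue = [[rootName, root]];

  while (queue.length > 0) {
    const [path, object] = queue.shift();
    for (const key of Reflect.ownKeys(object)) {
      if (key === "constructor" || key === "__proto__") continue;
      const desc = Object.getOwnPropertyDescriptor(object, key);
      if (!("value" in desc)) continue; // accessor property: never run the getter
      const value = desc.value;
      const isObject =
        (typeof value === "object" && value !== null) || typeof value === "function";
      if (!isObject || seen.has(value)) continue; // skip primitives and revisits
      seen.add(value);
      const segment = typeof key === "symbol" ? `[${key.description}]` : `.${key}`;
      const childPath = path + segment;
      addresses.set(childPath, value);
      queue.push([childPath, value]);
    }
  }
  return addresses;
}

const arrayAddresses = collectAddresses("Array", Array);
console.log(arrayAddresses.has("Array.prototype.map"));         // true
console.log(arrayAddresses.has("Array.prototype.constructor")); // false
```

Because only the first path to an object is kept, aliases such as Array.prototype[Symbol.iterator] (the same function as Array.prototype.values) do not get their own entry here; a real registry would have to decide whether aliases are valid addresses.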
For reference, Endo/SES has some logic to crawl the global and remove anything that isn't in an allowlist which takes a similar form to a registry of intrinsics.
The spec text here needs to be better; I'm not sure how yet.