Closed rricard closed 2 years ago
@littledan - can we verify if we still want to do this?
Does it mean:
const x = new WeakMap()
x.set(Box(Object), 1)
x.get(Object) === 1
No, Box(Object)
and Object
would have different identities.
const x = new WeakMap()
x.set(Box(Object), 1)
x.set(Object, 2);
x.get(Object) === 2
z.get(Box(Object)) === 1
WeakMap would throw presumably, on any of the unboxed objects being passed in?
Why Box
has it's own identity? Does it mean Box(Object) !== Box(Object)
?
Why
Box
has it's own identity? Does it meanBox(Object) !== Box(Object)
?
Box(x) === Box(x)
(when x === x)
EDIT: The current Box equality spec uses SameValueZero so Box(NaN) === Box(NaN); // true
@acutmore just curious; Box(NaN) === Box(NaN)
?
If you have
let x = {};
Box(x) === Box(x); // true
then you can't make Box usable as a WeakMap key. (Because then putting one in a WeakMap would prevent it from ever being GC'd.)
@bakkot could you clarify why not? If the program can no longer observe x
, it can no longer make a box for x
. Which means there is nothing preventing the implementation from collecting x
and Box(x)
in weak collections.
@acutmore just curious;
Box(NaN) === Box(NaN)
?
Box(x) === Box(x) // (when x === x)
I edited just after I posted to account for that but wasn't quick enough!
EDIT: The current Box equality spec uses SameValueZero so Box(NaN) === Box(NaN); // true
If the program can no longer observe
x
, it can no longer make a box forx
.
But if it can observe x
then it can't ever clean up a Box(x)
used in a WeakMap. (Or Box(Box(x))
, etc.)
(can you make a Box of a primitive, such that Box(Box(x))
could ever happen?)
But if it can observe
x
then it can't ever clean up aBox(x)
used in a WeakMap. (OrBox(Box(x))
, etc.)
I suppose Box(x)
should only be allowed in a WeakMap if x
would be allowed.
I suppose
Box(x)
should only be allowed in a WeakMap ifx
would be allowed.
Even if x
is just a regular object, holding a reference to x
should not prevent values derived from x
from being collected. By analogy, holding a reference to x
does not prevent an engine from collecting [x]
.
(can you make a Box of a primitive, such that
Box(Box(x))
could ever happen?)
I'd argue that making a Box of a Box should not be allowed but simply return the existing Box.
Aka Box(Box(x)) === Box(x)
, the same way Object(x) === x
when x
is an Object.
That's probably a separate issue.
Why
Box
has it's own identity? Does it meanBox(Object) !== Box(Object)
?
Box(x) === Box(x)
(when x === x)
So why Box
has it's own identity (I think identity means Symbol() !== Symbol()
and {} !== {}
)
Even if
x
is just a regular object, holding a reference tox
should not prevent values derived fromx
from being collected. By analogy, holding a reference tox
does not prevent an engine from collecting[x]
.
The difference is that before Box, derived values had their own identity independent of the value they contain. E.g. [x] !== [x]
. For Box where you can recreate a derived value which has the same identity, collecting the weak collection entry keyed on the box would be observable by the program.
But if it can observe
x
then it can't ever clean up aBox(x)
used in a WeakMap.
To be extra clear, an implementation wouldn't be allowed to collect an entry with Box(x)
as key while x
is observable, but once x
is no longer observable, the collection entry can be collected.
That behavior can actually be implemented in userland by having a custom weak collection which checks if the type of the key is a box, and instead operates on a secondary internal collection with the unboxed value used as key.
For Box where you can recreate a derived value which has the same identity, collecting the weak collection entry keyed on the box would be observable by the program.
Right, exactly. Which is precisely why we shouldn't allow Boxes to be used as WeakMap keys.
So why
Box
has it's own identity (I think identity meansSymbol() !== Symbol()
and{} !== {}
)
Box
doesn't have its own identity, it derives its identity from its internal boxed value, like other primitives (except symbols). The difference with other primitives is that the boxed value may itself have an unforgeable identity (unregistered symbol or object).
That behavior can actually be implemented in userland by having a custom weak collection which checks if the type of the key is a box, and instead operates on a secondary internal collection with the unboxed value used as key.
I wrote an implementation of a WeakMap
supporting boxes even without having built-in support, for who is interested.
class WeakMapSupportingBoxes extends WeakMap {
#weakmaps = Object.create(null);
#unwrap(key) {
let boxesCount = 0;
while (typeof key === "box") {
boxesCount++;
key = key.unbox();
}
return { key, boxesCount };
}
has(k) {
const { key, boxesCount } = this.#unwrap(k);
return this.#weakmaps[boxesCount]?.has(key) ?? false;
}
get(k, v) {
const { key, boxesCount } = this.#unwrap(k);
return this.#weakmaps[boxesCount]?.get(key);
}
set(k, v) {
const { key, boxesCount } = this.#unwrap(v);
(this.#weakmaps[boxesCount] ??= new WeakMap).set(key, v);
return this;
}
delete(k, v) {
const { key, boxesCount } = this.#unwrap(v);
return this.#weakmaps[boxesCount]?.delete(key) ?? false;
}
}
I agree that Box(x)
not being collectable when x
is still reachable should not a blocking problem, since it's effectively the same as having two weakmaps both having x
as a key.
In my mind, the interesting part of this idea is allowing Records and Tuples as WeakMap keys, if they include a Box with an object in it. This capability could generalize @bmeck's compound key proposal. A Record or Tuple would be considered live if all of the objects which it references via Boxes are live.
In support of "the interesting part of this idea is allowing Records and Tuples as WeakMap keys, if they include a Box with an object in it.". A user-land WeakMap that supports this can be constructed (proof-of-concept). So it seems friendly to directly support it in the language level WeakMap, so existing code paths would work with compatible R&Ts.
The one issue with this completely replacing compound keys is that you grant mutable access to the boxed components by using a Record/Tuple.
@bmeck, could this be solved by placing a frozen "token" object inside the box, and using a side WeakMap
table to associate the real object to the token? I know the ergonomics aren't good, but would it technically solve the use case?
That just sounds like a normal side table approach to me. Which as I've stated is possible to make work but isn't trivial to code for once you do things like R->R mapping. Why not just use freshly minted Symbols which could then be re-used across tuples:
const lookupRefs = new WeakMap();
function lookupRefFor(v, table = lookupRefs) {
// only change is here in Symbol vs Object
if (!lookupRefs.has(v)) lookupRefs.set(v, Symbol());
return lookupRefs.get(v);
}
function makeRecord(value_a, value_x_y_z, table = lookupRefs) {
return #{
a: lookupRefFor(value_a, table),
x: {
y: { z: lookupRefFor(value_x_y_z, table) }
}
};
}
// the above kind of side table doesn't allow piecemeal replacement
// if the value of r.a === r.b they are not given unique boxes so in this scheme replacing r.a replaces r.b
// a slightly more flexible would give unique boxes per uniquely modifiable location. The problem can be shown:
let foo = {};
let bar = {};
let record1 = makeRecord(foo, foo);
let record2 = makeRecord(foo, bar);
// replacing the table entry of `foo` replaces all *references*
lookup.set(record1.a, 'I replaced record1.x.y.z and record2.a');
I don't think having an Object
vs Symbol
here is of significant difference personally and symbol has the nice effect of preventing some potential gotchas like private field attachment. I will state that I have done Object.freeze(Object.create(null))
in the past to round trip across iframes w/o leaking the Symbol.prototype
but don't think it matters here. Either way, I do think it is possible but less trivial than people might make out without playing with the design pattern and especially w/ different forms of transforming involving the side tables.
If we end up with Box
as a primitive, I was thinking that Box(Object.freeze(Object.create(null)))
would be a nice way to have something semantically equivalent to a Symbol()
without requiring all Symbols as WeakMap keys, and the whole memory leak can of worms. As a new type, we can mandate that only R/T/B that contain a nested object be allowed in weak collections (as long as a built-in predicate exists to test this).
But I now realize I don't understand your use case. Do you have any pointers to a draft proposal or examples?
@mhofman I don't have any simple/reduced cases in a proposal but I can make an example of what kind of things I do across Realms currently, notably the issue above is an example of how stating to just use a side table isn't really clear since there are 2 different but common side table use cases.
const privileged = {writeFile() {/*...*/}}
const unprivileged = {writeFile() {/*...*/}}
const privilegedRefs = new WeakMap();
const unprivilegedRefs = new WeakMap();
const logicalKey = Object.freeze(Object.create(null));
privilegedRefs.set(logicalKey, privileged);
unprivilegedRefs.set(logicalKey, unprivileged);
const record = #{
api: logicalKey,
deep: {nested: {api: logicalKey}}
};
runUnprivileged(record, privilegedRefs);
runPrivileged(record, unprivilegedRefs);
This kind of side table is done by replacing via identity, not location. It is good for avoiding accidental leakage of privileged APIs by guarding based upon identity of the capabilities. In the case above even though we only do 1 assignment to the table it covers both locations in the record.
const onclick = (e) => {event.preventDefault();}
const instanceARefs = new WeakMap();
const instanceBRefs = new WeakMap();
const logicalKeyA = Object.freeze(Object.create(null));
const logicalKeyB = Object.freeze(Object.create(null));
instanceARefs.set(logicalKeyA, onclick);
instanceARefs.set(logicalKeyB, onclick);
instanceBRefs.set(logicalKeyA, onclick);
instanceBRefs.set(logicalKeyB, onclick);
const template = #{
onclick: logicalKeyA,
deep: {nested: {onclick: logicalKeyB}}
};
In this case we actually are using the R/T as a templating system where we aren't trying to guard against all kinds of access to powerful APIs but instead are only having the mutable parts of the R/T replaced. In particular, this means that we don't want to replace all locations containing a specific side table value but want to replace specific locations in the object graph even if other parts of the object graph point to the same side table value. This causes the lookup keys to be based upon location rather than identity.
This one gets particularly nasty if you nest R/T (some pseudocode here):
function makeElem({ children = #[], onclick = () => {} } = {}) {
// this could be done via several abstractions
onclick = getLocationKeyFor(makeElem, 'onclick');
return #{
onclick,
children
};
}
makeElem(
{
onclick: () => { console.log(a); }
children: #[ makeElem({ onclick: console.log(b); } ],
}
)
So instead you have to nest side tables to deal w/ this as the intent is to emulate mutable instances with hardened parts of the object graph and now you basically are re-creating object semantics in user space. This can be useful since you can re-use the R/T without allocations for the hardened parts, but it isn't easy.
Can this be closed now that https://github.com/tc39/proposal-record-tuple/pull/277 is merged?
Follow on to #197
We would like to make this change so this ticket will track completion