tc39 / proposal-symbols-as-weakmap-keys

Permit Symbols as keys in WeakMaps, entries in WeakSets and WeakRefs, and registered in FinalizationRegistries
http://tc39.es/proposal-symbols-as-weakmap-keys
MIT License
90 stars 7 forks source link

Definition of liveness needs to be updated #19

Open bakkot opened 3 years ago

bakkot commented 3 years ago

This proposal allows having a WeakRef to a Symbol, but the definition of liveness used by WeakRefs only talks about sets of objects. It needs to be updated to handle symbols as well.

leobalter commented 3 years ago

Thanks, @bakkot! I agree with this.

For clarification, is this something I would prefer to be addressed in this repo's draft before Stage 3 or before Stage 4 with the PR for ECMA-262?

I don't mind either, but we surely have this and some small editorial bits in ECMA-262 to account for this change.

We have a draft PR (in progress) and somehow IMO this work feels more valuable being added there rather than in this repo's draft.

bakkot commented 3 years ago

I'd like it addressed before stage 3, ideally. But I'm fine with saying that the PR is the canonical spec text for the purposes of stage 3, if you want to make the change in only one place.

leobalter commented 3 years ago

For now https://github.com/tc39/ecma262/compare/master...leobalter:symbol-as-weakmap-key (the last commit there)

leobalter commented 3 years ago

20 includes the same changes present in the PR to be reflected in the Repo's draft.

mhofman commented 3 years ago

My understanding is that presence of a symbol in the GlobalSymbolRegistry alone doesn't cause the symbol to be kept alive, but instead it's caused by the usage of that symbol in a weak structure (WeakMap key or WeakRef/FinalizationRegistry). That means the liveness observability of a registered symbol is not permanent but tied to the liveness of the weak structure that holds it.

Let me clarify. Today an implementation is free to collect a symbol in the GlobalSymbolRegistry if there are no uses of that symbol in the user's code, because the user would be unable to observe that a future new symbol identity created by Symbol.for does not have the same identity as the previous symbol. Adding symbols as weakmap keys would allow a user to make that observation:

let wm = new WeakMap();
let s1 = Symbol.for('foo');
wm.set(s1, 8);
s1 = null;
// say a full gc happens here
let s2 = Symbol.for('foo');
console.log(wm.get(s2));

(credits @erights for the example)

The implementation can no longer collect the s1 identity during the full gc. However once the wm WeakMap is also no longer reachable from user code, there is no longer a way for the user to make any observation, and the implementation is once again free to collect the registered symbol.

Obviously the liveness of the symbol can also be directly observed through WeakRef/FinalizationRegistry if allowed as target, but only while these WeakRef/FinalizationRegistry are themselves reachable.

From an implementation perspective, I believe what it means is that adding registered symbols as a value in an internal WeakMap keyed on those weak structures (WeakMap, WeakSet, WeakRef, FinalizationRegistry Cell) would be sufficient to keep the ability to collect registered symbols using existing mechanisms without making it observable (I assume the GlobalSymbolRegistry is implemented as a sort of WeakValueMap).

Pseudo JavaScript equivalent:

const registeredSymbolsInUse = new WeakMap();

class WeakMap {
  constructor(...) {
    registeredSymbolsInUse.set(this, new Set());
    ...
  }

  set(key, value) {
    if (typeof key === 'symbol' && Symbol.keyFor(key) !== undefined) {
      registeredSymbolsInUse.get(this).add(key);
    }
    ...
  }

  delete(key) {
    if (typeof key === 'symbol' && Symbol.keyFor(key) !== undefined) {
      registeredSymbolsInUse.get(this).delete(key);
    }
    ...
  }
}

class WeakRef {
  constructor(target) {
    if (typeof target === 'symbol' && Symbol.keyFor(target) !== undefined) {
      registeredSymbolsInUse.set(this, target);
    }
    ...
  }
}

class FinalizationRegistry {
  register(target, heldValue, unregisterToken) {
    const cell = {target, heldValue, unregisterToken};
    if (typeof target === 'symbol' && Symbol.keyFor(target) !== undefined) {
      registeredSymbolsInUse.set(cell, target);
    }
    this.#cells.add(cell);
  }

  unregister(unregisterToken) {
    for (const cell of this.#cells) {
      if (cell.unregisterToken !== unregisterToken) continue;
      this.#cells.delete(cell);
      if (typeof cell.target === 'symbol' && Symbol.keyFor(cell.target) !== undefined) {
        registeredSymbolsInUse.delete(cell);
      }
    }
  }
}

cc @syg @waldemarhorwat

nicolo-ribaudo commented 2 years ago

@mhofman If the difference between "registered symbol is never collected" and "registered symbol is collected only if it's not referenced neither strongly nor weakly" observable in any way?

ExE-Boss commented 2 years ago

Is the difference between "registered symbol is never collected" and "registered symbol is collected only if it's not referenced neither strongly nor weakly" observable in any way?

As far as I understand it, the difference is entirely unobservable.

mhofman commented 2 years ago

Correct, there is no observable difference. That's the point I was trying to make. From what I remember, one concern I heard at the May 2021 plenary was that allowing registered symbols as weak keys meant they would no longer be collectible. My clarification above was that only the registered symbols that are actively used as Weakmap key / WeakRef or FinalizationRegistry target (aka observed) would be prevented from collection. Now of course this is still a concern as the program usually keeps that observer around, leaking memory.

erights commented 2 years ago

From the phrase "Weakmap key / WeakRef or FinalizationRegistry target (aka observed)" I'm concerned that this may be misunderstood. Even without WeakRefs or FinalizationRegistry, merely using them as a WeakMap key or putting them in a WeakSet makes their collection observable:

const wm = new WeakMap();
let k = Symbol.for('k');
wm.set(k, 'hello');
k = null;
// gc possibly happens. Don't even need a turn boundary
// Nothing holds onto k except wm, which holds it "weakly"
// If that means it could collect the association, then:
console.log(wm.has(Symbol.for('k') ? 'k kept' : 'k dropped');
caridy commented 2 years ago

@nicolo-ribaudo yes, it is observable. @erights examples is pretty clear! As discussed during the last meeting, a while back we arrived to a conclusion (or should I say a common ground) that this behavior would be very bizarre and unexpected for some. An argument in favor of just adding any new complexity for registered symbols, even if that implies never collecting them or they value they are referencing to in a weakmap. At least that's what I remember about that conversation :)

mhofman commented 2 years ago

For what it's worth, I'm now of the opinion that registered symbols should not be allowed as weak key/target, on the ground that records/tuples that do not contain symbols will similarly not be usable as weak key/target, while having the same typeof value. That means a predicate not based on typeof is necessary to test weak usage anyway, so might as well not introduce leaks or complexity.

ljharb commented 2 years ago

I continue to hold my position that all types of symbols must be accepted or rejected consistently. I'd prefer Records and Tuples not have this kind of inconsistent behavior, but I'm willing to entertain it because R&T are a new thing and thus would have never not been weird in this way.

Jack-Works commented 2 years ago

I continue to hold my position that all types of symbols must be accepted or rejected consistently.

I agree. There is no harm in allowing registered symbols. Memory leaking is not a big problem because the developers choose to do it. They can also leak memory in many different ways.

mhofman commented 2 years ago

R&T are a new thing and thus would have never not been weird in this way.

But Symbol usage as weak key is also new, so putting restrictions from the beginning on the kind of symbols would not be weird.

I'd go as far as claim there is (weak) precedent already for this with null which has typeof null === 'object' yet isn't allowed as a weak key/target.

There is no harm in allowing registered symbols. Memory leaking is not a big problem because the developers choose to do it.

Memory leaks is a big problem, and we shouldn't allow something that willfully leaks. If the developer wants to use registered symbols in a WeakMap, they can build their own abstraction with a Map. Using a registered symbol as a weak key/target has no legitimate use cases as by definition those cannot be weak (they are forgeable values). By the same reasoning we should allow all records and tuples? Then why not allow strings and numbers as well?

My point is, let's not consider that typeof is the right test for weak key/target usage. If we're concerned about ergonomics, let's introduce a standard predicate to test this.

ljharb commented 2 years ago

That null's typeof lies does not set any precedent for anything.

We already allow things that willfully leak, all the time. Not everyone who uses weak collections is concerned with collecting the memory held by every value.

I have use cases for this - specifically, I want to write code that is maximally weak, and I do not actually care if any arbitrary key is collectible or not.

typeof is not the right test, obviously, since you'd have to do x && (typeof x === 'object || typeof x === 'function'). The current "right test" is Object(x) === x. If we want a better test, we'd need to make a predicate.

The test being used is irrelevant here, though, because "forgeability" isn't something developers do, or should almost ever need to, think about when using symbols. Sometimes you want a coordination point that's easily obtainable; sometimes you want one you have to explicitly hand out.

bakkot commented 2 years ago

This seems to have gotten off topic - this issue is for an editorial issue with the definition of liveness. If you want to debate whether / how Symbol.for symbols are allowed, open a new issue (or pick up the existing discussion in https://github.com/tc39/ecma262/issues/1194).

acutmore commented 2 years ago

There is a new updated preview of the full changes this proposal...proposes - including updating the definition of liveness

PR: https://github.com/tc39/ecma262/pull/2777 Rendered preview: https://arai-a.github.io/ecma262-compare/?pr=2777