Closed brad4d closed 2 years ago
The proposal will not advance with that restriction; either all symbols or none can be WeakMap keys.
I also don't see the issue here. A registered symbol can, in general, be "freed" when it's unobservable. However, by putting it in a WeakMap, it can never be freed until that WeakMap is - a registered symbol that is weakly held remains "live" as long as the thing holding it is live. Why is this a problem?
the GlobalSymbolRegistry entry defined in the spec must keep the Symbol.for() keys alive forever, making them effectively the same as well-known symbols that are stored on global objects. This is incorrect.
In other words, if you s/alive forever/observably alive forever
, it seems correct to me, and allowing them to be WeakMap keys doesn't change this - it just adds a new way for them their liveness to be observable.
I also don't see the issue here. A registered symbol can, in general, be "freed" when it's unobservable. However, by putting it in a WeakMap, it can never be freed until that WeakMap is - a registered symbol that is weakly held remains "live" as long as the thing holding it is live. Why is this a problem?
The point of a WeakMap
is things used as keys in it are not supposed to count toward keeping them alive.
Isn't it a contradiction, then, for a symbol to be kept alive because it is used as a WeakMap
key?
The proposal will not advance with that restriction; either all symbols or none can be WeakMap keys.
This is a pretty strong stance. I would hope there was still a way to arrive at some kind of consensus here.
The key misunderstanding many of us have had revolves around the assumption that the
GlobalSymbolRegistry
entry defined in the spec must keep theSymbol.for()
keys alive forever
As @ljharb mentions, the key bit is not that GlobalSymbolRegistry
exists in the spec, rather that the registered symbol is observed through a WeakMap
. As I explained in https://github.com/tc39/proposal-symbols-as-weakmap-keys/issues/19#issuecomment-848368824, the implication is that a registered symbol cannot be collected for as long as anything observing its liveness is itself live. Some believe that this could be a common mistake by programs that would not test for the type of symbols if not prevented by the language, and would result in preventable memory leaks, similar if we allowed string or numbers to be used as weakmap keys.
I personally see no reason to allow forgeable values as weak keys/targets, besides allowing lazy programs to shoot themselves in the foot.
@brad4d no, because a registry symbol can never observably be freed, ever, already. The spec fiction is there to ensure that. In other words, it would only be in an implementation that attempted to walk the tightrope between freeing them, and keeping that unobservable, that this new difficulty would emerge.
Btw, there is another reason why registered symbols being disallowed as weak keys would make sense: implementation complexity.
Since GlobalSymbolRegistry
is a spec fiction, an implementation can currently trivially implement registered symbols as a sort of composite value wrapping the underlying string or number description, and compare those symbols by value instead of by unique identity like non-registered symbols. Symbol.for
then simply forges a new value from the description, and Symbol.keyFor
simply unwraps the underlying description value.
If allowing registered symbols as WeakMap keys, such an implementation would likely be forced to change their implementation and required to generate a unique identity for registered symbols, akin to forcing ropes.
Thanks @mhofman for that succinct example.
This is in fact the point WH was trying to make, I think. Implementations like that one are supposed to be possible for efficiency reasons.
They truly put symbols created with Symbol.for
on the level with numbers, strings, booleans, etc.
You can never really "clean up" these values, because the exact same value could always be constructed again later.
This is why they are disallowed as WeakMap
keys.
Allowing non-registered Symbols is OK because they can only ever be created once, so they can be safely cleaned up when no non-WeakMap-key references to them remain.
I don't understand why such an implementation would need to change everything, as opposed to making a special carve-out for the subset of registered symbols that are weakly held.
Separately, do any such implementations exist that could speak to these difficulties? If not, then if the spec was designed to allow a kind of implementation, but in practice there's no concrete evidence that there's any benefit to doing so, why preserve it?
I don't understand why such an implementation would need to change everything, as opposed to making a special carve-out for the subset of registered symbols that are weakly held.
By special carve-out, I suppose you mean branch on the type of symbol when creating a WeakRef or WeakMap entry? That feels like significant complexity added everywhere registered symbols could be used.
Separately, do any such implementations exist that could speak to these difficulties?
I'm not sure, but @erights mentioned that's how he would have expected engines to implement registered symbols.
@mhofman if we went with your suggestion of differentiating symbols between registered/well-known and "other", they'd already have to do that - so that complexity seems to be acceptable.
They would only have to do that when adding the entry to the weakmap / creating the weak ref, and throw a TypeError. Here you're talking about having a parallel Map / Ref implementation for registered symbols keys/targets.
The proposal will not advance with that restriction; either all symbols or none can be WeakMap keys
I think disallowing registered symbols is a small wart that allows for a much more valuable use case to work, and so we shouldn't block just because of that. Once implementers get actual experience with how complex symbol GC becomes, we can try reopening to allow registered symbols later on.
@jridgewell that's an opinion i don't share.
That feels like significant complexity added everywhere registered symbols could be used.
Additional methods that would need* to branch for registered symbols if they were allowed:
The branch being to switch the logic to the separate internal Strong{Set,Map} used for registered symbols.
Not being a member of a JS engine implementation team, I will leave the measure of added complexity to others.
* In contrast: If registered symbols are not allowed, the above methods would not necessarily need to branch specifically for registered symbols. The existing code path for handling symbols could be used as the result would be the same. Though implementations may still add code to specially handle registered symbols as an optimisation (early return).
** EDIT: an implementation could store the registered symbols in the usual weakMap's internal structure. But also store hold registered-symbol strongly to avoid any optimisation that would otherwise collect it. This means that the has
and get
methods would not need to branch for registered-symbols and could look them up in the same way as other values.
https://gist.github.com/acutmore/8f16a8d4cf4b7b54c5956bd6eab05cfd
Isn't it a contradiction, then, for a symbol to be kept alive because it is used as a
WeakMap
key
Perhaps the developer experience (in the case of allowing registered symbols) could be improved with clear warnings? For example on MDN and maybe even at runtime with implementations printing a console.warn
. To increase the probability that a JS developer would know that putting a registered symbol in a WeakMap could actually increase memory usage due to it disabling a memory saving optimisation that some engines employ.
( I'm not advocating a position, but trying to help find common ground )
Separately, do any such implementations exist that could speak to these difficulties @ljharb
Yes, Firefox (SpiderMonkey) collects symbols created by Symbol.for
if they are no longer observable. (At least from my testing just now in Firefox nightly)
I'd love to hear a FF contributor's take on the difficulty of knowing that registered Symbols that are weakly held remain observable while the weak container is.
I think the question was also, do any implementation currently implement registered symbols as a wrapped description value without giving them a unique identity (aka doesn't use any kind of map / hash table, and doesn't implement the spec fiction of the global registry).
That was the first half :-) the second half is speaking to the difficulties.
Taking a step back however, the implementation complexity, while important, shouldn't be the driving factor.
I still remain confused by the motivation to allow forgeable values in weak collections. What kind of program would want to legitimately do this? How are registered symbols, or tuples/records not containing symbols, any different from forgeable primitives like string and number? If a program does have a use case for forgeable values, why can't they implement the logic themselves combining a Map
and a WeakMap
?
One use case is that I want a maximally weak collection. I don't particularly care if any of the values are weakly or strongly held - i just want to weakly hold them when possible. The more things I can put in the WeakMap, the simpler my wrapper class can be, and the more things i can delegate to the implementation.
You're totally right that I can implement this already by wrapping a Map and a WeakMap - however, this means borrowed prototype methods fail (or, if i'm clever, one set of borrowed methods fails and the other works, depending on the key type), and this also means that even non-registered non-well-known Symbols will be strongly held.
In other words, I value the additional optimistic weakness that "all symbols can be weakly held" provides, and since GC is already nondeterministic, (short of specific implementation knowledge) nobody can rely on "anything that can be weakly held will be collected", so there's nothing lost by this approach.
Concerns about memory leaks are inevitable because the language spec already does not force collection. The language only discusses observability - and observability is all one can rely on. Collectibility is an emergent property an implementation may choose to provide, but it is not one that language practitioners may safely ever rely on.
One use case is that I want a maximally weak collection. I don't particularly care if any of the values are weakly or strongly held - i just want to weakly hold them when possible
Does this apply to string and numbers as well? If not, what makes symbols different?
Also, what purpose would such a collection have in a program?
and this also means that even non-registered non-well-known Symbols will be strongly held.
How so? Why can't you put them in the WeakMap
section of the abstraction? I do agree a try
/catch
is cumbersome and why I think we should offer a built-in predicate to test "weak-hold-ability".
borrowed prototype methods fail
As a userland abstraction, why would it be necessary? Aka, why can't you use the prototype methods of your new abstraction.
Concerns about memory leaks are inevitable because the language spec already does not force collection. The language only discusses observability - and observability is all one can rely on. Collectibility is an emergent property an implementation may choose to provide, but it is not one that language practitioners may safely ever rely on.
I think we should be realistic, and build APIs for real world programs and machine. By your reasoning, WeakMap and WeakRef have no place in the language in the first place.
Sure, a program should not rely on garbage collection, but without garbage collection, programs would very likely run out of memory, because unlike what the spec pretends, machines do have a finite amount of memory.
Memory leaks are a very real issue in programs, and they're usually pretty obscure. While I wouldn't consider adding a forgeable value to a weak collection an obscure source of leak, I'm not sure the average developer would be able to realize the problem. Preventing the obvious leak and forcing them to find a workaround if that's the behavior they actually need is what I'm after here.
One use case is that I want a maximally weak collection. I don't particularly care if any of the values are weakly or strongly held - i just want to weakly hold them when possible.
I checked a few other languages (below) to see if they offered such a collection in their standard library and they did not seem to.
By my reasoning, WeakMap, WeakSet, and WeakRef's place in the language is to be optimistically weak - meaning, not providing guarantees, just conveying intentions to the engine.
I'm not sure the average developer would be able to realize the problem
The average developer won't be using either registered symbols or any Weak thing, both of which are incredibly obscure.
I checked a few other languages
To me, that's a potential argument against any symbols being allowed; it's not an argument against registered symbols being allowed.
I think this proposal is unrelated to "optimistically weak" collections: I agree that it would be nice to have it in the language (I had to implement it by myself multiple times), but it's not just about registered symbols.
An OptimisticallyWeakMap
(or just relaxing WeakMap
's restrictions) should probably be presented as a separate proposal (either competing with this one, or as an extension), because it asks for much more than what this is asking (e.g. allowing numbers, strings, ...) and can have concerns that are different from the ones raised for this proposal (for example, "unexpected memory leaks" does not hold if it's a class with a new explicit name).
By my reasoning, WeakMap, WeakSet, and WeakRef's place in the language is to be optimistically weak - meaning, not providing guarantees, just conveying intentions to the engine.
I think what you're looking for is a new type of collection, which would accept any primitive, including strings and numbers. I don't see why registered symbols should be held to a different standard.
To me, that's a potential argument against any symbols being allowed; it's not an argument against registered symbols being allowed.
But do other languages have a notion of symbols though? Aka non-object values which may be unique and unforgeable? (genuine question)
I think we're overindexing on my particular use case.
Regardless, I think creating a bifurcation in the category of "Symbols" is an unacceptably confusing thing to add to the language, and I remain unswayed by arguments pertaining to collectibility, which is something the language does not guarantee.
The bifurcation needs to happen somewhere. A program which wants to be responsible and not leak memory needs to be able to know what it can put in a WeakMap without causing a leak.
Requiring the program to use heuristics like calling Symbol.keyFor
is not future proof. It also would be extremely onerous when combined with records and tuples: imagine if a program had to walk every value held in those, and test if it's a symbol, but not a registered symbol.
I maintain that we need a predicate offered by the language to perform this test. That predicate needs to implement the same check that I'm advocating should prevent the addition to weak collection.
The use cases for this proposal are for patterns where unique symbols are created and then stored in a WeakMap. Not for taking symbols from an unknown source.
In what use cases do we think this concern about registered symbols would come up in practice?
As someone who has been in the stressful position of debugging a slow memory leak that is happening on clients where it’s not possible to get heap snapshots - I am sympathetic to an API that could catch a leak sooner. That said I’ve not come across code that creates an ever growing set of registered symbols. Code I’ve seen uses Symbol.for
to create a fixed number of symbols with static strings, usually to ensure compatibility across separate packages.
anything should be allowed as a WeakMap key
The use cases for this proposal are for patterns where unique symbols are created and then stored in a WeakMap. Not for taking symbols from an unknown source.
Why not then design a less orthogonal API, tailored specifically to that use case?
Introduce WeakMap.prototype.store
that mints a new symbol, uses it as a key to store the given object, and returns that symbol. Using a symbol as a key in WeakMap.prototype.set
would stay an error, unless the symbol is already present in the map. Analogously, introduce WeakSet.prototype.addSymbol
as well.
Hi @fstirlitz,
this has come up in some other threads (I’ll try and find links when not on my phone).
The issue that was raised with an API like that is that it would either be cross-realm and break boundaries like ShadowRealm
or it would be a per-realm store, and having per-realm state was seen as an issue as there is not currently global functions that have per-realm behavior.
EDIT: realised this is not the same as the other suggestions which had the functions on a global factory not on the instance.
@acutmore @rricard when covering the update form yesterday during the SES meeting today, something new popped with respect to this change. Basically, it is going to be very hard now to polyfill well-known symbols, considering that they are, in principle, very similar to registered symbols. I don't know if that was discussed before, but worth discussing it.
I'm not sure why it would; none of the polyfills I know of use the symbol registry, since having Symbol.keyFor(Symbol.whatever)
return "not undefined
" would be a much bigger problem than having a polyfilled well-known-symbol be single-realm.
I filed https://github.com/tc39/proposal-symbols-as-weakmap-keys/issues/27 to discuss this
I've just had a discussion with WH to get a clear picture of his objection to allowing symbols created with
Symbol.for()
asWeakMap
keys.To summarize, This is the use-case that must be disallowed.
Symbol.for('name')
gets used as aWeakMap
key.Symbol.for('name')
exist other than theWeakMap
key and the (virtual)GlobalSymbolRegistry
entry, which don't count. So, both of these are effectively freed.Symbol.for('name')
is called again and the result used to attempt lookup in the sameWeakMap
, finding nothing.The key misunderstanding many of us have had revolves around the assumption that the
GlobalSymbolRegistry
entry defined in the spec must keep theSymbol.for()
keys alive forever, making them effectively the same as well-known symbols that are stored on global objects. This is incorrect.In our conversation WH explained it like this.
So, I believe this proposal & its spec text need to be updated to disallow use of registered symbols as
WeakMap
keys.