[context] suggestion: only keeping weak references to subscribers in providers

webcomponents-cg / community-protocols

Cross-component coordination protocols

179 stars 12 forks source link

[context] suggestion: only keeping weak references to subscribers in providers #64

Closed jsebrech closed 1 month ago

jsebrech commented 1 month ago

The context provider implementation needs to track subscribers, and a natural solution to do so is a Map. The lit/context implementation for example uses a Map. However, this expects all requesters to be on their best behavior and always call unsubscribe on disconnect. It is very likely that many implementations will not do so, and this is likely to lead to memory leaks.

The current context spec only mentions a risk of memory leaks in this paragraph, but this is not sufficient to guard against the above scenario:

The provider MUST NOT retain a reference to the callback nor pass an unsubscribe callback if the context-request event's subscribe property is not truthy. Doing so may cause a memory leak as the consumer may not ever call the unsubscribe callback.

Suggestion: add implementation advice to the context spec that providers use only weak references to track subscribers (e.g. WeakMap) so that all subscribers can be garbage collected even if they never explicitly call the unsubscribe function. In short, subscribing to a context should not prevent garbage collection if that is the only reference to an element or its callback.

Westbrook commented 1 month ago

As a bit of background, it looks like weak maps were discussed in someway via https://github.com/webcomponents-cg/community-protocols/issues/21 and https://github.com/webcomponents-cg/community-protocols/issues/25. Is there anything there directly refuting the benefits of WeakMaps in this context?

If not, the change seems reasonable, though I can't speak to specifically why that might not have been done previously. Would you be available for a PR the targeted change could be reviewed more directly?

justinfagnani commented 1 month ago

WeakMap isn't sufficient to avoid the memory leaks because we need to iterate over the keys of the map, which WeakMap doesn't support.

WeakRefs would work, however. Providers wrap the callback in a WeakRef, and if the element that produced the callback is GC'able, the callback will be too. WeakRefs just weren't widely available at the time the protocol was drafted.

I think we can suggest that WeakRefs are used by providers, but I'm not sure if we can require it or not. For protocol adherence I still think it's good to say that a provider must not hold a reference to a non-subscribed context request, and that a consumer must call the disposal callback when it's cleaned up. That ensures that a consumer doesn't assume that a provider is using a WeakRef, not call the dispose callback, and thus cause a memory leak for a provider not using WeakRefs.

justinfagnani commented 1 month ago

Oh, and we had a big discussion about WeakRef in the Lit implementation a while back because it was impossible to build the ContextRoot correctly without leaking without WeakRef. We went ahead and used it there because ContextRoot is optional, and we could have a different browser support matrix for the context package vs the core lit packages.

WeakRef is widely enough supported now that we would have no problem using it for all of context in our next major version.

jsebrech commented 1 month ago

I agree that requiring consumers keep up their end of the bargain is the right position, but I also think that providers are more likely to respect the spec than consumers. A typical web developer will use someone's else library for creating a context, and then copy/paste snippets for dispatching the context requests, never seeing the spec. A typical library author will seek out the spec and adhere to whatever it requires and recommends.

I don't see an issue with using the language SHOULD instead of MUST for weak references in providers, because that leaves room for adhering to spec without keeping weak references if there's a good reason, but also likely ensures most libraries will be forgiving for consumer user error by using weak references.

I'd be happy to submit a PR. Are there any rules around that or do I just send one in?

The proposed language change in the providers section:

The provider SHOULD use only weak references to retain the callback. Doing this avoids memory leaks for consumers that never call the provided unsubscribe callback.

justinfagnani commented 1 month ago

A typical web developer will use someone's else library for creating a context, and then copy/paste snippets for dispatching the context requests, never seeing the spec. A typical library author will seek out the spec and adhere to whatever it requires and recommends.

A bit of an aside, but: I don't think the necessarily holds. In the Lit ecosystem, there's one library that can provide and consume contexts, so users rarely, if ever, deal with the protocol manually. In other places I don't see why it's more likely that a provider use a library than a consumer. Or that it's more likely that a provider is a library author and the consumer isn't. Plenty of use cases have some kind of component that must have things provided to it via contexts, so the user is being the provider.

I think I'd be ok with some non-normative text, like: "In order to be more robust against non-compliant consumers that don't call the subscription disposer, it's recommended that provider use WeakRefs to reference subscription callbacks."

But I still think it should be clear that these consumers wouldn't be implementing the protocol correctly.