w3c / webtransport

WebTransport is a web API for flexible data transport
https://w3c.github.io/webtransport/

Allow more aggressive garbage collection #560

Closed jan-ivar closed 7 months ago

jan-ivar commented 11 months ago

A WebTransport object is entangled with its remote server: per the spec, it is kept alive while its [[State]] is either "connecting" or "connected".

This is because readables shouldn't go away unless their source goes away. This matches the JS situation, where someone has to keep a reference to the controller in order to enqueue data, and that reference keeps the whole stream alive. When the source is remote, it's only when the network connection ends that no more input can arrive and the readables can be collected.

But this leaks objects and network connections when JS drops all references to them.

Both WebSocket and WebRTC do better, by requiring things to have both sources AND sinks, allowing for more aggressive garbage collection.

Specifically, WebSocket looks at event handlers as intents to consume. IOW, if there are no listeners for open, message, error, or close, then input cannot be observed anyway, and things can be collected sooner.

The closest thing to intent to consume I find in WHATWG streams seems to be locking a readable.

Proposal:

Specify that if there are no locks on any of the WebTransport's readables, then input cannot be observed, and readables may be collected even while a remote is "connecting" or "connected".

Of course the only time this matters is when there are no JS references to the WebTransport or any of its member objects directly holding things alive.
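A minimal sketch of the proposed condition, written as a model rather than as spec text or any engine's implementation. `mayCollect` and the shape of the `transport` record (including `hasJsReferences`) are hypothetical names made up for illustration; the only real API used is the `locked` getter on `ReadableStream`.

```javascript
// Hypothetical model, not spec text and not how any user agent works.
// `transport` is a plain record describing what the user agent would know:
//   state           - the WebTransport's [[State]] value
//   hasJsReferences - whether script still references the transport
//                     or any of its member objects
//   readables       - the ReadableStream objects belonging to the transport
function mayCollect(transport) {
  // A live JS reference always keeps things alive.
  if (transport.hasJsReferences) return false;

  // Once the connection is over, no more input can arrive.
  const live =
    transport.state === "connecting" || transport.state === "connected";
  if (!live) return true;

  // Proposed relaxation: with no locked readable, input cannot be observed.
  // `locked` is true while a reader or a pipe holds the stream.
  return !transport.readables.some((readable) => readable.locked);
}
```

Under this model, calling `getReader()` on any one of the readables makes `mayCollect` return false for a connected transport until `releaseLock()` is called.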

jan-ivar commented 10 months ago

@ricea, @nidhijaju, @saschanaz does this sound reasonable?

saschanaz commented 10 months ago

Some early thoughts:

  1. This seems to require some way to observe locking/unlocking of readable streams, correct?
  2. Forgetting to unlock any open stream will still prevent GC, but maybe the inactivity will trigger some timeout at the server side and eventually close the stream?

jan-ivar commented 10 months ago

Garbage collection is largely unspecified and (mostly) unobservable by design, so it seems fair to leave its details to user agents. E.g. WebSocket doesn't specify how the user agent knows whether event listeners are registered.

> Forgetting to unlock any open stream will still prevent GC, but maybe the inactivity will trigger some timeout at the server side and eventually close the stream?

We can always iterate and go further here, and any implementation is free to experiment with more aggressive GC, but there are probably diminishing returns.

Today, simply creating WebTransports in a for-loop leaks resources, which I think most web developers would find surprising, and the proposal here addresses that.
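The for-loop case might look roughly like this. `createAndDrop` is a hypothetical helper and the URL in the comment is a placeholder; the constructor is passed in as a parameter so the sketch does not assume a browser environment.

```javascript
// Hypothetical helper: constructs `count` transports and keeps no reference
// to any of them.
function createAndDrop(Transport, url, count) {
  for (let i = 0; i < count; i++) {
    // The new object is unreachable from script as soon as this statement
    // ends, yet under the current rule it stays alive for as long as it is
    // "connecting" or "connected".
    new Transport(url);
  }
}

// In a browser one would pass the real constructor (placeholder URL):
// createAndDrop(WebTransport, "https://example.com:4433/", 100);
```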

If someone locks the streams and never unlocks them, that seems similar to forgetting to unregister an event handler, so I think we have parity with WebSocket.

saschanaz commented 10 months ago

> Garbage collection is largely unspecified and (mostly) unobservable by design, so it seems fair to leave its details to user agents. E.g. WebSocket doesn't specify how the user agent knows whether event listeners are registered.

Sure, but the condition to allow garbage collection is what you want to specify here, isn't it?

saschanaz commented 10 months ago

Oh hmm, WebSocket says "it must not be GCed if ..." rather than "it can be GCed if ...". I guess it's fine then.