ricea opened this issue 3 years ago (Open).
@jan-ivar
The design document for the optimization is here.
Please see the "Detecting termination of the source realm" section in it for the abandoned idea.
@ricea is the request to relax spec behavior to make erroring the stream achievable, or allow it to outlive the source realm?
The latter seems undesirable for several reasons:
In the particular case we are concerned about (cameras etc), the underlying source has a strong link with the originating realm, and is stopped on context destruction, but runs on a different execution thread.
I read the behavior that we'll be observing with transferable streams from cameras/microphones opened in the originating realm (a Window) to be conformant with the current spec (stream ends when source ends). But this is not necessarily true for all usages of transferable streams.
Note: Non-native (shimmed) streams are, as far as I understand, impossible to make transferable, so the fact that their behavior doesn't match the spec for transferable streams doesn't matter.
Note: Non-native (shimmed) streams are, as far as I understand, impossible to make transferable
Sure, but it's trivial to transfer a TransformStream for all your JS needs. I phrased it poorly, I should have said:
The semantics here appear to be those of establishing a data tunnel. The tunnel closes when either side ceases to exist.
I don't have a strong position, other than if the optimization changes the semantics, it is no longer an optimization.
Happy to have either conversation if you can tell me which one it is.
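For concreteness, a minimal sketch of the "transfer a TransformStream to build a tunnel" pattern mentioned above; the file name worker.js and the message shape are invented for illustration, and a module script is assumed so top-level await works:

```js
// Main thread: create an identity TransformStream and send its readable end
// to a worker. Only the readable half is transferred.
const { readable, writable } = new TransformStream();
const worker = new Worker('worker.js');
worker.postMessage({ readable }, [readable]);

// Everything written here on the main thread...
const writer = writable.getWriter();
await writer.write('hello');
await writer.close();

// worker.js
onmessage = async ({ data: { readable } }) => {
  // ...comes out here. The "tunnel" is the cross-realm pipe the transfer
  // set up between the two realms.
  const reader = readable.getReader();
  for (let r = await reader.read(); !r.done; r = await reader.read()) {
    console.log('received', r.value);
  }
};
```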
I think "the tunnel closes when either side ceases to exist" is the semantics we want. The question is where the two sides live in the case where one is sourced from a non-main-thread source and the other one is transferred from main-thread to a worker. I'd argue that the "other end" of that tunnel lives in the source, not in the main-thread.
In the case where both ends are transferred into different contexts, it would seem a bit odd if the lifetime of the tunnel depends on the place of creation, which is no longer involved in the transfer.
(But as mentioned before, this concern doesn't apply to streams sourced off cameras; the camera will die when the owning context does, even if its stream generation is done off-thread.)
In the case where both ends are transferred into different contexts, it would seem a bit odd if the lifetime of the tunnel depends on the place of creation ...
I think we need to be careful to distinguish semantics from optimizations. It looks like only one end is semantically transferred in streams. That means that an unoptimized user agent would have no choice but to terminate the stream when the place of creation ceases to exist. Since optimizations are optional, I think it follows that any optimized user agent would be required to not break those semantics, for web compatibility.
The alternative would be to require all user agents to perform the optimization, at which point it is no longer an optimization, but a change in semantics, and we should specify that.
Note we're also looking at transferring the source directly in https://github.com/w3c/mediacapture-extensions/pull/21.
Note: Since this discussion is being referenced in an argument around the standardization of the Breakout Box API that Chrome has shipped, it would be good if people other than Jan-Ivar and I could chime in.
My position is still that the two ends of a stream that is surfaced as a ReadableStream need to be considered differently; the transfer of a ReadableStream transfers the consumer end; it does nothing to change where the origin of the stream is attached, and any attachment of the origin to the originating context should not be assumed by default, but specified by the specification of that particular source type.
In particular, transferring a stream of video frames from a native source such as a camera should not mandate any interaction between the originating context and the stream of video frames. (For cameras and microphones, there is a binding to the context where these streams originate by virtue of the fact that permission to use is bound to the originating context, so when that context goes away, the streams end. This is orthogonal to the question being raised here.)
Maybe the solution is simply to declare that a specific set of platform streams are "fully detachable" and standardise their behaviour explicitly? I was hoping to make optimisation transparent to developers, but it seems that might not be realistically achievable.
The spec says transferable streams are: "a special kind of identity transform which has the writable side in one realm and the readable side in another realm … to implement ... cross-realm transforms."
IOW only one side is transferred, to create tunnels between threads on purpose, NOT to solve streams being created on the wrong thread in the first place.
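A rough userland model of what the quoted spec text describes; this is not the spec's algorithm, just a hand-rolled sketch showing why the pipe, and therefore the tunnel's lifetime, stays tied to the realm that did the transferring:

```js
// Approximation of ReadableStream's transfer steps (sketch only).
function fakeTransferReadable(readable, worker) {
  const { port1, port2 } = new MessageChannel();

  // The transferring realm keeps an identity pipe running: it pulls from the
  // original readable and pushes chunks through its port...
  const crossRealmWritable = new WritableStream({
    write(chunk) { port1.postMessage({ chunk }); },
    close() { port1.postMessage({ done: true }); },
    abort(reason) { port1.postMessage({ error: String(reason) }); },
  });
  readable.pipeTo(crossRealmWritable).catch(() => {});

  // ...and only the far end of the tunnel travels. When this realm is
  // destroyed, the pipe above goes with it, which is the behaviour the
  // current spec text implies.
  worker.postMessage({ port: port2 }, [port2]);
}
```

The receiving realm then wraps its port in a new ReadableStream, which is roughly what the transfer-receiving steps do; nothing about the original underlying source moves.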
Maybe the solution is simply to declare that a specific set of platform streams are "fully detachable" and standardise their behaviour explicitly?
I think that's what it would take, since the semantics are different. But I wouldn't standardize a new concept here.
MSTP/MSTG is creating these streams on the wrong thread. An alternative API is being proposed in https://github.com/w3c/mediacapture-transform/issues/59 that instead takes advantage of transferable MediaStreamTrack, which would let us close this.
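For context, that alternative would look roughly like this, assuming transferable MediaStreamTrack as proposed there; worker.js is a placeholder and the exact API surface is still under discussion:

```js
// Main thread (module script assumed): capture a track and hand the track
// itself, not a stream, to the worker.
const media = await navigator.mediaDevices.getUserMedia({ video: true });
const [track] = media.getVideoTracks();
const worker = new Worker('worker.js');
worker.postMessage({ track }, [track]);

// worker.js: the processor, and therefore its ReadableStream, is created on
// the worker thread in the first place, so no stream transfer is needed.
onmessage = ({ data: { track } }) => {
  const processor = new MediaStreamTrackProcessor({ track });
  processor.readable.pipeTo(new WritableStream({
    write(frame) { /* process the VideoFrame */ frame.close(); },
  }));
};
```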
FWIW, I have modified the mediacapture-streams proposal to allow MSTP/MSTG to run on any thread - because it makes sense to do so for some scenarios.
Given that transferable streams exist, and have worked well in practice, I see no reason to mandate where the streams are created.
We could model a "native stream" as being a ReadableStream created in a separate realm, and immediately transferring it to the current realm. That way, we already have a "cross-realm transform readable" within the current realm, so we can transfer it again without worrying about whether the current realm will stay alive. Of course, this assumes we fix #1063 first... 😅

Something along the lines of: construct the ReadableStream in some (user-agent specific) Realm, then transfer it to the current realm.

This is still a bit vague though: we have to synchronously run some steps inside a different realm, and get the results in the current realm. That's because all of the internal promises from the pullAlgorithm, plus the pipe created by ReadableStream's transfer steps, must be able to outlive the current realm.
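One userland way to picture that model, using an ordinary worker to stand in for the user-agent-specific realm; native-source-worker.js and consumer-worker.js are invented for the sketch, and a module script is assumed:

```js
// native-source-worker.js: the "separate realm" that owns the real source.
// const readable = new ReadableStream({ pull(c) { /* produce chunks */ } });
// postMessage({ readable }, [readable]);

// Current realm: what script sees is already the received end of a
// cross-realm transform readable...
const sourceWorker = new Worker('native-source-worker.js');
const nativeReadable = await new Promise((resolve) => {
  sourceWorker.onmessage = ({ data }) => resolve(data.readable);
});

// ...so re-transferring it elsewhere is just another hop, and (once #1063 is
// sorted out) it shouldn't matter whether this intermediate realm survives.
const consumerWorker = new Worker('consumer-worker.js');
consumerWorker.postMessage({ readable: nativeReadable }, [nativeReadable]);
```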
We could make this more explicit by creating the cross-realm transform streams directly, rather than using the transfer and transfer-receiving steps. This requires a bunch more extra ceremony, though:

1. Create a new MessagePort in the current Realm.
2. Create a second MessagePort in the current Realm, entangled with the first.
3. Create a new ReadableStream in the current Realm.
4. Set it up as a cross-realm transform readable around the first port, as in ReadableStream's transfer-receiving steps. Yes, I know, we're doing things the wrong way around. 😛
5. Create a new WritableStream in the current Realm.
6. Set it up as a cross-realm transform writable around the second port, as in ReadableStream's transfer steps.
7. Hand out the ReadableStream in the current Realm.

The disadvantage with this approach is that these native streams must always live in some other realm, even if they are never transferred and only used from the current realm. This might defeat the benefits of the optimization... 😬
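A rough JS analogue of that ceremony; this is a userland approximation, not the spec's SetUpCrossRealmTransformReadable/Writable machinery:

```js
// Both halves of an identity "tunnel", created up front in the current realm.
const { port1, port2 } = new MessageChannel();

// Readable half: its behaviour depends only on port1, not on which realm
// happened to construct it.
const readable = new ReadableStream({
  start(controller) {
    port1.onmessage = ({ data }) => {
      if (data.done) controller.close();
      else if (data.error) controller.error(data.error);
      else controller.enqueue(data.chunk);
    };
  },
});

// Writable half: whatever the native source produces is pushed through port2.
const writable = new WritableStream({
  write(chunk) { port2.postMessage({ chunk }); },
  close() { port2.postMessage({ done: true }); },
  abort(reason) { port2.postMessage({ error: String(reason) }); },
});
```

The native source would then write into writable; the point of the exercise is that readable only holds a port, so it is not obviously tied to the realm that ran this setup.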
I wonder if we're overthinking this.....
There is an entity that has to "move along with the ReadableStream" to preserve sanity, containing at least the queue of outstanding items. Do we have to make that object (and, if I understand correctly, the promises that the pullAlgorithm creates) JavaScript-observable in any realm?
If they are not observable, we can let the algorithm specify that these things must have the same effect as if it was implemented with promises in some realm, and cover the "optimizations" by the generic language that says "whatever implementation that produces the same observable effect as this algorithm is correct".
We may be clawing back observability that previously existed, which is a compatibility issue.
Chrome implements an optimisation for native streams that effectively allows the underlying source to move to the new thread when they are transferred to a worker. This is a really good optimisation, but a strict reading of the standard doesn't permit it.
The problem is that the standard requires a transferred stream to stop working when the original source realm is destroyed. However, when Chrome optimises a native stream transfer it becomes completely detached from the original source realm and so it keeps working.
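To make the observable difference concrete, a hypothetical test; the iframe wiring and the choice of a fetch body as the native stream are invented for illustration:

```js
// Parent page: an iframe creates a native ReadableStream and transfers it
// here, then we destroy the iframe and keep reading.
const iframe = document.querySelector('iframe');
window.onmessage = async ({ data: { stream } }) => {
  iframe.remove();                 // destroy the realm that did the transfer
  const reader = stream.getReader();
  // Under a strict reading of the spec, chunks must stop flowing once the
  // iframe's realm is gone (how exactly that surfaces is #1063's territory);
  // with Chrome's optimised native streams, reads keep succeeding.
  console.log(await reader.read());
};

// Inside the iframe (sketch):
// const stream = (await fetch('/some-large-resource')).body;
// parent.postMessage({ stream }, '*', [stream]);
```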
We did have a prototype which errored the stream when the original context was destroyed, but it didn't actually match the observed behaviour of transferred streams, which is more complex. Precisely emulating the real behaviour does not seem achievable.
We'd like to relax the standard language in a way that makes Chrome's optimisation conformant.
This is related to #1063, but fixing that wouldn't fix this.