Closed palak8669 closed 2 hours ago
This issue was mentioned in WEBRTCWG-2023-09-12 (Page 48)
As discussed during yesterday's editor's call, here is some feedback for this issue.
This 1 pc restriction is similar to the restriction to preserve the ordering of chunks. It prevents potential footguns. It removes edge cases. It helps browser interoperability. It is consistent with the WebRTC media pipeline and encoded transform model. As it is currently, WebRTC encoded transform is nothing but an encoder post-processor or a decoder pre-processor. No encoder/decoder -> no processor.
For instance, if a website tries to write too much data, the UA will drop frames before the encoder to enforce bandwidth allocation. The transform will not receive any new chunk until the sender gets back an allocated bandwidth slot. Another example is that the web app cannot send data through the sender transform of a recvonly transceiver. Ditto on the receive side, where the track may be muted. These are good properties.
Removing this restriction would change the model of WebRTC encoded transform, and more generally of the WebRTC media pipeline. We should favour proposals that can implement the use cases by staying consistent with the existing model we agreed on.
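To make the restriction under discussion concrete, here is a minimal sketch of the behaviour described above. It is a simplified model, not the actual Web API: `ModelFrame`, `ModelTransformSink`, and the `owner`/`counter` fields are hypothetical names, and the two checks are an assumed simplification of the "same pipeline" and "preserve chunk ordering" rules.

```typescript
// Simplified model only; none of these names exist in the real Web API.
interface ModelFrame {
  owner: string;   // which sender/receiver pipeline produced the frame
  counter: number; // increases with each frame produced by that pipeline
  payload: string;
}

class ModelTransformSink {
  private owner: string;
  private lastCounter = -1;
  accepted: ModelFrame[] = [];

  constructor(owner: string) {
    this.owner = owner;
  }

  // Returns true if the frame is accepted, false if it is dropped.
  write(frame: ModelFrame): boolean {
    // Assumed rule 1: frames must come from this sink's own pipeline.
    if (frame.owner !== this.owner) return false;
    // Assumed rule 2: frames must keep their original order (no replay).
    if (frame.counter <= this.lastCounter) return false;
    this.lastCounter = frame.counter;
    this.accepted.push(frame);
    return true;
  }
}

const exampleSink = new ModelTransformSink("senderA");
exampleSink.write({ owner: "senderA", counter: 0, payload: "f0" }); // accepted
exampleSink.write({ owner: "senderB", counter: 1, payload: "x" });  // dropped: other pipeline
exampleSink.write({ owner: "senderA", counter: 0, payload: "f0" }); // dropped: replayed frame
```

In this model, "late fanout" corresponds to the second call: a frame produced by one pipeline is written into another, which the restriction rejects.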
AIUI, the use cases can be implemented by new APIs, located elsewhere than WebRTC encoded transform. Two examples come to mind: sender.enableMediaForwarding({ ... }) or sender.replaceTrack(track, { forwarding: true, ... }). That API could be complemented by progressively adding knobs to control the exact forwarding behaviour, as we discover more needs from applications. This is in line with the current RTCPeerConnection model we all like. The above proposals can lead to good interop and good user experience (minimum latency, minimum CPU overhead...).
On the other hand, those restrictions prevent solving the "late fanout" use case, and other use cases that can't be solved under those restrictions using this API. I believe (based on implementation experience) that a frame-level API is easier to use and harder to make mistakes in than a packet-level API for many of those use cases, and that making such an API available to users is a benefit for the Web platform.
One possibility that may satisfy both our constituencies is to redefine the API; if people are willing to live with the restriction of "one input, one output", they can use the ScriptTransform API; if people desire to go outside those restrictions, they use a different means of instantiating the input and output streams of frames that are the essential parts of this API for those purposes.
> if people desire to go outside those restrictions, they use a different means of instantiating the input and output streams of frames that are the essential parts of this API for those purposes.
I thought about this approach a while ago. It is cleaner in the sense that it clearly states to the UA that (focusing on the sender side) the JS application is responsible for implementing the source+encoder part (which late fanout is clearly about). As an example, it will help the UA provide meaningful WebRTC stats. Setting a track on such a sender could throw...
Exposing such an API requires us to refine the WebRTC media pipeline model, as we would expose things that are fully internal right now. This might not be unrelated to the TPAC Media/WebRTC WG discussion meeting I missed.
Related to this approach, my first questions would be:
To refine the WebRTC media pipeline model, there are a number of questions that might be useful to tackle, for instance:
Hope this helps moving forward.
> I thought about this approach a while ago. It is cleaner in the sense that it clearly states to the UA that (focusing on the sender side) the JS application is responsible for implementing the source+encoder part (which late fanout is clearly about). As an example, it will help the UA provide meaningful WebRTC stats. Setting a track on such a sender could throw...
Can this be addressed by passing an extra optional parameter to the RTCRtpScriptTransform constructor?
> Exposing such an API requires us to refine the WebRTC media pipeline model, as we would expose things that are fully internal right now. This might not be unrelated to the TPAC Media/WebRTC WG discussion meeting I missed.
Harald's congestion control proposal is in line with this. https://github.com/w3c/webrtc-encoded-transform/pull/207 Also, depending on the use case this might or might not be an issue. There is a lot of value in optionally lifting the restriction to support valid use cases.
> Related to this approach, my first questions would be:
The frame level vs. packet level question. It would be great to see whether we can get consensus within the WG. For instance, if we have a packet level API, do we also need a frame level API? Do we need both? @jan-ivar's question: should fanout be done in JS or done by UA with web app tuning it via knobs?
Some advantages of the frame-level approach:
A packet-level API, depending on how it's made, can have the advantage of allowing forwarding to start without waiting for the whole frame, and is able to deal with packet loss directly, but has other disadvantages for this use case, such as:
WDYT?
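The latency trade-off mentioned above (a packet-level API can start forwarding before the whole frame has arrived) can be illustrated with a small model. This is only a sketch under stated assumptions: `ModelPacket`, `FrameLevelForwarder`, and `PacketLevelForwarder` are hypothetical names, packets are assumed to arrive without duplicates, and no real RTP parsing or loss handling is attempted.

```typescript
// Hypothetical model; not a real RTP or WebRTC API.
interface ModelPacket {
  frameId: number;
  index: number;    // position of the packet within its frame
  isLast: boolean;  // true for the final packet of the frame
  payload: string;
}

// Frame-level: buffer packets and forward only once the frame is complete.
class FrameLevelForwarder {
  private pending: Record<number, ModelPacket[]> = {};
  forwarded: string[] = [];

  onPacket(p: ModelPacket): void {
    const parts = this.pending[p.frameId] || [];
    parts.push(p);
    this.pending[p.frameId] = parts;

    // The frame is complete when the last packet has been seen and the
    // number of buffered packets matches its index (assumes no duplicates).
    let lastIndex = -1;
    for (const part of parts) {
      if (part.isLast) lastIndex = part.index;
    }
    if (lastIndex >= 0 && parts.length === lastIndex + 1) {
      parts.sort((a, b) => a.index - b.index);
      this.forwarded.push(parts.map((x) => x.payload).join(""));
      delete this.pending[p.frameId];
    }
  }
}

// Packet-level: forward each packet as soon as it arrives.
class PacketLevelForwarder {
  forwarded: string[] = [];

  onPacket(p: ModelPacket): void {
    this.forwarded.push(p.payload);
  }
}
```

In this model the frame-level forwarder emits nothing until every packet of a frame is buffered, but its consumer never sees partial or reordered data; the packet-level forwarder emits immediately and leaves ordering and loss handling to the application.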
Status: This PR has been languishing for 3 months now. The addition of a pure "sendonly" API is under consideration, but seems to require more time (#211), and the discussion so far has not shown a compelling difference from the interface that is effectively created if you don't attach a sending track to a transceiver.
The use case for forwarding has been accepted. No use case where the restriction serves a positive purpose (enabling a use case) has been described; the only argument put forward is that it gives the opportunity to fire an error when people try unexpected things.
We should consider again whether we should remove the restriction from the spec.
> The addition of a pure "sendonly" API is under consideration
It seems this API got some 'room consensus' at the last WG meeting. I'll be happy to help move this forward.
> No use case where the restriction serves a positive purpose (enabling a use case) has been described
A number of reasons for this restriction have been mentioned in this thread.
> We should consider again whether we should remove the restriction from the spec.
I am not sure discussing this again will be as fruitful as making progress on the pure "send only" API.
No longer pursuing. We will focus on the RTCRtpEncodedSource proposal.