w3c / ortc

ORTC Community Group specification repository (see W3C WebRTC for official standards track)
http://www.w3.org/community/ortc/

RtpSender.send() should just allow a single codec #439

Closed ibc closed 8 years ago

ibc commented 8 years ago

Let's begin with how things are done in SDP O/A land:

To summarize, SDP O/A provides codec capabilities negotiation + implicit codec selection.

On the other side, ORTC does not define a specific negotiation mechanism. Both parties are supposed to exchange their capabilities somehow. Let's assume such an exchange is done. After that:

A rationale for these questions may be the fact that ORTC defines RTX, FEC, etc. also as codecs, so multiple "codecs" in RtpParameters make sense. But due to this design, we end up with cross-referencing structs of codecs and encodings that allow "anything" (or the same thing in multiple ways).

aboba commented 8 years ago

Currently, both the WebRTC 1.0 API object model and ORTC allow an RtpSender to specify more than one codec, with the primary codec being the one used to send.

In WebRTC 1.0, this enables the list of codecs to be reordered so as to enable a change of sending codec, without having to initiate an O/A exchange:

```js
// After every call to
// setLocalDescription and
// setRemoteDescription:
var p = sender.getParameters();
p.codecs = reorderCodecs(p.codecs);
sender.setParameters(p);
```

In ORTC, the equivalent functionality is provided:

```js
var sender = new RTCRtpSender(...);
// Only once, but choose the PTs
var codecs = reorderCodecs(
  sender.getCapabilities().codecs);
sender.send({codecs: codecs, ...});
```

Currently, the ORTC spec is not clear enough about the importance of codec ordering in sender.send() (e.g. that the primary codec is the one used to send). Some examples may help.
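For instance, a minimal sketch of such an example (assuming the static form of getCapabilities(), an existing `sender`, and that the first entry in `codecs` is the one actually used to send; mapping capability entries into RTCRtpCodecParameters is elided):

```js
// Hypothetical sketch: move VP9 to the front of the codec list so it
// becomes the sending codec. A real app would map capability entries
// (with preferredPayloadType) into RTCRtpCodecParameters first.
var codecs = RTCRtpSender.getCapabilities("video").codecs.slice();
var i = codecs.findIndex(function (c) { return c.name === "VP9"; });
if (i > 0) {
  codecs.unshift(codecs.splice(i, 1)[0]); // now codecs[0] is VP9
}
sender.send({ codecs: codecs, encodings: [{ active: true }] });
```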

Also, with respect to using different codecs for each encoding, it appears that the specification could use some clarification.

There are two potential use cases that come to mind:

  1. Using one codec for a base layer encoding (e.g. H.264/AVC) and another one for extension layers (e.g. H.264/SVC).
  2. Simulcasting streams using different codecs (e.g. one simulcast stream using VP9, another one using VP8).

Use case 1 is specific to H.264. This is not something that Edge currently supports (e.g. the H.264/SVC implementation does utilize an H.264/AVC-compliant base layer, but only "H.264UC" needs to be configured to enable that).

There is an example in the RID draft (see Section 11.1 of https://tools.ietf.org/html/draft-ietf-mmusic-rid) that appears to support both use cases 1 and 2:

“In this scenario, the offerer supports the Opus, G.722, G.711 and DTMF audio codecs, and VP8, VP9, H.264 (CBP/CHP, mode 0/1), H.264-SVC (SCBP/SCHP) and H.265 (MP/M10P) for video. An 8-way video call (to a mixer) is supported (send 1 and receive 7 video streams) by offering 7 video media sections (1 sendrecv at max resolution and 6 recvonly at smaller resolutions), all bundled on the same port, using 3 different resolutions.”

The example demonstrates negotiation between the VP8, VP9, H.264/AVC and H.264/SVC (!!) codecs. However, I am not sure to what extent browsers actually plan to support this.
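For concreteness, here is a hypothetical sketch of what use case 2 could look like as ORTC send parameters (the payload type values and the resolutionScale usage are my assumptions, not spec text):

```js
// Two simulcast encodings, each pinned to a different codec via
// codecPayloadType. All numeric values here are invented.
sender.send({
  codecs: [
    { name: "VP9", payloadType: 98, clockRate: 90000 },
    { name: "VP8", payloadType: 100, clockRate: 90000 }
  ],
  encodings: [
    { codecPayloadType: 98 },                        // full-resolution VP9
    { codecPayloadType: 100, resolutionScale: 4.0 }  // low-resolution VP8
  ]
});
```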

murillo128 commented 8 years ago

IMHO we should avoid having two ways of doing the same thing within the API.

I understand the need for having several encodings for the same codec within an RTPSender; what I don't understand is why several codecs should be allowed in one RTPSender rather than several RTPSenders, each with one codec.

Moreover, nothing in the specs restricts the usage of two unrelated codecs (let's say VP8 and H264) simultaneously on the same RTPSender.

I don't feel that the 11.1 scenario is relevant to the current issue, as it only sends 1 stream. Even if you reverse the SDP, each receive track has a different source, so it would require a different RTPSender anyway.

aboba commented 8 years ago

@murillo128 The ORTC specification is not clear about what it means to allow several unrelated codecs simultaneously on the same RtpSender. IMHO, in WebRTC 1.0 the intent seems to be that only the primary codec is to be sent. IMHO, that would be the most logical interpretation for ORTC as well.

Also, we could use clarification about sending multiple codecs in different encodings. It is not clear to me that this is prohibited in WebRTC 1.0 (or in ORTC).

robin-raymond commented 8 years ago

You can have several codecs active within the same sender when simulcasting with multiple encodings. Otherwise, only one codec is in use, i.e. the codec chosen via the payload type of the first relevant codec in the list. Regardless, multiple codecs are currently needed to define FEC/RTX codec properties, or to define things like CN, DTMF, etc.
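As a purely illustrative sketch of that last point (payload type values invented; field names follow my reading of the ORTC dictionaries):

```js
// Even with a single sending codec (VP8), the codecs list carries
// entries for the RTX and FEC "codecs" that the encoding references.
sender.send({
  codecs: [
    { name: "VP8",    payloadType: 96, clockRate: 90000 },
    { name: "rtx",    payloadType: 97, clockRate: 90000,
      parameters: { apt: 96 } },                  // retransmission for PT 96
    { name: "red",    payloadType: 98, clockRate: 90000 },
    { name: "ulpfec", payloadType: 99, clockRate: 90000 }
  ],
  encodings: [{
    codecPayloadType: 96,
    rtx: { payloadType: 97 },
    fec: { mechanism: "red+ulpfec" }
  }]
});
```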

murillo128 commented 8 years ago

What is the benefit of allowing several codecs in one RTPSender vs having several RTPSenders, each with one codec, and simplifying the API?

On simulcasting/SVC you would still be able to have several encodings for a single codec.

@robin-raymond:

Regardless, multiple codecs are currently needed to define FEC/RTX codec properties, or to define things like CN, DTMF, etc.

I was taking into consideration that FEC/RTX, etc. would be removed from the "codecs"; if not, obviously, this is not viable.

robin-raymond commented 8 years ago

As the sender and receiver currently consume the same parameters structure, the receiver must absolutely have multiple encodings for the simulcast scenario to function properly. This allows for rapid switching between received streams. Obviously, an argument could be made to make them specific to sender/receiver, but this change would certainly mandate that happen.
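A hypothetical sketch of that receiver shape (SSRC values invented):

```js
// One RtpReceiver primed with every simulcast SSRC, so the engine can
// switch among whichever stream the SFU forwards without re-wiring.
receiver.receive({
  codecs: [{ name: "VP8", payloadType: 96, clockRate: 90000 }],
  encodings: [
    { ssrc: 1111, codecPayloadType: 96 },  // high resolution
    { ssrc: 2222, codecPayloadType: 96 },  // medium resolution
    { ssrc: 3333, codecPayloadType: 96 }   // low resolution
  ]
});
```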

As for simulcast on the sender, the advantage is when they share the same properties, e.g. the MuxID. This creates an implicit understanding that this is the same stream encoded multiple ways. It's not as strong a use case as the receiver one, but it is allowed.

At this time, those pseudo codecs are required in the codec list. I'd like to see a more "encodings"-centric approach in the future than the codec-centric approach we have right now, but I think that we are too far into this model to make that kind of change now, especially in the context of trying to remain 1.0 compatible. I think we should work on that model after the first official ORTC spec release. At that point we should review all the encodings/parameters/codecs to weigh different models and pick the most sane and reasonable one long term.

There are also consequences to an encodings-centric model vs the current payload/codec-specific model that we must carefully think through. Specifically, the routing rules get more complex, because there is no shared understanding of a codec for a receiver when simulcasting with RTX and FEC involved, and that can create some interesting challenges.

It's not so simple unfortunately...

murillo128 commented 8 years ago

One possible use case for "multiple simultaneous" sending codecs could be a multiconference with low-end devices.

You would ask all high-end devices to send VP9 with several layers "at full power", plus one VP8 stream at much lower resolution/bandwidth. Then the SFU will send VP9 to the high-end devices and VP8 to the low-end ones.

In this case you could have either one RTPSender with both VP8 and VP9 (with multiple layers), or one RTPSender with VP8 and another RTPSender with VP9 with multiple layers.
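A hypothetical sketch of the single-sender variant (the encodingId/dependencyEncodingIds usage and every value here are assumptions):

```js
// Layered VP9 "at full power" plus a low-resolution VP8 encoding for
// low-end receivers behind the SFU. All identifiers are invented.
sender.send({
  codecs: [
    { name: "VP9", payloadType: 98, clockRate: 90000 },
    { name: "VP8", payloadType: 100, clockRate: 90000 }
  ],
  encodings: [
    { encodingId: "v9base", codecPayloadType: 98 },
    { encodingId: "v9ext",  codecPayloadType: 98,
      dependencyEncodingIds: ["v9base"] },        // VP9 extension layer
    { encodingId: "v8low",  codecPayloadType: 100,
      resolutionScale: 4.0, maxBitrate: 150000 }  // VP8 fallback
  ]
});
```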

The only case that could not be covered is H.264/SVC + H.264/AVC simulcast, with the H.264/AVC encoding dependent on the H.264/SVC one.

murillo128 commented 8 years ago

@robin-raymond

As the sender and receiver currently consume the same parameters structure, the receiver must absolutely have multiple encodings for the simulcast scenario to function properly.

#440 ;)

At that point we should review all the encodings/parameters/codecs to weigh different models and pick the most sane and reasonable one long term.

From my experience in SDOs, if we start with one model, we will stick to it in future versions because of "backward compatibility". So, IMHO, if we need to address these topics, it would be better to do it sooner rather than later.

aboba commented 8 years ago

@robin-raymond In WebRTC 1.0, simulcast only needs to be supported on the RtpSender. The assumption is that the SFU modifies the SSRC and PT fields so that the RtpReceiver only receives a single RTP stream whose properties may change depending on what stream the SFU has chosen to forward to it.

Instead of splicing together multiple input streams into a single stream sent to the RtpReceiver, the SFU could instead send multiple streams to the RtpReceiver. Since packets can be reordered, it is possible that the streams will be intermingled, so that the RtpReceiver would need to be able to reassemble the multiple streams and feed them to the decoder in the correct order. This can be a non-trivial task. For SVC in MRST mode, the streams either have to support DON encapsulation or RFC 6051. The specification does not state what features are required to support receiving of simulcast, but it probably should.
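Under the single-stream model described above, the receive side stays simple; a sketch, assuming the same illustrative API shapes as in the earlier snippets:

```js
// The SFU rewrites SSRC and PT, so the receiver is configured with a
// single encoding and never has to reassemble multiple streams.
receiver.receive({
  codecs: [{ name: "VP8", payloadType: 96, clockRate: 90000 }],
  encodings: [{ ssrc: 5555, codecPayloadType: 96 }]  // the one forwarded stream
});
```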

aboba commented 8 years ago

@murillo128 I would suggest that ORTC start off by supporting the WebRTC 1.0 simulcast model, which involves sending multiple streams, but receiving only one. This can support not only simulcast, but also spatial simulcast combined with temporal scalability (VP8), and VP9/SVC scenarios with both temporal and spatial scalability. It should also be possible to support H.264 simulcast with this limited model.

About the only thing that you won't be able to do with the simulcast Sender/non-simulcast Receiver model is to support H.264/SVC (as Edge does). But if simulcast reception is only needed for a single codec, one might argue that it is not a "must have".

ibc commented 8 years ago

I'd like to see a more "encodings"-centric approach in the future than the codec-centric approach we have right now, but I think that we are too far into this model to make that kind of change now, especially in the context of trying to remain 1.0 compatible. I think we should work on that model after the first official ORTC spec release.

That won't happen. Once ORTC 1.0 is published and Edge implements it, there won't be any chance for such an important change, and we'll have to deal with the current jsonified SDP forever and ever. Seriously.

murillo128 commented 8 years ago

@aboba I am perfectly happy with only allowing receiving one simulcast-stream.

ibc commented 8 years ago

I am perfectly happy with only allowing receiving one simulcast-stream.

Me too. But then we should wonder why we have an API that allows passing several streams/encodings/ssrcs to the same RtpReceiver.

robin-raymond commented 8 years ago

Me too. But then we should wonder why we have an API that allows passing several streams/encodings/ssrcs to the same RtpReceiver.

Early on we discussed having multiple RtpReceivers versus a single RtpReceiver for simulcasting, where each simulcast stream would be a separate RtpReceiver, but that argument lost in the CG meeting because:

  1. each RtpReceiver would emit its own MediaStreamTrack;
  2. each track would have to be manually switched by the programmer to the render element based upon the active RtpReceiver stream in the simulcast;
  3. the CG felt this was an unneeded burden on the developer, since the only "right thing to do" would be to auto-switch between those simulcasted streams, which is best done by the engine;
  4. this allows early wiring of the render element from the MediaStreamTrack no matter which simulcast stream happened to be active;
  5. if a developer really wanted manual switching, they can always wire up separate RtpReceiver streams for each simulcast and manually switch between tracks;
  6. Encodings could be auto-filled based upon whatever happened to be the active simulcasted stream.

That's my summary, but that's how it went down, to answer your "why"...

robin-raymond commented 8 years ago

The current design requires multiple codecs, and even if we remove RTX and FEC as codecs, we still need multiple codecs. I also don't want to turn this into a bigger "proposal" bug (which should be an issue of its own). So I'm going to flag this as "won't fix", since the current design requires multiple codecs and there is thus nothing to fix.

aboba commented 8 years ago

This represents a change to the WebRTC 1.0 object model, and therefore would need to be proposed and accepted by the WEBRTC WG.

robin-raymond commented 8 years ago

I'm going to close this issue because it requires a different proposal issue for the 1.0 WG, and as of right now it's "wontfix" or "as designed".