ietf-wg-avtcore / draft-ietf-avtcore-hevc-webrtc

Issues with receive-only codecs #22

Open aboba opened 9 months ago

aboba commented 9 months ago

Problems have been found in both the specification and the implementation of the WebRTC API (setCodecPreferences) when receive-only codecs are present. These problems originally surfaced with H.264, but also affect H.265.

The concern is whether an implementation that only supports receiving H.265 (e.g. Chrome) will successfully negotiate HEVC reception with an implementation that can both send and receive (e.g. Safari Tech Preview).

References:
https://github.com/w3c/webrtc-pc/issues/2936
https://github.com/w3c/webrtc-pc/issues/2933
https://github.com/w3c/webrtc-pc/pull/2935
https://github.com/w3c/webrtc-pc/issues/2888

aboba commented 9 months ago

Question: Can we use the setCodecPreferences() API to cause the following SDP O/A negotiations to happen with the desired results? Assume we have Safari Tech Preview, which can send and receive H.265, and Chrome, which can receive but not send H.265.

aboba commented 9 months ago

draft-uberti-rtcweb-rfc8829bis Section 4.2.6 says:

The setCodecPreferences method sets the codec preferences of a transceiver, which in turn affect the presence and order of codecs of the associated "m=" section on future calls to createOffer and createAnswer. Note that setCodecPreferences does not directly affect which codec the implementation decides to send. It only affects which codecs the implementation indicates that it prefers to receive, via the offer or answer. Even when a codec is excluded by setCodecPreferences, it still may be used to send until the next offer/answer exchange discards it. The codec preferences of an RtpTransceiver can cause codecs to be excluded by subsequent calls to createOffer and createAnswer, in which case the corresponding media formats in the associated "m=" section will be excluded. The codec preferences cannot add media formats that would otherwise not be present. The codec preferences of an RtpTransceiver can also determine the order of codecs in subsequent calls to createOffer and createAnswer, in which case the order of the media formats in the associated "m=" section will follow the specified preferences.
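As a rough illustration of the quoted text, here is a minimal sketch of how codec preferences can exclude and reorder formats but not add them. The function name `applyCodecPreferences` and the use of plain codec-name strings are assumptions made for the example; this models the JSEP description and is not the browser's actual algorithm.

```typescript
// Hypothetical model of the JSEP text quoted above, not the browser's algorithm.
// Codecs are plain names here rather than RTCRtpCodec dictionaries.
function applyCodecPreferences(available: string[], preferences: string[]): string[] {
  // No preferences set: the "m=" section keeps its default formats and order.
  if (preferences.length === 0) {
    return available.slice();
  }
  // Preferences can exclude and reorder formats, but cannot add formats
  // that would otherwise not be present.
  return preferences.filter((codec) => available.indexOf(codec) !== -1);
}

// Illustrative values only.
const defaultFormats = ["VP8", "H264", "H265"];
const preferredFormats = applyCodecPreferences(defaultFormats, ["H265", "H264", "AV1"]);
console.log(preferredFormats); // H265 first, VP8 excluded, AV1 not added
```

In a browser, the corresponding list of codec objects (for example, filtered from `RTCRtpReceiver.getCapabilities("video")`) would be passed to the transceiver's `setCodecPreferences()` before calling `createOffer` or `createAnswer`.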

fippo commented 9 months ago

Case 1 should work as intended: the answer will signal H.265. Case 2 will not work out of the box, as recvonly codecs are not offered on a sendrecv transceiver.

I do suspect we need the "add additional codecs in the answer" that JSEP talks about...

aboba commented 9 months ago

@fippo In Case 1, given a Safari TP Offer with H.265/H.264, will a Chrome Answer include H.265 on a send/recv m-line?

aboba commented 9 months ago

RFC 3264 Section 5.1 says:

"For a sendonly stream, the offer SHOULD indicate those formats the offerer is willing to send for this stream. For a recvonly stream, the offer SHOULD indicate those formats the offerer is willing to receive for this stream. For a sendrecv stream, the offer SHOULD indicate those codecs that the offerer is willing to send and receive with."

This seems consistent with the idea that a sendonly m-line should only include codecs/profiles it can send; that a recv-only m-line should only include codecs/profiles that it can receive; and that a send/recv m-line should only include codecs/profiles that it can both send and receive.

That would imply that a send/recv m-line should not include codecs/profiles that can only be sent or can only be received.
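The reading of RFC 3264 Section 5.1 above can be sketched as a small selection function. The helper name `formatsForOffer`, the type `MLineDirection`, and the capability lists are hypothetical and chosen only to illustrate a receive-only H.265 implementation.

```typescript
// Hypothetical helper: which codecs belong in an offer for a given direction,
// following the reading of RFC 3264 Section 5.1 discussed above.
type MLineDirection = "sendonly" | "recvonly" | "sendrecv";

function formatsForOffer(
  direction: MLineDirection,
  canSend: string[],
  canReceive: string[]
): string[] {
  if (direction === "sendonly") {
    return canSend.slice();
  }
  if (direction === "recvonly") {
    return canReceive.slice();
  }
  // sendrecv: only codecs usable in both directions.
  return canSend.filter((codec) => canReceive.indexOf(codec) !== -1);
}

// Assumed capabilities for an implementation that can decode but not encode H.265.
const encoderCodecs = ["VP8", "H264"];
const decoderCodecs = ["VP8", "H264", "H265"];

console.log(formatsForOffer("sendrecv", encoderCodecs, decoderCodecs)); // H265 is left out
console.log(formatsForOffer("recvonly", encoderCodecs, decoderCodecs)); // H265 is included
```

Under this reading, the receive-only codec only ever appears on a recvonly m-line.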

cdh4u commented 9 months ago

> RFC 3264 Section 5.1 says:
>
> "For a sendonly stream, the offer SHOULD indicate those formats the offerer is willing to send for this stream. For a recvonly stream, the offer SHOULD indicate those formats the offerer is willing to receive for this stream. For a sendrecv stream, the offer SHOULD indicate those codecs that the offerer is willing to send and receive with."
>
> This seems consistent with the idea that a sendonly m-line should only include codecs/profiles it can send; that a recv-only m-line should only include codecs/profiles that it can receive; and that a send/recv m-line should only include codecs/profiles that it can both send and receive.
>
> That would imply that a send/recv m-line should not include codecs/profiles that can only be sent or can only be received.

Correct.

aboba commented 9 months ago

@cdh4u

JSEP has this note (https://rtcweb-wg.github.io/jsep/#rfc.section.4.2.6):

Note that setCodecPreferences does not directly affect which codec the implementation decides to send. It only affects which codecs the implementation indicates that it prefers to receive, via the offer or answer.

Based on the above, does this make sense to you for a send-only m-line?

cdh4u commented 9 months ago

> @cdh4u
>
> JSEP has this note (https://rtcweb-wg.github.io/jsep/#rfc.section.4.2.6):
>
> Note that setCodecPreferences does not directly affect which codec the implementation decides to send. It only affects which codecs the implementation indicates that it prefers to receive, via the offer or answer.
>
> Based on the above, does this make sense to you for a send-only m-line?

No, not based on the text in JSEP.

However, when I read the W3C spec, I can't find any text indicating that it would only be the preferred order of codecs for receiving.

The text says the following about setCodecPreferences:

"This method allows applications to disable the negotiation of specific codecs"

So, based on that, the method can also be used to control which codecs are put in the SDP to begin with, which I assume is something you may also want to do in the case of sendonly.

The W3C text also says:

"The codecs sequence passed into setCodecPreferences can only contain codecs that are returned by RTCRtpSender.getCapabilities(kind) or RTCRtpReceiver.getCapabilities(kind),"

...which sounds like you could use it also for sendonly.

And, there is a note saying:

"If set, the offerer's codec preferences will decide the order of the codecs in the offer."

aboba commented 9 months ago

@cdh4u Yes, the W3C text is fairly clear about the role of SCP, and this does not seem consistent with JSEP.

cdh4u commented 9 months ago

> @cdh4u Yes, the W3C text is fairly clear about the role of SCP, and this does not seem consistent with JSEP.

The main problem, however, does not seem to be related to SCP. The problem is that some codecs can only be either sent or received, and there is no way to indicate per-codec directions in SDP. The direction applies to the whole m-line.

You could use multiple m-lines, but that would probably cause issues, at least with non-browser endpoints. In the discussions I have seen some suggestions to use the rtpmap attribute, but that seems like a hack to me, and it would not work with non-browser endpoints either. Harald suggested a new mechanism that would allow the direction to be set per codec. That would of course not be backward compatible either, but it would probably be the "cleanest" solution.
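To illustrate why there is no per-codec direction today, the sketch below reads a sample m-section and shows that the direction is a single attribute shared by every rtpmap entry. The SDP fragment, its payload type numbers, and the helper name `describeSection` are made up for illustration.

```typescript
// Illustrative m-section; payload type numbers are arbitrary.
const sampleSection = [
  "m=video 9 UDP/TLS/RTP/SAVPF 96 98",
  "a=sendrecv",
  "a=rtpmap:96 H265/90000",
  "a=rtpmap:98 H264/90000",
].join("\r\n");

// Hypothetical helper: extracts the one direction attribute and the codec names.
function describeSection(section: string): { direction: string; codecs: string[] } {
  let direction = "sendrecv"; // SDP default when no direction attribute is present
  const codecs: string[] = [];
  for (const line of section.split(/\r?\n/)) {
    if (/^a=(sendrecv|sendonly|recvonly|inactive)$/.test(line)) {
      direction = line.slice(2); // strip the leading "a="
    } else if (/^a=rtpmap:\d+ [^\/]+\//.test(line)) {
      // An rtpmap line carries payload type, name and clock rate, but no direction.
      codecs.push(line.replace(/^a=rtpmap:\d+ ([^\/]+)\/.*$/, "$1"));
    }
  }
  return { direction, codecs };
}

const sectionInfo = describeSection(sampleSection);
console.log(sectionInfo.direction, sectionInfo.codecs); // one direction for all listed codecs
```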

fippo commented 9 months ago

Yes, sCP makes things worse because it leads to an API footgun where you create a sendonly m-line but ask for a receive-only codec as the only option, which leads to https://github.com/w3c/webrtc-pc/issues/2939

Using unidirectional m-lines avoids the issue, but for sendrecv it gets hard to understand the state/result just from the SDP in case 1. Let me elaborate a bit:

Offer from Safari TP with a send/recv m-line preferring H.265, then H.264

This is offering codecs that Safari (which supports encode/decode of both) can send and receive.

-- Answer from Chrome with send/recv m-line with H.265 and H.264

Chrome (assuming it can only decode H.265) can put H.265 into the answer, but since it cannot send H.265 it will only send H.264. This is fine since Safari TP said it can decode H.264. But Safari TP will not know that it is not going to receive H.265, which means it cannot do something like freeing the associated HW decoder resource.

I believe this is covered by https://www.rfc-editor.org/rfc/rfc3264#section-6.1

For streams marked as sendrecv in the answer, the "m=" line MUST contain at least one codec the answerer is willing to both send and receive, from amongst those listed in the offer. The stream MAY indicate additional media formats, not listed in the corresponding stream in the offer, that the answerer is willing to send or receive (of course, it will not be able to send them at this time, since it was not listed in the offer).
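A minimal sketch of the Case 1 answer, under the assumption stated above that the answerer can decode but not encode H.265. The names `EndpointCaps`, `buildSendRecvAnswer`, and `listedButNeverSent` are hypothetical; the point is that a format can appear in the answer even though the answerer will never send it.

```typescript
// Hypothetical capability description for an endpoint.
interface EndpointCaps {
  canSend: string[];
  canReceive: string[];
}

function buildSendRecvAnswer(
  offered: string[],
  answerer: EndpointCaps
): { formats: string[]; listedButNeverSent: string[] } {
  const supports = (list: string[], codec: string) => list.indexOf(codec) !== -1;
  // Keep the offered formats the answerer is able to receive, in offer order.
  const formats = offered.filter((codec) => supports(answerer.canReceive, codec));
  // RFC 3264 Section 6.1: at least one format must work in both directions.
  const bidirectional = formats.filter((codec) => supports(answerer.canSend, codec));
  if (bidirectional.length === 0) {
    throw new Error("no offered codec is usable in both directions");
  }
  // Formats that appear in the answer although the answerer cannot encode them.
  const listedButNeverSent = formats.filter((codec) => !supports(answerer.canSend, codec));
  return { formats, listedButNeverSent };
}

// Illustrative values: the offer prefers H.265, then H.264.
const case1Answer = buildSendRecvAnswer(["H265", "H264"], {
  canSend: ["VP8", "H264"],
  canReceive: ["VP8", "H264", "H265"],
});
console.log(case1Answer.formats, case1Answer.listedButNeverSent);
```

Nothing in the resulting format list tells the offerer that H.265 will not be sent; that information exists only on the answerer's side.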

For case 2, the problem is that Chrome (assuming it can only decode H.265) would not include H.265 in the sendrecv offer because of what is written in Section 5.1 of RFC 3264:

For a sendrecv stream, the offer SHOULD indicate those codecs that the offerer is willing to send and receive with.

However, there is a loophole (that is not currently supported by libWebRTC but that is an implementation issue): the answer may contain additional codecs (section 6.1)

The stream MAY indicate additional media formats, not listed in the corresponding stream in the offer, that the answerer is willing to receive

which is also supplemented by JSEP's https://www.rfc-editor.org/rfc/rfc8829.html#name-initial-answers

In either case, the media formats in the answer MUST include at least one format that is present in the offer but MAY include formats that are locally supported but not present in the offer, as mentioned in [RFC3264], Section 6.1

How widely supported that is outside of browsers...

In theory we are good... 🤞
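The rule quoted from RFC 3264 Section 6.1 and JSEP can be expressed as a simple check on an answer's format list. This is a sketch with a hypothetical helper name (`checkAnswerFormats`) and illustrative codec lists; it says nothing about whether a given implementation accepts such an answer.

```typescript
// Hypothetical check: an answer must share at least one format with the offer,
// and may carry additional formats that were not offered.
function checkAnswerFormats(
  offered: string[],
  answered: string[]
): { valid: boolean; additional: string[] } {
  const shared = answered.filter((codec) => offered.indexOf(codec) !== -1);
  const additional = answered.filter((codec) => offered.indexOf(codec) === -1);
  return { valid: shared.length > 0, additional };
}

// Illustrative Case 2 values: the sendrecv offer omits H.265, the answer adds it.
const case2Check = checkAnswerFormats(["H264", "VP8"], ["H264", "VP8", "H265"]);
console.log(case2Check.valid, case2Check.additional);

// An answer containing only formats absent from the offer fails the check.
const onlyAdditional = checkAnswerFormats(["H264", "VP8"], ["H265"]);
console.log(onlyAdditional.valid);
```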

fippo commented 9 months ago

But there is a minor headache related to sCP. JSEP says

Otherwise, each "m=" section in the answer MUST then be generated as specified in [RFC3264], Section 6.1. For the "m=" line itself, the following rules MUST be followed:

  • ...
  • ...
  • If codec preferences have been set for the associated transceiver, media formats MUST be generated in the corresponding order, regardless of what was offered, and MUST exclude any codecs not present in the codec preferences.
  • ... Any currently available media formats that are not present in the current remote description MUST be added after all existing formats
  • In either case, the media formats in the answer MUST include at least one format that is present in the offer but MAY include formats that are locally supported but not present in the offer

If sticking strictly to this, the answerer would not be able to prefer H.265 and put it as the first format?
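The tension can be made concrete by applying the two quoted bullets literally and comparing the results. Both functions below are hypothetical readings written for this example (`orderByPreferences`, `offeredFormatsFirst`), not text from JSEP, and the codec lists are illustrative.

```typescript
// Reading 1: codec preferences dictate order and membership, regardless of the offer.
function orderByPreferences(local: string[], preferences: string[]): string[] {
  return preferences.filter((codec) => local.indexOf(codec) !== -1);
}

// Reading 2: formats present in the offer come first, and any additional
// preferred formats are appended after them.
function offeredFormatsFirst(
  offered: string[],
  local: string[],
  preferences: string[]
): string[] {
  const allowed = orderByPreferences(local, preferences);
  const existing = offered.filter((codec) => allowed.indexOf(codec) !== -1);
  const additional = allowed.filter((codec) => offered.indexOf(codec) === -1);
  return existing.concat(additional);
}

// Illustrative values: the offer lacks H.265, the answerer prefers it.
const offerFormats = ["H264", "VP8"];
const localFormats = ["VP8", "H264", "H265"];
const answererPreferences = ["H265", "H264"];

console.log(orderByPreferences(localFormats, answererPreferences)); // H265 comes first
console.log(offeredFormatsFirst(offerFormats, localFormats, answererPreferences)); // H265 comes last
```

The two readings disagree on where H.265 ends up, which is the question raised above.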

aboba commented 9 months ago

@fippo To add to the headaches, there is the matter of profiles. In a send/recv offer, Safari TP can only include a profile/level that it can both send and receive. If there is asymmetry, separate send-only and recv-only m-lines are needed in the Offer to express the profiles/levels that can be sent and those that can be received.

juberti commented 8 months ago

I think both Case 1 and Case 2 should work. While it could be argued that since H.265 is recvonly for Chrome, it shouldn't be offered on its sendrecv transceiver, that seems overly strict. Chrome should offer both H.265 and H.264 and then just choose to send H.264 while receiving H.265, the same way it would if, for some reason, it determined it had insufficient compute to send H.265 and had to fall back to H.264.

Generally, I think that any situation where you get different behavior depending on who is playing the offerer role is going to lead to a suboptimal experience; we should try to avoid such non-deterministic situations.

cdh4u commented 8 months ago

> Chrome should offer both H.265 and H.264 and then just choose to send H.264 while receiving H.265,

That would work assuming that the answer also contains H.264.
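A small sketch of that condition: the sender picks the first format in the remote description that it can encode, and has nothing to send with if the remote side lists only H.265. The function name `pickSendCodec` and the capability list are assumptions made for illustration.

```typescript
// Hypothetical selection: first remote format the local side is able to encode.
function pickSendCodec(remoteFormats: string[], localCanSend: string[]): string | null {
  for (const codec of remoteFormats) {
    if (localCanSend.indexOf(codec) !== -1) {
      return codec;
    }
  }
  return null; // no usable send codec
}

// Assumed encoder capabilities for the receive-only H.265 implementation.
const localEncoders = ["VP8", "H264"];

console.log(pickSendCodec(["H265", "H264"], localEncoders)); // falls back to H264
console.log(pickSendCodec(["H265"], localEncoders)); // null: the answer lacks H264
```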

aboba commented 1 month ago

At TPAC 2024, the W3C WebRTC WG decided on "Proposal A", which is aligned with RFC 3264 Section 5.1.

Related: https://github.com/w3c/webrtc-pc/issues/3006