webrtc-rs / webrtc

A pure Rust implementation of WebRTC
https://webrtc.rs
Apache License 2.0
4.04k stars 358 forks source link

Differing Codecs in offer m-lines cause the answer to be malformed #357

Open KillingSpark opened 1 year ago

KillingSpark commented 1 year ago

It's me again doing weird stuff with this crate :) This is possibly related to #308

The most minimal example I could come up with is an offer with two m-lines, one with only codec VP9 and one with only codec VP8: (full SDPs attached at the end)

...
m=video 20509 RTP/AVPF 121
a=mid:1
a=rtpmap:121 VP9/90000
a=recvonly
...
m=video 20509 RTP/AVPF 96 97
a=mid:p1_1_2
a=rtpmap:96 VP8/90000
a=sendonly
...

If I set this as a remote offer on a peer connection it produces these transceiver states: Both of these have (only) the codec video/VP9 selected

Transceiver state
    1:
        Recv RTCRtpParameters { header_extensions: [RTCRtpHeaderExtensionParameters { uri: "http://www.ietf.org/id/draft-holmer-rmcat-transport-wide-cc-extensions-01", id: 7 }, RTCRtpHeaderExtensionParameters { uri: "urn:ietf:params:rtp-hdrext:sdes:mid", id: 4 }], codecs: [RTCRtpCodecParameters { capability: RTCRtpCodecCapability { mime_type: "video/VP9", clock_rate: 90000, channels: 0, sdp_fmtp_line: "max-fs=12288;max-fr=60", rtcp_feedback: [RTCPFeedback { typ: "nack", parameter: "" }, RTCPFeedback { typ: "nack", parameter: "pli" }, RTCPFeedback { typ: "ccm", parameter: "fir" }, RTCPFeedback { typ: "goog-remb", parameter: "" }] }, payload_type: 121, stats_id: "" }] }
    p1_1_2:
        Recv RTCRtpParameters { header_extensions: [RTCRtpHeaderExtensionParameters { uri: "http://www.ietf.org/id/draft-holmer-rmcat-transport-wide-cc-extensions-01", id: 7 }, RTCRtpHeaderExtensionParameters { uri: "urn:ietf:params:rtp-hdrext:sdes:mid", id: 4 }], codecs: [RTCRtpCodecParameters { capability: RTCRtpCodecCapability { mime_type: "video/VP9", clock_rate: 90000, channels: 0, sdp_fmtp_line: "max-fs=12288;max-fr=60", rtcp_feedback: [RTCPFeedback { typ: "nack", parameter: "" }, RTCPFeedback { typ: "nack", parameter: "pli" }, RTCPFeedback { typ: "ccm", parameter: "fir" }, RTCPFeedback { typ: "goog-remb", parameter: "" }] }, payload_type: 121, stats_id: "" }] }

I didn't add a track on the peer connection in this minimal example but when I encountered it, the RTCRtpSender also had the VP9 codec selected.

The answer produced is this:

...
m=video 9 UDP/TLS/RTP/SAVPF 121
a=mid:1
a=rtpmap:121 VP9/90000
a=sendonly
...
m=video 9 UDP/TLS/RTP/SAVPF 121
a=mid:p1_1_2
a=rtpmap:121 VP9/90000
a=recvonly
...

If a track (and by that a RTCRtpSender) exists on MID 1 this causes an error when setting the answer because of incompatible codecs. In my minimal example I did not add a track and then this doesn't even return an error.

Details

Offer SDP

v=0
o=- 1669535888844 1669535888846 IN IP4 10.2.0.77
s=CROWN
c=IN IP4 10.2.0.77
t=0 0
a=group:BUNDLE 0 1 p1_1_2
a=msid-semantic: WMS -
a=extmap-allow-mixed
a=fingerprint:sha-256 4B:44:B2:F4:C7:6B:F5:76:BD:D2:E6:2D:52:97:40:2B:30:44:B3:55:5D:F1:92:4C:C9:51:33:A9:A2:25:53:79
a=ice-ufrag:5ttuxMt6GE0=
a=ice-pwd:h0QyQvBAcDFpcfG7jxYwM5I1ymHIonkjDTQltftlgF8=
a=ice-options:trickle
a=ice-lite
m=video 20509 RTP/AVPF 121
a=mid:1
a=extmap:4 urn:ietf:params:rtp-hdrext:sdes:mid
a=extmap:2 http://www.webrtc.org/experiments/rtp-hdrext/abs-send-time
a=extmap:5 urn:ietf:params:rtp-hdrext:toffset
a=extmap:7 http://www.ietf.org/id/draft-holmer-rmcat-transport-wide-cc-extensions-01
a=rtpmap:121 VP9/90000
a=fmtp:121 max-fs=12288;max-fr=60
a=rtcp-fb:121 nack
a=rtcp-fb:121 nack pli
a=rtcp-fb:121 ccm fir
a=rtcp-fb:121 goog-remb
a=recvonly
a=rtcp-rsize
a=setup:passive
a=candidate:1 1 TCP 2105524479 10.2.0.77 20509 typ host tcptype passive
a=end-of-candidates
a=rtcp-xr:voip-metrics stat-summary=loss,dup,jitt
a=rtcp-mux
m=video 20509 RTP/AVPF 96 97
a=mid:p1_1_2
a=extmap:4 urn:ietf:params:rtp-hdrext:sdes:mid
a=extmap:2 http://www.webrtc.org/experiments/rtp-hdrext/abs-send-time
a=extmap:13 urn:3gpp:video-orientation
a=extmap:5 http://www.webrtc.org/experiments/rtp-hdrext/playout-delay
a=rtpmap:96 VP8/90000
a=rtcp-fb:96 goog-remb
a=rtcp-fb:96 ccm fir
a=rtcp-fb:96 nack
a=rtcp-fb:96 nack pli
a=rtpmap:97 rtx/90000
a=fmtp:97 apt=96
a=msid:- 4dc9e5d5-4abe-405d-8a36-ad2e06951916
a=sendonly
a=rtcp-rsize
a=ssrc:4068221753 cname:c31f53b4-a9d3-4877-8857-acf85d7ac543
a=ssrc:1423026143 cname:c31f53b4-a9d3-4877-8857-acf85d7ac543
a=ssrc-group:FID 4068221753 1423026143
a=setup:passive
a=candidate:1 1 TCP 2105524479 10.2.0.77 20509 typ host tcptype passive
a=end-of-candidates
a=rtcp-xr:voip-metrics stat-summary=loss,dup,jitt
a=rtcp-mux
Answer SDP

v=0
o=- 5960997078167534231 980898000 IN IP4 0.0.0.0
s=-
t=0 0
a=fingerprint:sha-256 09:68:A4:67:4E:0D:42:08:F6:13:2A:9D:FF:61:B0:6E:CF:D2:32:6E:2E:4E:D2:FA:50:14:DA:34:A4:76:44:62
a=group:BUNDLE 1 p1_1_2
m=video 9 UDP/TLS/RTP/SAVPF 121
c=IN IP4 0.0.0.0
a=setup:active
a=mid:1
a=ice-ufrag:fdLrozfOiZUgYiSE
a=ice-pwd:mKpVgMAEhakBXYkPhzttstVGpgvKgoCq
a=rtcp-mux
a=rtcp-rsize
a=rtpmap:121 VP9/90000
a=fmtp:121 max-fs=12288;max-fr=60
a=rtcp-fb:121 nack 
a=rtcp-fb:121 nack pli
a=rtcp-fb:121 ccm fir
a=rtcp-fb:121 goog-remb 
a=extmap:7 http://www.ietf.org/id/draft-holmer-rmcat-transport-wide-cc-extensions-01
a=extmap:4 urn:ietf:params:rtp-hdrext:sdes:mid
a=sendonly
m=video 9 UDP/TLS/RTP/SAVPF 121
c=IN IP4 0.0.0.0
a=setup:active
a=mid:p1_1_2
a=ice-ufrag:fdLrozfOiZUgYiSE
a=ice-pwd:mKpVgMAEhakBXYkPhzttstVGpgvKgoCq
a=rtcp-mux
a=rtcp-rsize
a=rtpmap:121 VP9/90000
a=fmtp:121 max-fs=12288;max-fr=60
a=rtcp-fb:121 nack 
a=rtcp-fb:121 nack pli
a=rtcp-fb:121 ccm fir
a=rtcp-fb:121 goog-remb 
a=extmap:7 http://www.ietf.org/id/draft-holmer-rmcat-transport-wide-cc-extensions-01
a=extmap:4 urn:ietf:params:rtp-hdrext:sdes:mid
a=recvonly
Wicpar commented 1 year ago

do you think #358 could be related to this ?

KillingSpark commented 1 year ago

I'd doubt it. #358 seems to be about a re-ordering of the media sections, which I had never had issues with

k0nserv commented 1 year ago

Hmm this does seem like a bug. I wonder if the #377 might fix it.

KillingSpark commented 1 year ago

I looked into this a bit more and I think it has to do with how the media engine deals with update_from_remote_description

Excerpt: (the comments are mine)

        for media in &desc.media_descriptions {
            let typ = if !self.negotiated_audio.load(Ordering::SeqCst)
                && media.media_name.media.to_lowercase() == "audio"
            {
                self.negotiated_audio.store(true, Ordering::SeqCst);  // <---- Makes it so that only one media description influences the media engines audio codecs
                RTPCodecType::Audio
            } else if !self.negotiated_video.load(Ordering::SeqCst)
                && media.media_name.media.to_lowercase() == "video"
            {
                self.negotiated_video.store(true, Ordering::SeqCst); // <---- Makes it so that only one media description influences the media engines video codecs
                RTPCodecType::Video
            } else {
                continue;
            };

If I understand correctly only the first media section of a type (video/audio) influences which codecs will be added to the media engines negotiated_{audio,video}_codecs list. This has at least two consequences I encountered:

  1. If two media sections of the same type have differing codecs, the first one wins, and the other ones might even produce an error
  2. If a PrAnswer was set before with differing codecs, the real answer doesn't have a chance to set the actual codecs

Questions that I have no clear answer to yet

  1. Are these checks for an already happened negotiation even necessary?
  2. Related to 1.) Is there any harm in adding negotiated codecs piece by piece?
  3. Is it allowed for PrAnswer and Answer to differ in codecs? (I'd hope so, but I'll need to look it up)

I think if these checks could just be dropped and every mediasection gets its chance to add the necessary codecs to the negotiated codecs this could be solved. I'll try that out with a local copy tomorrow.

Edit: After thinking about this more this seems like a conceptual problem. Should the media engine even have these fields?

pub struct MediaEngine {
    ...
    pub(crate) negotiated_video_codecs: Mutex<Vec<RTCRtpCodecParameters>>,
    pub(crate) negotiated_audio_codecs: Mutex<Vec<RTCRtpCodecParameters>>,
    ...
}

In my understanding these need to be tracked for each media section separately because each media section is allowed to negotiate different codecs.