pion / webrtc

Pure Go implementation of the WebRTC API
https://pion.ly
MIT License
13.91k stars 1.66k forks source link

PeerConnection OnTrack callback never invoked #880

Closed tylerlong closed 5 years ago

tylerlong commented 5 years ago

Your environment.

What did you do?

I have written the following code: https://github.com/ringcentral/ringcentral-softphone-go/blob/master/softphone.go#L137-L220

Just for your reference, here is the Node.js version of the GoLang code: https://github.com/ringcentral/ringcentral-softphone-js/blob/ce833c613d5e4bb959235e37addefd10cc9cb81e/index.js#L140-L177 . The Node.js code works like a charm.

What did you expect?

I expect that the peerConnection.OnTrack callback function will be invoked so that I can get the input audio

What happened?

peerConnection.OnTrack never invoked and I am stuck.

Some questions:

Could it because I specify the wrong codec? If so, I expect some error message like: no codec to handle the remote track. remote SDP sample:

v=0
o=- 5305405169010475891 908790132696250821 IN IP4 104.245.57.182
s=SmcSip
c=IN IP4 104.245.57.182
t=0 0
m=audio 20040 RTP/SAVPF 109 111 18 0 8 9 96 101
a=rtpmap:109 OPUS/16000
a=fmtp:109 useinbandfec=1
a=rtcp-fb:109 ccm tmmbr
a=rtpmap:111 OPUS/48000/2
a=fmtp:111 useinbandfec=1
a=rtcp-fb:111 ccm tmmbr
a=rtpmap:18 g729/8000
a=fmtp:18 annexb=no
a=rtpmap:0 pcmu/8000
a=rtpmap:8 pcma/8000
a=rtpmap:9 g722/8000
a=rtpmap:96 ilbc/8000
a=fmtp:96 mode=20
a=rtpmap:101 telephone-event/8000
a=fmtp:101 0-15
a=sendrecv
a=rtcp:20041
a=rtcp-mux
a=setup:actpass
a=fingerprint:sha-1 E2:F8:90:C6:37:7E:82:17:3C:63:CF:55:68:4E:1D:16:17:2C:D9:4D
a=ice-ufrag:atwbXWRm
a=ice-pwd:32SCWdSoU7BgkXxet5g5mlId5E
a=candidate:muEZ8Bb5Zhy8XFSL 1 UDP 2130706431 104.245.57.182 20040 typ host
a=candidate:muEZ8Bb5Zhy8XFSL 2 UDP 2130706430 104.245.57.182 20041 typ host

How can I port the Node.JS code to GoLang? Because I know Node.js version works like charm. And the code even works in browser (with browser WebRTC). So if pion/webrtc is compatible with browser WebRTC, I should be able to get this GoLang version work.

Could it becuase of this https://github.com/pion/webrtc/issues/879? I temporarily worked around the issue: https://github.com/ringcentral/ringcentral-softphone-go/blob/master/softphone.go#L133

Sean-Der commented 5 years ago

Hey @tylerlong I think it could be a few different things

tylerlong commented 5 years ago

sha-1 instead of sha-256. We don't support older ciphers, I can add support though to unblock you!

Please do, thanks. There are lots of legacy things in the world that we have to support. I cannot control the remote side so I cannot change its cipher but accept it. Thanks again.

Before Pion support legacy ciphers, it should panic with error message: "unsupported cipher in remote SDP: 'sha-1'", otherwise as a developer I have no idea what happened. Just my two cents. Thanks.

Sean-Der commented 5 years ago

@tylerlong My guess is that if you turned on verbose debugging it probably threw an error message on fingerprint mismatch.

Unfortunately we can't panic because we are a library :/ don't want to hurt other code is running. OnConnectionState should be returning Failed though!

Would it be possible to get an account for ringcentral? Happy to debug, would love to get everything working this weekend!

tylerlong commented 5 years ago

This SDP doesn't contain any SSRCes. Pion isn't expecting to receive any incoming audio without it.

As my understanding, SSRC is optional. https://tools.ietf.org/id/draft-westerlund-mmusic-max-ssrc-00.html#rfc.section.1

This document describes use-cases where endpoints, for a given media type, use multiple media sources.

It is not the use case for me. I don't need multiple media sources.

Many applications and systems have been designed to ensure that any given endpoint only needs to, for a given SDP media description, send or receive a single media stream, associated with a single source.

Above is my use case.

Sean-Der commented 5 years ago

Hey @tylerlong

This is fixed! I am able to use the ringcentral softphone and save the PCM to disk! Unfortunately the oggwriter is hardcoded to support Opus only, if you are interested we could expand it to support PCM if that is helpful!

tylerlong commented 5 years ago

@Sean-Der Thank you so much! That is fast!

I don't know much about audio formats. But overall support more is a good idea.

Let me have a try and get back to you.

tylerlong commented 5 years ago

I tried the latest version github.com/pion/webrtc/v2 v2.1.9 and it still doesn't work for me

PeerConnection OnTrack callback never invoked.

But yesterday I tried it at home it worked. The different is yesterday I was trying it on Windows and today I am trying it on macOS.

Sean-Der commented 5 years ago

@tylerlong

I added a test to assert it was working (and everything seems fine!) my guess would be there is another issue. Probably networking if you are in different environments, many office networks only work with TURN enabled etc..

tylerlong commented 5 years ago

I do see "OnICEConnectionStateChange connected"

SDP:

v=0
o=- 8992091129720114931 3299709293645669172 IN IP4 104.245.57.182
s=SmcSip
c=IN IP4 104.245.57.182
t=0 0
m=audio 39908 RTP/SAVPF 109 111 18 0 8 9 96 101
a=rtpmap:109 OPUS/16000
a=fmtp:109 useinbandfec=1
a=rtcp-fb:109 ccm tmmbr
a=rtpmap:111 OPUS/48000/2
a=fmtp:111 useinbandfec=1
a=rtcp-fb:111 ccm tmmbr
a=rtpmap:18 g729/8000
a=fmtp:18 annexb=no
a=rtpmap:0 pcmu/8000
a=rtpmap:8 pcma/8000
a=rtpmap:9 g722/8000
a=rtpmap:96 ilbc/8000
a=fmtp:96 mode=20
a=rtpmap:101 telephone-event/8000
a=fmtp:101 0-15
a=sendrecv
a=rtcp:39909
a=rtcp-mux
a=setup:actpass
a=fingerprint:sha-1 27:EF:98:E2:3D:12:18:B6:D7:A9:BA:33:4A:68:F7:3E:B1:63:7C:AC
a=ice-ufrag:57CmAD1c
a=ice-pwd:Yd1CvwSTnTZLRTMBGKUQk1chbK
a=candidate:2uEo5H7EOY3ditac 1 UDP 2130706431 104.245.57.182 39908 typ host
a=candidate:2uEo5H7EOY3ditac 2 UDP 2130706430 104.245.57.182 39909 typ host
tylerlong commented 5 years ago

I am going home to try it on my windows machine. Update you later.

tylerlong commented 5 years ago

Windows

I tried it on my Windows a couple of times. And the conclusion is : sometimes it works while sometimes it doesn't.

The same code, I run it multiple times and I got different result. weird.

When it worked

OnSignalingStateChange have-remote-offer
OnICEConnectionStateChange checking 
OnSignalingStateChange stable

OnICECandidate host 192.168.0.15:61213
OnICECandidate nil

OnICEConnectionStateChange connected 
OnConnectionStateChange connected
OnTrack

Sample SDP when it worked

v=0
o=- 5963121565783506081 4058657120322569797 IN IP4 104.245.57.249
s=SmcSip
c=IN IP4 104.245.57.249
t=0 0
m=audio 36200 RTP/SAVPF 109 111 18 0 8 9 96 101
a=rtpmap:109 OPUS/16000
a=fmtp:109 useinbandfec=1
a=rtcp-fb:109 ccm tmmbr
a=rtpmap:111 OPUS/48000/2
a=fmtp:111 useinbandfec=1
a=rtcp-fb:111 ccm tmmbr
a=rtpmap:18 g729/8000
a=fmtp:18 annexb=no
a=rtpmap:0 pcmu/8000
a=rtpmap:8 pcma/8000
a=rtpmap:9 g722/8000
a=rtpmap:96 ilbc/8000
a=fmtp:96 mode=20
a=rtpmap:101 telephone-event/8000
a=fmtp:101 0-15
a=sendrecv
a=rtcp:36201
a=rtcp-mux
a=setup:actpass
a=fingerprint:sha-1 40:98:E4:34:C3:11:5B:71:B1:66:B2:C9:D6:BD:9E:52:81:92:1B:6E
a=ice-ufrag:AR9yOAXG
a=ice-pwd:dDvnPaaTk74az3hd1Ai7wSd1cj
a=candidate:rjRIpigWfTeIbojw 1 UDP 2130706431 104.245.57.249 36200 typ host
a=candidate:rjRIpigWfTeIbojw 2 UDP 2130706430 104.245.57.249 36201 typ host

When it didn't work

OnSignalingStateChange have-remote-offer
OnICEConnectionStateChange checking 
OnSignalingStateChange stable

OnICECandidate host 192.168.0.15:56850
OnICECandidate nil

OnICEConnectionStateChange connected

then after around 30 seconds, I got:

OnConnectionStateChange failed
OnICEConnectionStateChange disconnected 
OnICEConnectionStateChange failed 

Sample SDP when it didn't work

v=0
o=- 7452111111381642799 6657573025900769579 IN IP4 104.245.57.249
s=SmcSip
c=IN IP4 104.245.57.249
t=0 0
m=audio 47600 RTP/SAVPF 109 111 18 0 8 9 96 101
a=rtpmap:109 OPUS/16000
a=fmtp:109 useinbandfec=1
a=rtcp-fb:109 ccm tmmbr
a=rtpmap:111 OPUS/48000/2
a=fmtp:111 useinbandfec=1
a=rtcp-fb:111 ccm tmmbr
a=rtpmap:18 g729/8000
a=fmtp:18 annexb=no
a=rtpmap:0 pcmu/8000
a=rtpmap:8 pcma/8000
a=rtpmap:9 g722/8000
a=rtpmap:96 ilbc/8000
a=fmtp:96 mode=20
a=rtpmap:101 telephone-event/8000
a=fmtp:101 0-15
a=sendrecv
a=rtcp:47601
a=rtcp-mux
a=setup:actpass
a=fingerprint:sha-1 37:6C:DC:44:00:BE:BA:07:0A:86:AF:16:1F:43:49:41:BB:22:41:84
a=ice-ufrag:VwlBdEws
a=ice-pwd:APbbNITDFAVfKGHN1DxM8Zhjt7
a=candidate:7QZebzHBbQModeJU 1 UDP 2130706431 104.245.57.249 47600 typ host
a=candidate:7QZebzHBbQModeJU 2 UDP 2130706430 104.245.57.249 47601 typ host

macOS

Most of the time it fails.

v=0
o=- 3003893328699748873 1358899279031815794 IN IP4 104.245.57.182
s=SmcSip
c=IN IP4 104.245.57.182
t=0 0
m=audio 29424 RTP/SAVPF 109 111 18 0 8 9 96 101
a=rtpmap:109 OPUS/16000
a=fmtp:109 useinbandfec=1
a=rtcp-fb:109 ccm tmmbr
a=rtpmap:111 OPUS/48000/2
a=fmtp:111 useinbandfec=1
a=rtcp-fb:111 ccm tmmbr
a=rtpmap:18 g729/8000
a=fmtp:18 annexb=no
a=rtpmap:0 pcmu/8000
a=rtpmap:8 pcma/8000
a=rtpmap:9 g722/8000
a=rtpmap:96 ilbc/8000
a=fmtp:96 mode=20
a=rtpmap:101 telephone-event/8000
a=fmtp:101 0-15
a=sendrecv
a=rtcp:29425
a=rtcp-mux
a=setup:actpass
a=fingerprint:sha-1 5D:93:65:C6:1F:93:F7:37:19:3C:C8:76:44:98:BD:B0:9A:E9:90:0B
a=ice-ufrag:4Cv3oxDG
a=ice-pwd:VDAHykTXrVTPzdUtlTDPn9lcQb
a=candidate:bI2GF67wa6ppc8oE 1 UDP 2130706431 104.245.57.182 29424 typ host
a=candidate:bI2GF67wa6ppc8oE 2 UDP 2130706430 104.245.57.182 29425 typ host
OnSignalingStateChange have-remote-offer
OnICEConnectionStateChange checking 

OnSignalingStateChange stable

OnICECandidate host 10.32.21.94:50517
OnICECandidate nil

OnICEConnectionStateChange connected 
OnConnectionStateChange failed
OnICEConnectionStateChange disconnected 
OnICEConnectionStateChange failed 

In rare cases it succeeds:

v=0
o=- 8191891114657166899 1133201147605880630 IN IP4 104.245.57.249
s=SmcSip
c=IN IP4 104.245.57.249
t=0 0
m=audio 23212 RTP/SAVPF 109 111 18 0 8 9 96 101
a=rtpmap:109 OPUS/16000
a=fmtp:109 useinbandfec=1
a=rtcp-fb:109 ccm tmmbr
a=rtpmap:111 OPUS/48000/2
a=fmtp:111 useinbandfec=1
a=rtcp-fb:111 ccm tmmbr
a=rtpmap:18 g729/8000
a=fmtp:18 annexb=no
a=rtpmap:0 pcmu/8000
a=rtpmap:8 pcma/8000
a=rtpmap:9 g722/8000
a=rtpmap:96 ilbc/8000
a=fmtp:96 mode=20
a=rtpmap:101 telephone-event/8000
a=fmtp:101 0-15
a=sendrecv
a=rtcp:23213
a=rtcp-mux
a=setup:actpass
a=fingerprint:sha-1 97:E3:8A:5B:94:87:0F:C0:42:57:6C:29:5E:EE:EF:18:79:0B:BF:4F
a=ice-ufrag:wchffjDE
a=ice-pwd:Ze2XnqcMIx97EvFNA76NudFjb1
a=candidate:pB33NG4AflUhYaUy 1 UDP 2130706431 104.245.57.249 23212 typ host
a=candidate:pB33NG4AflUhYaUy 2 UDP 2130706430 104.245.57.249 23213 typ host
OnSignalingStateChange have-remote-offer
OnICEConnectionStateChange checking 

OnICECandidate host 10.32.21.94:58878
OnICECandidate nil

OnICEConnectionStateChange connected 
OnConnectionStateChange connected
OnTrack
Sean-Der commented 5 years ago

Let me try on my end tonight!

If you try on MacOS multiple times do you get similar results?

If you can get a pcap comparing good/bad results that would be really helpful! Hopefully I can replace tonight and just fix the issue :)

thanks @tylerlong

tylerlong commented 5 years ago

No hurry. I will post more detail in my comment above.

tylerlong commented 5 years ago

I did some debugging.

// line 1893 - 1902 peerConnection.go
    // Start the dtls transport
    err = pc.dtlsTransport.Start(DTLSParameters{
        Role:         dtlsRole,
        Fingerprints: []DTLSFingerprint{{Algorithm: fingerprintHash, Value: fingerprint}},
    })
    pc.updateConnectionState(pc.ICEConnectionState(), pc.dtlsTransport.State())
    if err != nil {
        pc.log.Warnf("Failed to start manager: %s", err)
        return
    }
dtls: The connection timed out during the handshake

So the root cause is

    err = pc.dtlsTransport.Start(DTLSParameters{
        Role:         dtlsRole,
        Fingerprints: []DTLSFingerprint{{Algorithm: fingerprintHash, Value: fingerprint}},
    })

timeout

tylerlong commented 5 years ago

If I debug into pc.dtlsTransport.Start, then it always succeeds.

So it might has something to do with timing. If we delay the execution of pc.dtlsTransport.Start for 1 second, it always succeed. Not 100% sure though.

Sean-Der commented 5 years ago

I will work on reproducing! I am not sure why DTLS is timing out. Hopefully I can reproduce it on my home network.

If you can use wireshark/tcpdump and get all UDP traffic it would help a lot to figure out what went wrong. If you send it to my pion.ly email I can check it out right away!

Sean-Der commented 5 years ago

@tylerlong do you see any errors server side? It sounds like we are causing an error by sending DTLS traffic too soon maybe?

I will look at what happens when we delay DTLS vs when we don't

tylerlong commented 5 years ago

Unfortunately I cannot see server side log. I don't maintain our server side code and I treat it as a black box when I created the Node.js softphone SDK.

I've sent tcpdump file to you via email.

tylerlong commented 5 years ago

https://github.com/pion/webrtc/blob/master/dtlstransport.go#L256:

dtlsConn, err := dtls.Client(dtlsEndpoint, dtlsConfig)

If I add breakpoint to this line and its next line and "Step Over" to next line, it always work

If I add breakpoint to this line and its next line and "Resume Program" (no matter how long I wait before resume), it often doesn't work.

So I think this might be a concurrency deadlock issue.

"Step Over" is like single threaded, so it never deadlock. "Resume Program" is multiple threaded so it often deadlock and timeout.

Sean-Der commented 5 years ago

@tylerlong

Looking at the pcap we keep DTLS traffic, but the other side never responds. We are running as the DTLS client (and we also are able to choose if we want to run as Client or Server).

Let me get a PR that switches roles and we can see if that helps! Will be done tonight

pascal-pro commented 4 years ago

@Sean-Der / @tylerlong One year later... I try to make the soft-phone work, without success. I used the pawn library several times, so I was happy to find it here. I have the same symptom, no OnTrack, and the connectionState is always checking or failed, but never goes connected.

I also noticed that without the stun, the Sip connection was a failure.

I put the DTLS on the server (as suggested in the PR). I also had to put in a GatheringCompletePromise event, to be able to return an entire SDP. I am trying with the v3 of pion/webrtc

flamurhbreznica commented 3 years ago

I have VideoCall System developed through Ionic Cordova with PeerJS, it works properly on Android. Now that I deployed to iOS from cordova, whenever I make second call, it always set an error: The PluginMediaStream with this id "any id" already exists, and cannot join to calls anymore, also the events from peerjs does not trigger

for example:

peer.on('stream') does not trigger at all, I used this plugin to handle the MediaStreams, but for some reasons, after the first video call it says MediaStream already exists,

How i can I trigger some events if I get a call from peer, onstream, onremove,

I tried to remove streams, tracks, and other stuff after ending video call, but it did not work.

Sean-Der commented 3 years ago

Hi @flamurhbreznica

can you please either

thanks!