feross / simple-peer

📡 Simple WebRTC video, voice, and data channels
MIT License
7.36k stars 972 forks source link

Error with multiple offers per RTCPeerConnection setup #494

Closed corwin-of-amber closed 5 years ago

corwin-of-amber commented 5 years ago

I have encountered several scenarios where setRemoteDescription has been called more than once. This can happen if the sdp message is delayed on its way from one client, to the signalhub, to the second client. This mostly happens in slow networks with a high packet drop rate (e.g. 4G).

This is what's printed when debug is turned on:

simple-peer [783e416] destroy (error: InvalidStateError: Failed to execute 'setRemoteDescription' on 'RTCPeerConnection': Failed to set remote offer sdp: Called in wrong state: kHaveLocalOffer) +2ms
nazar-pc commented 5 years ago

Almost always such cases are result of incorrect handling of signaling events. So it would be really nice to have reproducible test case such that we can confirm fix and make sure it doesn't regress.

corwin-of-amber commented 5 years ago

Yes, that would be ideal. Unfortunately that only happens "sometimes".

Also, looking at the error message I pasted before, it does not seem that my fix is sufficient. The erroneous state is when the pc has a local offer, I am not sure why this is an invalid state for setRemoteDescription.

nazar-pc commented 5 years ago

Hm... make sure you're setting it on the correct peer. At least some code sample would be useful, I saw similar issues many times in the past.

corwin-of-amber commented 5 years ago

Ok, I am using signalhub/signalhubws for the signaling, and https://github.com/mafintosh/webrtc-swarm for setting up the connections, so it is possible that the problem is caused by one of those. They have been pretty reliable so far but messages do tend to get stalled when the connection is bad.

I am attaching an example source and a trace of simple-peer leading to the error.

```javascript var swarm = require('webrtc-swarm') var //signalhub = require('signalhub'), signalhubws = require('signalhubws'); const node_require = require; /* bypass browserify */ const node_ws = (typeof WebSocket === 'undefined') ? node_require('websocket').w3cwebsocket : undefined; var hub = //signalhub('swarm-example', ['https://signalhub-jccqtwhdwc.now.sh']) signalhubws('swarm-example', ['wss://signalhubws.mauve.moe'], node_ws); var swarmConfig = (typeof RTCPeerConnection === 'undefined') ? {wrtc: node_require('wrtc')} : {}; var sw = swarm(hub, swarmConfig); sw.on('peer', function (peer, id) { console.log('connected to a new peer:', id) console.log('total peers:', sw.peers.length) peer.on('data', (message) => { console.log(`message from peer ${id}:`, message.toString('utf-8')); }); }) sw.on('disconnect', function (peer, id) { console.log('disconnected from a peer:', id) console.log('total peers:', sw.peers.length) }) function broadcast(message) { for (let p of sw.peers) p.send(message); } if (typeof window !== 'undefined') { Object.assign(window, {sw, broadcast}); } ``` ``` simple-peer [697bca0] new peer {wrtc: undefined, initiator: true, channelConfig: undefined, config: undefined, stream: undefined, …} +0ms simple-peer [697bca0] _needsNegotiation +9ms simple-peer [697bca0] starting batched negotiation +2ms simple-peer [697bca0] start negotiation +1ms simple-peer [697bca0] signalingStateChange have-local-offer +4ms simple-peer [697bca0] createOffer success +1ms simple-peer [697bca0] signal +0ms simple-peer [697bca0] iceStateChange (connection: new) (gathering: gathering) +1ms simple-peer [697bca0] iceCandidate {"candidate":"candidate:2558760555 1 udp 2113937151 5320b2e9-3c58-4c0e-8c32-0cbc78cb97de.local 62463 typ host generation 0 ufrag k8o+ network-cost 999","sdpMid":"0","sdpMLineIndex":0} +2ms simple-peer [697bca0] started iceComplete timeout +0ms simple-peer [697bca0] iceCandidate {"candidate":"candidate:842163049 1 udp 1677729535 79.179.119.92 62463 typ srflx raddr 0.0.0.0 rport 0 generation 0 ufrag k8o+ network-cost 999","sdpMid":"0","sdpMLineIndex":0} +157ms simple-peer [697bca0] iceStateChange (connection: new) (gathering: complete) +37ms simple-peer [697bca0] iceCandidate null +1ms simple-peer [697bca0] signal() +237ms simple-peer [697bca0] destroy (error: InvalidStateError: Failed to execute 'setRemoteDescription' on 'RTCPeerConnection': Failed to set remote offer sdp: Called in wrong state: kHaveLocalOffer) +4ms ```
corwin-of-amber commented 5 years ago

369 seems relevant to this, but AFAICT only one end is initiating; more careful study of webrtc-swarm is needed to confirm this though.

nazar-pc commented 5 years ago

So those are logs from one side. What the other side says?

corwin-of-amber commented 5 years ago

So those are logs from one side. What the other side says?

It is rather unfortunate but I did not have debug enabled on the other side at that particular point 😩 Will update if this comes up again when I can catch both ends.

corwin-of-amber commented 5 years ago

Ok, I managed to reproduce this more-or-less consistently by running two instances of swarm.js at the very same time like so:

node swarm.js & ; node swarm.js

Looking at the logs, it seems that indeed both RTCPeerConnections are created as initiators. So like you said originally, this is a signaling problem, probably caused by webrtc-swarm in this case. Thanks!

nazar-pc commented 5 years ago

I'm closing this issue then, but feel free to continue discussion if necessary