murat-dogan / node-datachannel

WebRTC For Node.js and Electron. libdatachannel node bindings.
Mozilla Public License 2.0
291 stars 55 forks

addIceCandidate is blocking the event-loop #281

Open Aviv1000 opened 2 weeks ago

Aviv1000 commented 2 weeks ago

In my case, the entire Node program is sometimes blocked, occasionally for more than 10 seconds. I traced the event-loop blocking to addIceCandidate; setLocalDescription and setRemoteDescription may also block. I use the polyfill to be compatible with the browser API, but I don't think it is related.
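One way to confirm which call is holding up the event loop is to time the synchronous part of each call. The following is only a minimal sketch: `timeBlocking`, `label`, and `thresholdMs` are hypothetical names made up for illustration, not part of node-datachannel or the polyfill.

```javascript
const { performance } = require('node:perf_hooks');

// Runs fn synchronously and measures how long it kept the event loop busy.
// If fn returns a Promise, only the synchronous part of the call is measured.
// `label` and `thresholdMs` are illustrative names, not a library API.
function timeBlocking(label, fn, thresholdMs = 50) {
  const start = performance.now();
  const result = fn();
  const elapsedMs = performance.now() - start;
  if (elapsedMs > thresholdMs) {
    console.warn(`${label} blocked the event loop for ${elapsedMs.toFixed(1)} ms`);
  }
  return { result, elapsedMs };
}
```

Assuming `pc` is the peer connection and `candidate` the received candidate, `timeBlocking('addIceCandidate', () => pc.addIceCandidate(candidate))` would log a warning whenever the call itself stalls, and the same wrapper can be put around setLocalDescription and setRemoteDescription.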

I think it may happen when the remote candidate contains an IPv6 address: { candidate: 'candidate:995260933 1 udp 1677732095 ce81:ce81:ce81:ce81:ce81:ce81:ce81:ce81 54885 typ srflx raddr :: rport 0 generation 0 ufrag 5s6n network-cost 999', sdpMLineIndex: 0, sdpMid: '0' }
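To check the IPv6 theory against the candidates that coincide with a stall, the candidate string can be split into its fields. This is a rough sketch that assumes the usual space-separated layout (foundation, component, transport, priority, address, port, `typ`, type, then name/value pairs); `parseCandidate` is a hypothetical helper, not something node-datachannel provides.

```javascript
const net = require('node:net');

// Splits an ICE candidate string into its fields. Assumed layout:
// candidate:<foundation> <component> <transport> <priority> <address> <port> typ <type> [<name> <value>]...
function parseCandidate(line) {
  const parts = line.trim().replace(/^a=/, '').replace(/^candidate:/, '').split(/\s+/);
  if (parts.length < 8 || parts[6] !== 'typ') {
    throw new Error('unexpected candidate format');
  }
  const [foundation, component, transport, priority, address, port, , type] = parts;
  // Remaining tokens are name/value pairs such as raddr, rport, generation, ufrag.
  const extensions = {};
  for (let i = 8; i + 1 < parts.length; i += 2) {
    extensions[parts[i]] = parts[i + 1];
  }
  return {
    foundation,
    component: Number(component),
    transport: transport.toLowerCase(),
    priority: Number(priority),
    address,
    port: Number(port),
    type,
    extensions,
    isIPv6: net.isIPv6(address),
  };
}
```

Logging `parseCandidate(c.candidate).isIPv6` next to the measured call duration would show whether slow calls really line up with IPv6 candidates or also occur with IPv4 ones.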

Is this a bug in node-datachannel, or can it be handled in my own code alone? What is the technical reason for the problem?

paullouisageneau commented 2 weeks ago

I think it could be caused by the libjuice issue reported in https://github.com/paullouisageneau/libjuice/discussions/264: if DNS resolution for ICE servers fails, subsequent ICE calls for the same peer connection could block due to incorrect locking. It was fixed by https://github.com/paullouisageneau/libjuice/pull/267 and will be shipped alongside the next libdatachannel version.

Aviv1000 commented 2 weeks ago

But this happens specifically when adding this remote candidate, which should not require any DNS resolution. Maybe it's because of the invalid port number 0 in raddr :: rport 0? (The Chrome browser generated this candidate and sent it to Node.js like this.)

paullouisageneau commented 2 weeks ago

> But this happens specifically when adding this remote candidate, which should not require any DNS resolution.

The issue could make ICE calls block, in particular setting the remote description and adding candidates, because the ICE agent held a lock while resolving STUN/TURN servers in parallel, which can take a while, especially when DNS resolution fails.
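Until the fixed version ships, one possible workaround is to resolve STUN hostnames asynchronously in JavaScript and hand only IP literals to the peer connection, dropping servers whose lookup fails. This is an untested assumption, not something confirmed in this thread: `preResolveStunUrl` and `usableIceServers` are hypothetical helpers, and the sketch only handles plain `stun:host[:port]` URLs and IPv4 lookups.

```javascript
const dns = require('node:dns');
const net = require('node:net');

// Resolves the host of a simple "stun:host[:port]" URL ahead of time.
// Returns the URL rewritten with an IPv4 literal, the original URL if it is
// not in that simple form or already uses an IP, or null if the lookup fails.
// `lookup` is injectable for testing and defaults to dns.promises.lookup.
async function preResolveStunUrl(url, lookup = dns.promises.lookup) {
  const match = /^stun:([^:\[\]@]+)(?::(\d+))?$/.exec(url);
  if (!match) return url;
  const [, host, port] = match;
  if (net.isIP(host) !== 0) return url;
  try {
    const { address } = await lookup(host, { family: 4 });
    return port ? `stun:${address}:${port}` : `stun:${address}`;
  } catch {
    return null; // DNS failed: skip this server rather than pass it on
  }
}

// Resolves all URLs concurrently and keeps only the usable ones.
async function usableIceServers(urls, lookup) {
  const resolved = await Promise.all(urls.map((u) => preResolveStunUrl(u, lookup)));
  return resolved.filter((u) => u !== null);
}
```

The resulting list would then be passed as the ICE server configuration when creating the peer connection. Whether this actually avoids the stall depends on the library skipping slow resolution for IP literals, which is assumed here rather than verified.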

> Maybe it's because of the invalid port number 0 in raddr :: rport 0? (The Chrome browser generated this candidate and sent it to Node.js like this.)

This is fine: raddr and rport are allowed to be 0.