feross / simple-peer

📡 Simple WebRTC video, voice, and data channels
MIT License
7.43k stars 974 forks source link

Segmentation fault from node-webrtc when accessing RTCPeerConnection.localDescription #812

Open dguenther opened 3 years ago

dguenther commented 3 years ago

What version of this package are you using?

What operating system, Node.js, and npm version?

I haven't tested this in other environments.

What happened?

When creating, connecting, and destroying several SimplePeer instances in Node.js, wrtc crashes with a segmentation fault:

PID 4228 received SIGSEGV for address: 0x0
0   segfault-handler.node               0x00000001046bdfb0 _ZL16segfault_handleriP9__siginfoPv + 304
1   libsystem_platform.dylib            0x00007fff203b3d7d _sigtramp + 29
2   ???                                 0x0000000200583232 0x0 + 8595714610
3   wrtc.node                           0x00000001151e4e86 _ZNK6webrtc22JsepSessionDescription8ToStringEPNSt3__112basic_stringIcNS1_11char_traitsIcEENS1_9allocatorIcEEEE + 38
4   wrtc.node                           0x0000000115028fe2 _ZN11node_webrtc9ConverterIPKN6webrtc27SessionDescriptionInterfaceENS_25RTCSessionDescriptionInitEE7ConvertES4_ + 82
5   wrtc.node                           0x0000000115029b60 _ZN11node_webrtc9ConverterINSt3__14pairIN4Napi3EnvEPKN6webrtc27SessionDescriptionInterfaceEEENS3_5ValueEE7ConvertES9_ + 48
6   wrtc.node                           0x000000011510416f _ZN11node_webrtc17RTCPeerConnection19GetLocalDescriptionERKN4Napi12CallbackInfoE + 111
7   wrtc.node                           0x0000000115119bdb _ZZN4Napi10ObjectWrapIN11node_webrtc17RTCPeerConnectionEE29InstanceGetterCallbackWrapperEP10napi_env__P20napi_callback_info__ENKUlvE_clEv + 139
8   wrtc.node                           0x0000000115119a9a _ZN4Napi10ObjectWrapIN11node_webrtc17RTCPeerConnectionEE29InstanceGetterCallbackWrapperEP10napi_env__P20napi_callback_info__ + 42
9   node                                0x000000010006b94a _ZN6v8impl12_GLOBAL__N_123FunctionCallbackWrapper6InvokeERKN2v820FunctionCallbackInfoINS2_5ValueEEE + 122
10  node                                0x0000000100a0bacd Builtins_CallApiCallback + 173
[1]    4227 segmentation fault  npm start

Reproduction case

I created a repository with a demo, and also pasted example code below. Unfortunately it's not deterministic, but when running it in 4 windows, it crashes before 1000 iterations in at least one of them.

https://github.com/dguenther/simple-peer-issue-demo

Click to expand example code block ```ts require('segfault-handler').registerHandler('segfault.log') const SimplePeer = require('simple-peer') const wrtc = require('wrtc') const LOOP_TIME_MS = 70 function getRandomInt(min, max) { min = Math.ceil(min); max = Math.floor(max); return Math.floor(Math.random() * (max - min + 1)) + min; } let iteration = 0 const initiators = [] const recipients = [] async function eventLoop() { console.log(`Iteration ${++iteration}`) while (initiators.length > 20) { const conn = initiators.splice(getRandomInt(0, initiators.length - 1), 1)[0] conn.destroy() } while (recipients.length > 20) { const conn = recipients.splice(getRandomInt(0, initiators.length - 1), 1)[0] conn.destroy() } for (let i = 0; i < 4; i++) { const recip = new SimplePeer({ initiator: false, wrtc }) const init = new SimplePeer({ initiator: true, wrtc }) recip.on('signal', (signal) => { if (!init.destroyed) init.signal(signal) }) init.on('signal', (signal) => { if (!recip.destroyed) recip.signal(signal) }) initiators.push(init) recipients.push(recip) } setTimeout(eventLoop, LOOP_TIME_MS) } eventLoop() ```

What did you expect to happen?

No crash 😄 Since ultimately it should be node-webrtc's responsibility to manage itself without crashing, I've created an issue here: https://github.com/node-webrtc/node-webrtc/issues/696

However, I noticed that removing this._pc.localDescription from this line fixes the crash:

https://github.com/feross/simple-peer/blob/d972548299a50f836ca91c36e39304ef0f9474b7/index.js#L618

Several WebRTC examples seem to pass localDescription to the other peer rather than passing the offer itself, so I wasn't sure if there was a reason for that, or if this is a viable workaround.

Are you willing to submit a pull request to fix this bug?

👍 Yep, if one is necessary.

draeder commented 1 year ago

@dguenther This appears to be an issue with the wrtc library itself. I tried your workaround by removing this._pc.localDescription from line 618, but still got the segmentation fault. Is node-wrtc even being maintained anymore? It has been 3 years since any update to the code. Is there an alternative library?

If it helps, I'm using Node v20.3.1 on a 2020 Macbook Pro MacOS 13.4.1 (22F82)

dguenther commented 1 year ago

I've been running node-datachannel successfully for a time, but I've only tested the data channels, not the media support.

Another option I've seen is werift-webrtc. When I tested it a while ago, the performance was not great relative to node-datachannel and node-wrtc, but judging by the GitHub issues, it looks like there have been some improvements since.

draeder commented 1 year ago

@dguenther Perfect! werift appears to be a drop-in replacement for node-wrtc that works great with simple-peer. Many thanks!

draeder commented 1 year ago

@dguenther Well, crud. werift-webrtc has the same segmentation fault issue as node-webrtc.

dguenther commented 1 year ago

Is there an issue for it? Offhand I wouldn't expect that, since as far as I know werift doesn't use any native modules.

draeder commented 1 year ago

So, I think the error was in my own code for both node-wrtc and for werift-webrtc.

I am using simple-peer to create a partial mesh, and I think my partial mesh peer rebalancing logic created the problem. I have had my partial mesh network up and running fine since last night with no errors or crashes in both node and the browser. I am back to using node-wrtc.

My research into this bug suggests that it occurs when peers quickly disconnect/reconnect repeatedly. I think that's what was happening in my case.

draeder commented 1 year ago

@dguenther Here's what I found out. I am building on webtorrent-hybrid. It uses node-webrtc v0.4.6. Not knowing that, I thought I still needed to pass in an instance of wrtc. Furthermore, I thought I needed to develop a partial-mesh networking protocol. I didn't realize that webtorrent-hybrid already handles that for me. The issue is a memory addressing issue, and the conflicts I describe appear to have created it..