Streampunk / beamcoder

Node.js native bindings to FFmpeg.
GNU General Public License v3.0

Exception SIGSEGV while reconnect RTSP #53

Open heleon19 opened 3 years ago

heleon19 commented 3 years ago

Hello

I would like to use this lib to get JPEG images of an RTSP MPEG stream. I can already establish a connection and receive the individual images. When the connection to the camera is lost, or the camera is restarting, I want to close the demuxer so I can establish a new connection. How can I do that?

If I simply create a new demuxer, I get a SIGSEGV error.

Find attached my test script and the logs. For my test I used beamcoder v0.6.3.

Thank you for your support.

test_and_log.zip

scriptorian commented 3 years ago

Hi, thanks for raising the issue. I have just added a forceClose method to the demuxer that should allow you to do what you are trying to do. I believe the garbage collection would have closed the stream eventually but this will allow you to get ahead of that!

heleon19 commented 3 years ago

Hi, thank you for your support. I've tested with the new version and called forceClose, but I still get a SIGSEGV exception.

Once I got the following error message: The instruction in 0x0000... refers to memory at 0x0... The read operation could not be performed in memory.

Find my adapted test script and logs attached. test_and_log_v0.6.5.zip

scriptorian commented 3 years ago

I have tried your code on a public rtsp stream (I used rtsp://wowzaec2demo.streamlock.net/vod/mp4:BigBuckBunny_115k.mov). I'm not getting a timeout but by taking out the keepalive() call from the packet loop I can simulate what is happening. I don't get a crash on restart but I have confirmed that the new forceClose is not doing anything useful in this case, unfortunately.

As a test I have now added a bit of code that tries to make sure the old demuxer has been garbage collected before reconnecting to see if that makes any difference. You will need to run with node --expose-gc index.js and add the following instead of await connection.forceClose() but after connection = null:

      return new Promise(resolve => {
        setTimeout(() => {
          global.gc();
          resolve();
        }, 200);
      });
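
One caveat with this workaround: `global.gc` is only defined when Node.js is started with `--expose-gc`, so calling it without the flag throws. A minimal sketch of the same delay-then-collect step that checks for the function first; the helper name `collectAfterDelay` is hypothetical and not part of beamcoder:

```javascript
// Hypothetical helper (not part of beamcoder): wait, then request a GC pass.
// global.gc only exists when Node.js runs with --expose-gc.
function collectAfterDelay(delayMs = 200) {
  return new Promise(resolve => {
    setTimeout(() => {
      const available = typeof global.gc === 'function';
      if (available) global.gc();
      // Resolves to false when the flag was missing and no collection ran.
      resolve(available);
    }, delayMs);
  });
}
```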
heleon19 commented 3 years ago

Hi Scriptorian, calling the garbage collector before reconnecting does work. Is there a way to handle this in the beamcoder lib?

How can I disable this ffmpeg log?

[rtsp @ 0000026dc3e1c800] max delay reached. need to consume packet
[rtsp @ 0000026dc3e1c800] RTP: missed 2 packets
[rtsp @ 0000026dc3e1c800] Missing packets; dropping frame.
[rtsp @ 0000026dc3e1c800] Missing packets; dropping frame.
[rtsp @ 0000026dc3e1c800] max delay reached. need to consume packet
[rtsp @ 0000026dc3e1c800] RTP: missed 2 packets
[rtsp @ 0000026dc3e1c800] Missing packets; dropping frame.

scriptorian commented 3 years ago

Hi,

I've had another go to make sure the forceClose method does what you need. You should be able to remove the code to force garbage collection now.

scriptorian commented 3 years ago

I've also now added a call beamcoder.logging() which allows you to read or set the FFmpeg logging level. The README file lists the valid log level strings.
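
A minimal sketch of how that call might be used to quieten the RTSP warnings quoted earlier in the thread. It assumes, based only on the description in this comment, that `beamcoder.logging()` with no argument returns the current level and that passing a level string sets it; `'error'` is assumed to be among the valid strings listed in the README. The `setLogLevel` helper is hypothetical:

```javascript
// Hypothetical helper, not part of beamcoder. Assumes logging() with no
// argument reads the level and logging(level) sets it, as described above.
function setLogLevel(beamcoder, level = 'error') {
  const previous = beamcoder.logging(); // read the current level
  beamcoder.logging(level);             // set the new level
  return previous;                      // lets the caller restore it later
}

// Usage (requires beamcoder to be installed):
// const beamcoder = require('beamcoder');
// const previous = setLogLevel(beamcoder, 'error');
```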

heleon19 commented 3 years ago

Hi, thank you again!

The logging function works as expected. When using forceClose I still get an exception sometimes.

Find logs and updated test script attached. test_and_log_v0.6.7.zip

scriptorian commented 3 years ago

I've had another try at shutting down cleanly, hopefully this will be enough!

heleon19 commented 3 years ago

no data timeout triggered
disconnect from rtsp://admin:admin@10.1.45.107/1
connect to rtsp://admin:admin@10.1.45.107/1
PID 7916 received SIGSEGV for address: 0x41092b6a
SymInit: Symbol-SearchPath: '.;C:\DATEN\nodejs\beamcode;C:\Program Files\nodejs;C:\WINDOWS;C:\WINDOWS\system32;SRV*C:\websymbols*http://msdl.microsoft.com/download/symbols;', symOptions: 530, UserName: 'rje'
OS-Version: 10.0.18362 () 0x100-0x1
c:\daten\nodejs\beamcode\node_modules\segfault-handler\src\stackwalker.cpp (924): StackWalker::ShowCallstack
c:\daten\nodejs\beamcode\node_modules\segfault-handler\src\segfault-handler.cpp (242): segfault_handler
00007FF9B4C585B6 (ntdll): (filename not available): RtlIsGenericTableEmpty
00007FF9B4C4A056 (ntdll): (filename not available): RtlRaiseException
00007FF9B4C7FE3E (ntdll): (filename not available): KiUserExceptionDispatcher
00007FF941092B6A (avformat-58): (filename not available): avformat_get_riff_audio_tags
00007FF9410A7204 (avformat-58): (filename not available): avformat_get_riff_audio_tags
00007FF9410A9DDB (avformat-58): (filename not available): avformat_get_riff_audio_tags
00007FF9410D4A7B (avformat-58): (filename not available): av_find_default_stream_index
00007FF9410D5D1B (avformat-58): (filename not available): av_find_default_stream_index
00007FF9410D6D58 (avformat-58): (filename not available): av_read_frame
c:\daten\nodejs\beamcode\node_modules\beamcoder\src\demux.cc (238): readFrameExecute
00007FF7A099D26E (node): (filename not available): uv_queue_work
00007FF7A098A7CD (node): (filename not available): uv_poll_stop
00007FF7A1658100 (node): (filename not available): v8::internal::SetupIsolateDelegate::SetupHeap
00007FF9B3117BD4 (KERNEL32): (filename not available): BaseThreadInitThunk
00007FF9B4C4CE51 (ntdll): (filename not available): RtlUserThreadStart

heleon19 commented 3 years ago

PID 62040 received SIGSEGV for address: 0x410dd712
SymInit: Symbol-SearchPath: '.;C:\DATEN\nodejs\beamcode;C:\Program Files\nodejs;C:\WINDOWS;C:\WINDOWS\system32;SRV*C:\websymbols*http://msdl.microsoft.com/download/symbols;', symOptions: 530, UserName: 'rje'
OS-Version: 10.0.18362 () 0x100-0x1
c:\daten\nodejs\beamcode\node_modules\segfault-handler\src\stackwalker.cpp (924): StackWalker::ShowCallstack
c:\daten\nodejs\beamcode\node_modules\segfault-handler\src\segfault-handler.cpp (242): segfault_handler
00007FF9B4C585B6 (ntdll): (filename not available): RtlIsGenericTableEmpty
00007FF9B4C4A056 (ntdll): (filename not available): RtlRaiseException
00007FF9B4C7FE3E (ntdll): (filename not available): KiUserExceptionDispatcher
00007FF9410DD712 (avformat-58): (filename not available): avformat_close_input
c:\daten\nodejs\beamcode\node_modules\beamcoder\src\format.cc (3835): formatContextFinalizer
00007FF7A092A158 (node): (filename not available): node::Stop
00007FF7A0FFABD1 (node): (filename not available): v8::internal::GlobalHandles::InvokeSecondPassPhantomCallbacks
00007FF7A0FFAD7E (node): (filename not available): v8::internal::GlobalHandles::InvokeSecondPassPhantomCallbacksFromTask
00007FF7A089CBCC (node): (filename not available): v8::internal::wasm::JSToWasmWrapperCompilationUnit::~JSToWasmWrapperCompilationUnit
00007FF7A089BB51 (node): (filename not available): v8::internal::wasm::JSToWasmWrapperCompilationUnit::~JSToWasmWrapperCompilationUnit
00007FF7A099982B (node): (filename not available): uv_async_send
00007FF7A0998FCC (node): (filename not available): uv_loop_init
00007FF7A0999194 (node): (filename not available): uv_run
00007FF7A08B9B73 (node): (filename not available): EVP_CIPHER_CTX_buf_noconst
00007FF7A0917860 (node): (filename not available): node::Start
00007FF7A07C6ABC (node): (filename not available): RC4_options
00007FF7A161F068 (node): (filename not available): v8::internal::SetupIsolateDelegate::SetupHeap
00007FF9B3117BD4 (KERNEL32): (filename not available): BaseThreadInitThunk
00007FF9B4C4CE51 (ntdll): (filename not available): RtlUserThreadStart

scriptorian commented 3 years ago

It would really help if I could reproduce this here! Is it crashing every time now or intermittently as previously?

heleon19 commented 3 years ago

I understand. Would it help if you had access to my camera? I tried a couple of times, and it always happened while the camera was rebooting.

scriptorian commented 3 years ago

Perhaps simpler for now if I can send you one or two patch cpp files with some extra debug - can you use them and rebuild beamcoder locally?

heleon19 commented 3 years ago

I couldn't get it to build on Windows. I do have a Dockerfile to build it on a Raspberry Pi, but there I don't get a detailed error log. With this Dockerfile I could clone from a test branch.

Which version of FFmpeg do I have to use? Currently I use 4.3.1.

Example with the current master branch:

{ time: 1603194169741, data: <Buffer ff d8 ff e0 00 10 4a 46 49 46 00 02 01 00 00 01 00 01 00 00 ff db 00 84 00 06 04 05 06 05 04 06 06 05 06 07 07 06 08 0a 10 0a 0a 09 09 0a 14 0e 0f 0c ... 85877 more bytes>, counter: 66 }
{ time: 1603194170358, data: <Buffer ff d8 ff e0 00 10 4a 46 49 46 00 02 01 00 00 01 00 01 00 00 ff db 00 84 00 06 04 05 06 05 04 06 06 05 06 07 07 06 08 0a 10 0a 0a 09 09 0a 14 0e 0f 0c ... 85965 more bytes>, counter: 67 }
no data timeout triggered
disconnect from rtsp://admin:admin@10.1.45.107/1
connect to rtsp://admin:admin@10.1.45.107/1
Segmentation fault (core dumped)

scriptorian commented 3 years ago

The javascript binding was keeping a stale pointer around after the force close and this was causing trouble during the garbage collection. I have added an indirection to handle this and hopefully the latest version will be better.

heleon19 commented 3 years ago

I'm very sorry, I still get an exception, but not every time.

PID 13752 received SIGSEGV for address: 0x402f1ce5
SymInit: Symbol-SearchPath: '.;C:\DATEN\nodejs\beamcode;C:\Program Files\nodejs;C:\WINDOWS;C:\WINDOWS\system32;SRV*C:\websymbols*http://msdl.microsoft.com/download/symbols;', symOptions: 530, UserName: 'rje'
OS-Version: 10.0.18362 () 0x100-0x1
c:\daten\nodejs\beamcode\node_modules\segfault-handler\src\stackwalker.cpp (924): StackWalker::ShowCallstack
c:\daten\nodejs\beamcode\node_modules\segfault-handler\src\segfault-handler.cpp (242): segfault_handler
00007FF9B4C585B6 (ntdll): (filename not available): RtlIsGenericTableEmpty
00007FF9B4C4A056 (ntdll): (filename not available): RtlRaiseException
00007FF9B4C7FE3E (ntdll): (filename not available): KiUserExceptionDispatcher
00007FF9402F1CE5 (avformat-58): (filename not available): avformat_get_riff_audio_tags
00007FF9402F20E9 (avformat-58): (filename not available): avformat_get_riff_audio_tags
00007FF9402F2D4A (avformat-58): (filename not available): avformat_get_riff_audio_tags
00007FF940307204 (avformat-58): (filename not available): avformat_get_riff_audio_tags
00007FF940309DDB (avformat-58): (filename not available): avformat_get_riff_audio_tags
00007FF940334A7B (avformat-58): (filename not available): av_find_default_stream_index
00007FF940335D1B (avformat-58): (filename not available): av_find_default_stream_index
00007FF940336D58 (avformat-58): (filename not available): av_read_frame
c:\daten\nodejs\beamcode\node_modules\beamcoder\src\demux.cc (245): readFrameExecute
00007FF7A099D26E (node): (filename not available): uv_queue_work
00007FF7A098A7CD (node): (filename not available): uv_poll_stop
00007FF7A1658100 (node): (filename not available): v8::internal::SetupIsolateDelegate::SetupHeap
00007FF9B3117BD4 (KERNEL32): (filename not available): BaseThreadInitThunk
00007FF9B4C4CE51 (ntdll): (filename not available): RtlUserThreadStart

scriptorian commented 3 years ago

I think this one is a race condition. If it's now always crashing at demux.cc readFrameExecute, as here, then I'm fairly convinced. You are calling the async function demuxer.read, and sometimes this might run (in a separate thread) after the context is deleted. I was hesitant to add something like a mutex around this, but I'm not sure what you can do from JS without me doing that. Let me know what you think.

heleon19 commented 3 years ago

I don't know how I could implement such a mutex around demuxer.read. I think my code performs the reads one at a time, and there should be only one connection open. Do you think something in my test script is wrong?

Find attached my script: index.txt

scriptorian commented 3 years ago

I wasn't clear enough - if a mutex is needed I have to add it in beamcoder land. The problem is that you are calling read (or in fact the call is being executed) after you have forceClose'd the demuxer. Your loop tests for connection being valid but that doesn't protect against this problem. It seems possible for you to add some code so that just before you call the stop or forceClose function you ensure that the read isn't going to be called.
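
One way to do that from JavaScript is to remember the promise of the read that is currently in flight and wait for it to settle before closing. A minimal sketch, assuming the demuxer exposes a promise-returning `read()` and the `forceClose()` discussed in this thread; the `GuardedDemuxer` wrapper and its method names are hypothetical, and this is not necessarily how beamcoder or the original test script handles it:

```javascript
// Hypothetical wrapper (not part of beamcoder): makes sure forceClose() never
// runs while a read() may still be executing on a worker thread.
class GuardedDemuxer {
  constructor(demuxer) {
    this.demuxer = demuxer;
    this.pending = null;   // promise of the read currently in flight, if any
    this.closing = false;  // once true, no new reads are started
  }

  // Start a read unless shutdown has begun; resolves to null when closing.
  async read() {
    if (this.closing) return null;
    const p = this.demuxer.read();
    this.pending = p;
    try {
      return await p;
    } finally {
      if (this.pending === p) this.pending = null;
    }
  }

  // Block new reads, wait for the in-flight one to settle, then close.
  async close() {
    this.closing = true;
    if (this.pending) {
      await this.pending.catch(() => {}); // a failed read has still finished
    }
    await this.demuxer.forceClose();
  }
}
```

The packet loop would then call `guarded.read()` and stop when it gets `null`, while the timeout handler calls `await guarded.close()` before reconnecting. If a read can hang indefinitely when the camera disappears, the wait inside `close()` would also need its own timeout, which this sketch leaves out.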