CatxFish / obs-v4l2sink

obs studio output plugin for Video4Linux2 device
GNU General Public License v2.0
930 stars 99 forks

Obs Studio crashes when stopping the v4l2 sink #10

Open nihathrael opened 5 years ago

nihathrael commented 5 years ago

Hi, first of all, thanks for the great plugin. Currently the V4L2 sink crashes my OBS Studio 23.0.1-1 on Arch Linux (Linux laptop 5.0.0-arch1-1-ARCH #1 SMP PREEMPT Mon Mar 4 14:11:43 UTC 2019 x86_64 GNU/Linux) when I press the stop button (reproducible every time). Because of this I cannot change any video settings, as OBS always thinks an output is still active (I guess?).

This is the backtrace from running it in gdb:

(gdb) bt
#0  0x00007ffff5060f41 in free () from /usr/lib/libc.so.6
#1  0x00005555567d6780 in ?? ()
#2  0x00007ffff604c2b2 in obs_output_actual_stop () from /usr/lib/libobs.so.0
#3  0x00007ffff604cd71 in obs_output_stop () from /usr/lib/libobs.so.0
#4  0x00007ffff579587c in QMetaObject::activate(QObject*, int, int, void**) () from /usr/lib/libQt5Core.so.5
#5  0x00007ffff7b69eb3 in QAbstractButton::clicked(bool) () from /usr/lib/libQt5Widgets.so.5
#6  0x00007ffff7b6a0cc in ?? () from /usr/lib/libQt5Widgets.so.5
#7  0x00007ffff7b6b4c2 in ?? () from /usr/lib/libQt5Widgets.so.5
#8  0x00007ffff7b6b696 in QAbstractButton::mouseReleaseEvent(QMouseEvent*) () from /usr/lib/libQt5Widgets.so.5
#9  0x00007ffff7abdb68 in QWidget::event(QEvent*) () from /usr/lib/libQt5Widgets.so.5
#10 0x00007ffff7a7ce24 in QApplicationPrivate::notify_helper(QObject*, QEvent*) () from /usr/lib/libQt5Widgets.so.5
#11 0x00007ffff7a84929 in QApplication::notify(QObject*, QEvent*) () from /usr/lib/libQt5Widgets.so.5
#12 0x00007ffff576ae99 in QCoreApplication::notifyInternal2(QObject*, QEvent*) () from /usr/lib/libQt5Core.so.5
#13 0x00007ffff7a83c08 in QApplicationPrivate::sendMouseEvent(QWidget*, QMouseEvent*, QWidget*, QWidget*, QWidget**, QPointer<QWidget>&, bool, bool) () from /usr/lib/libQt5Widgets.so.5
#14 0x00007ffff7ad8e93 in ?? () from /usr/lib/libQt5Widgets.so.5
#15 0x00007ffff7adbf87 in ?? () from /usr/lib/libQt5Widgets.so.5
#16 0x00007ffff7a7ce24 in QApplicationPrivate::notify_helper(QObject*, QEvent*) () from /usr/lib/libQt5Widgets.so.5
#17 0x00007ffff7a846e1 in QApplication::notify(QObject*, QEvent*) () from /usr/lib/libQt5Widgets.so.5
#18 0x00007ffff576ae99 in QCoreApplication::notifyInternal2(QObject*, QEvent*) () from /usr/lib/libQt5Core.so.5
#19 0x00007ffff5b3d96e in QGuiApplicationPrivate::processMouseEvent(QWindowSystemInterfacePrivate::MouseEvent*) () from /usr/lib/libQt5Gui.so.5
#20 0x00007ffff5b3edd6 in QGuiApplicationPrivate::processWindowSystemEvent(QWindowSystemInterfacePrivate::WindowSystemEvent*) () from /usr/lib/libQt5Gui.so.5
#21 0x00007ffff5b1875c in QWindowSystemInterface::sendWindowSystemEvents(QFlags<QEventLoop::ProcessEventsFlag>) () from /usr/lib/libQt5Gui.so.5
#22 0x00007fffeb98490c in ?? () from /usr/lib/libQt5XcbQpa.so.5
#23 0x00007fffef679a2f in g_main_context_dispatch () from /usr/lib/libglib-2.0.so.0
#24 0x00007fffef67b5e9 in ?? () from /usr/lib/libglib-2.0.so.0
#25 0x00007fffef67b62e in g_main_context_iteration () from /usr/lib/libglib-2.0.so.0
#26 0x00007ffff57c0ce9 in QEventDispatcherGlib::processEvents(QFlags<QEventLoop::ProcessEventsFlag>) () from /usr/lib/libQt5Core.so.5
#27 0x00007ffff5769b2c in QEventLoop::exec(QFlags<QEventLoop::ProcessEventsFlag>) () from /usr/lib/libQt5Core.so.5
#28 0x00007ffff5771e36 in QCoreApplication::exec() () from /usr/lib/libQt5Core.so.5
#29 0x00005555555de0cc in main ()

Let me know if I can provide any additional data.

CatxFish commented 5 years ago

I have a hardware problem on my Linux machine, so I'll check this later. The previous output doesn't seem to stop properly; there might be an issue there.

Update: I can't reproduce this on Ubuntu. I'll do some more digging.

darkmattercoder commented 4 years ago

Same for me here on Arch Linux.

However, I have no debugging symbols at the moment.

info: Output 'V4l2sink': stopping
info: Output 'V4l2sink': Total frames output: 56
info: Output 'V4l2sink': Total drawn frames: 57
[New Thread 0x7fff870a1700 (LWP 43932)]
[Thread 0x7fff870a1700 (LWP 43932) exited]

Thread 1 "obs" received signal SIGSEGV, Segmentation fault.
0x00007ffff4db9460 in free () from /usr/lib/libc.so.6

kautsig commented 4 years ago

I was also affected by this. It occurred when using the AUR package, which at the time of writing is marked as outdated; see [1].

To rule out an issue with the AUR package, I built obs-v4l2sink manually from source and installed it in my home directory (OBS and obs-v4l2sink on master). This made the issue go away.

Because I assumed that building against an older version of OBS might be the issue, I built the AUR package [2] based on obs-v4l2sink 0.1.0 against OBS 25.0.3 - same issue again.

To me, this means that whatever made it work on the master branches should be within this diff:

Looking at the diff, I don't spot anything obvious. Now I'm pretty much out of ideas.

[1] https://aur.archlinux.org/packages/obs-v4l2sink/
[2] https://aur.archlinux.org/cgit/aur.git/tree/PKGBUILD?h=obs-v4l2sink

fenugrec commented 4 years ago

Hi, I might have some information to add. I've compiled debugging versions of obs-studio @ gdbb453d00 (current master) and obs-v4l2sink @ 1ec3c8ada0e1040d867ce567f177be55cd278378 (current master).

It seems the following happens at libobs/obs-output.c:390:

    output->info.stop(output->context.data, ts);

That call resolves to v4l2sink_stop(), which calls v4l2device_close(), which segfaults on the close(out_data->v4l2_fd); line.

When I debug, that fd is non-null; I can't tell whether it was already closed somewhere just before.

Or is it related to v4l2sink_videotick() using the fd after it has been closed? [EDIT] I tried replacing v4l2sink_videotick() with an empty function: it still segfaults in v4l2device_close().

I also tried adding a close(out_data->v4l2_fd) at the end of v4l2device_open() to see if there was a problem with the V4L2 device itself, but that didn't segfault.
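For anyone retracing this step: one way to answer the "was it already closed?" question directly is to ask the kernel whether the descriptor is still open. The sketch below is only an illustration, not code from obs-v4l2sink; fd_is_open() and double_close_demo() are hypothetical names, and /dev/null stands in for the V4L2 device. It also shows that closing an already-closed descriptor normally fails with EBADF rather than crashing, which suggests a stale fd alone would not explain a segfault.

```cpp
#include <cassert>
#include <cerrno>
#include <fcntl.h>
#include <unistd.h>

// Hypothetical helper (not part of obs-v4l2sink): true if fd is currently
// an open descriptor in this process. fcntl(F_GETFD) fails with EBADF
// when it is not.
bool fd_is_open(int fd)
{
    return fcntl(fd, F_GETFD) != -1 || errno != EBADF;
}

// Usage sketch: a second close() on the same descriptor fails cleanly.
void double_close_demo()
{
    int fd = open("/dev/null", O_WRONLY);   // stand-in for the V4L2 device
    assert(fd >= 0 && fd_is_open(fd));

    int rc = close(fd);                     // first close succeeds
    assert(rc == 0);
    assert(!fd_is_open(fd));                // descriptor is gone now

    errno = 0;
    rc = close(fd);                         // second close is rejected...
    assert(rc == -1);
    assert(errno == EBADF);                 // ...with EBADF, not a crash
}
```

Calling fd_is_open(out_data->v4l2_fd) just before the close() in question (under a debugger or with a temporary log line) would tell you whether the descriptor was still valid at that point.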

fenugrec commented 4 years ago

OK, my previous thoughts were wrong. It turns out the problem is the missing return in v4l2device_close(): the code behind that warning (fixed in PR #21) is actually undefined behavior. Compare the disassembly below. In both cases v4l2device_close() is inlined into v4l2sink_stop(), but without the return statement the compiler emits nothing after the call to close(), not even a ret, so execution runs off the end of the function.

************ WITHOUT a return statement *********

.text:0000000000005450             ; void __fastcall v4l2sink_stop(void *data, uint64_t ts)
.text:0000000000005450             _ZL13v4l2sink_stopPvm proc near       
.text:0000000000005450             ts = rsi                                ; uint64_t
.text:0000000000005450             out_data = rdi                          ; v4l2sink_data *

......... obs_output_end_data_capture(out_data->output);
.text:0000000000005457 53                          push    rbx
.text:0000000000005458 48 89 FB                    mov     rbx, out_data
.text:000000000000545B C6 47 08 00                 mov     byte ptr [out_data+8], 0
.text:000000000000545F 48 8B 3F                    mov     out_data, [out_data]
.text:0000000000005462             out_data = rbx                          ; v4l2sink_data *
.text:0000000000005462 FF 15 D8 58+                call    cs:obs_output_end_data_capture_ptr

.......... printf("stop pdata=%p, fd=%p\n", data, out_data->v4l2_fd);   // temp debugging
.text:0000000000005462 00 00
.text:0000000000005468 48 8D 3D C7+                lea     rdi, format     ; "stop pdata=%p, fd=%p\n"
.text:0000000000005468 2C 00 00
.text:000000000000546F 48 89 DE                    mov     rsi, out_data
.text:0000000000005472 31 C0                       xor     eax, eax
.text:0000000000005474 8B 53 0C                    mov     edx, [out_data+0Ch]
.text:0000000000005477 FF 15 1B 59+                call    cs:printf_ptr

...........v4l2device_close(data);

.text:0000000000005477 00 00
.text:000000000000547D 8B 7B 0C                    mov     edi, [out_data+0Ch] ; fd
.text:0000000000005480 FF 15 6A 59+                call    cs:close_ptr
.text:0000000000005480 00 00       ; } // starts at 5450

.......... undefined behavior. No idea what this is even supposed to be, but it doesn't work.
.text:0000000000005480             _ZL13v4l2sink_stopPvm endp
.text:0000000000005480
.text:0000000000005486                             db      2Eh
.text:0000000000005486 66 2E 0F 1F+                nop     word ptr [rax+rax+00000000h]
.text:0000000000005486 84 00 00 00+
.text:0000000000005486 00 00

************ WITH a return statement ************

.text:0000000000005EF0             ; void __fastcall v4l2sink_stop(void *data, uint64_t ts)
.text:0000000000005EF0             _ZL13v4l2sink_stopPvm proc near 
.text:0000000000005EF0             ts = rsi                                ; uint64_t
.text:0000000000005EF0             out_data = rdi                          ; v4l2sink_data *

......... obs_output_end_data_capture(out_data->output);
.text:0000000000005F00 53                          push    rbx
.text:0000000000005F01 48 89 FB                    mov     rbx, out_data
.text:0000000000005F04 C6 47 08 00                 mov     byte ptr [out_data+8], 0
.text:0000000000005F08 48 8B 3F                    mov     out_data, [out_data]
.text:0000000000005F0B             out_data = rbx                          ; v4l2sink_data *
.text:0000000000005F0B FF 15 2F 4E+                call    cs:obs_output_end_data_capture_ptr

.......... printf("stop pdata=%p, fd=%p\n", data, out_data->v4l2_fd);   // temp debugging
.text:0000000000005F0B 00 00
.text:0000000000005F11 48 89 DE                    mov     rsi, out_data
.text:0000000000005F14 48 8D 3D 01+                lea     rdi, format     ; "stop pdata=%p, fd=%p\n"
.text:0000000000005F14 23 00 00
.text:0000000000005F1B 31 C0                       xor     eax, eax
.text:0000000000005F1D 8B 53 0C                    mov     edx, [out_data+0Ch]
.text:0000000000005F20 FF 15 72 4E+                call    cs:printf_ptr

...........v4l2device_close(data);
.text:0000000000005F20 00 00
.text:0000000000005F26 8B 7B 0C                    mov     edi, [out_data+0Ch] ; fd
.text:0000000000005F29 FF 15 C1 4E+                call    cs:close_ptr
.text:0000000000005F29 00 00

...........v4l2sink_signal_stop("stop", false);
.text:0000000000005F2F 31 F6                       xor     esi, esi        ; opening
.text:0000000000005F31 48 8D 3D A0+                lea     rdi, msg        ; "stop"
.text:0000000000005F31 23 00 00
.text:0000000000005F38 5B                          pop     out_data
.text:0000000000005F39 FF 25 39 4E+                jmp     cs:_Z20v4l2sink_signal_stopPKcb_ptr ; v4l2sink_signal_stop(char const*,bool)
.text:0000000000005F39 00 00       ; } // starts at 5EF0
.text:0000000000005F39             _ZL13v4l2sink_stopPvm endp

TL;DR: just merge PR #21!
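To make the bug class concrete, here is a minimal sketch of the pattern, under the assumption that v4l2device_close() is declared with a non-void return type (which is what a missing-return warning implies). The names sink_data, device_close() and stop() are hypothetical stand-ins, not the plugin's actual code. In C++, reaching the closing brace of a value-returning function without a return is undefined behavior, so the compiler is free to treat that point as unreachable, which matches the truncated function in the first listing above.

```cpp
#include <cassert>
#include <fcntl.h>
#include <unistd.h>

// Hypothetical stand-in for the plugin's per-output state.
struct sink_data {
    int v4l2_fd = -1;
};

// Buggy shape (shown as a comment only): the function promises a bool but
// control reaches the closing brace without returning one. That is
// undefined behavior in C++, and an optimizer may drop everything the
// caller does after the inlined call.
//
//   bool device_close_buggy(sink_data *d)
//   {
//       close(d->v4l2_fd);
//   }

// Fixed shape: every path returns a value.
bool device_close(sink_data *d)
{
    if (d->v4l2_fd < 0)
        return false;                 // nothing to close
    const bool ok = close(d->v4l2_fd) == 0;
    d->v4l2_fd = -1;                  // avoid a second close on the same fd
    return ok;
}

// Usage sketch: code after device_close() must still run in the caller.
bool stop(sink_data *d)
{
    const bool closed = device_close(d);
    return closed && d->v4l2_fd == -1;
}
```

GCC and Clang report this pattern under -Wreturn-type; building with -Werror=return-type turns it into a hard error so it cannot slip into a release build.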

kautsig commented 4 years ago

@fenugrec Awesome!

But this means my debugging above was wrong. It could be that I fixed the warning along the way, not knowing this could cause the issue. I should definitely stay away from C code. 😅