ptitSeb / box86

Box86 - Linux Userspace x86 Emulator with a twist, targeted at ARM Linux devices
https://box86.org
MIT License

zoom crash on camera access libturbojpeg.so:"???", for accessing (nil) #268

Closed q4a closed 3 years ago

q4a commented 3 years ago

Hi. I'm getting a crash in Zoom on camera access (when opening the video settings or starting a new meeting with the camera). I'm using Armbian with XFCE, Box86 with Dynarec v0.1.3 4f06cfa0 built on Oct 6 2020 15:10:19, Zoom 5.3.469451.0927 (from zoom_i686.tar.xz) and an old Genius Facecam 3000. The camera works on my SBC (ASUS Tinker Board) with Cheese. Some info about the camera:

$ v4l2-ctl --list-devices
rockchip,rk3288-vpu-enc (platform: hantro-vpu):
    /dev/video3
    /dev/video4
    /dev/media1

rockchip-rga (platform:rga):
    /dev/video2

FaceCam 3000: FaceCam 3000 (usb-ff540000.usb-1.1):
    /dev/video0
    /dev/video1
    /dev/media0

$ lsusb | grep 0458:707a
Bus 001 Device 003: ID 0458:707a KYE Systems Corp. (Mouse Systems) USB2.0 Hub

https://linux-hardware.org/index.php?id=usb:0458-707a

I ran Zoom with both box86 zoom and BOX86_DLSYM_ERROR=1 BOX86_TRACE_FILE=trace.txt box86 zoom; the crash is the same either way.

$ BOX86_DLSYM_ERROR=1 BOX86_TRACE_FILE=trace.txt box86 zoom
BOX86 Trace redirected to "trace.txt"
Box86 with Dynarec v0.1.3 4f06cfa0 built on Oct  6 2020 15:10:19
zoom started.
Client: Breakpad is using Single Client Mode! client fd = -1
[CZPClientLogMgr::LogClientEnvironment] [MacAddr: 2C:4D:54:43:33:FF][client: Linux][OS: Armbian 20.11 Focal][Hardware: CPU Core:4 Frenquency:1.8 G Memory size:2001MB CPU Brand:              Intel(R) Pentium(R) 4 CPU 1800MHz GPU Brand:][Req ID: ]
Linux Client Version is 5.3.469451.0927
QSG_RENDER_LOOP is 
XDG_CURRENT_DESKTOP = XFCE;   GDMSESSION = xfce
Graphics Card Info:: 
Zoom package arch is 32bit, runing OS arch is i386
AppIconMgr::systemDesktopName log Desktop Name: xfce 
link image0 hasn't been detected!
Could not resolve property : pattern0
Error: Send error, 22 Invalid argument
Error: Send error, 22 Invalid argument
Error: Send error, 22 Invalid argument
Segmentation fault

trace.txt is attached; its last lines:

dlopen: New handle 0x1d (/home/q/soft/zoom/libturbojpeg.so), dlopened=1
Call to dlsym(0x1d, TJBUFSIZEYUV) :0x5af474a0
Call to dlsym(0x1d, TJBUFSIZE) :0x5af474e0
Call to dlsym(0x1d, tjInitDecompress) :0x5af484e0
Call to dlsym(0x1d, tjInitCompress) :0x5af482c0
Call to dlsym(0x1d, tjDecompressHeader2) :0x5af47890
Call to dlsym(0x1d, tjDecompressToYUV) :0x5af4dc00
Call to dlsym(0x1d, tjCompressFromYUVPlanes) :0x5af4b130
Call to dlsym(0x1d, tjDestroy) :0x5af4d290
Call to dlsym(0x1d, tjGetErrorStr) :0x5af46f30
7856|SIGSEGV @0x5914253c (???) (x86pc=0x5af77957//home/q/soft/zoom/libturbojpeg.so:"???"), for accessing (nil) (code=1), db=0x684a5408(0x5af77957/???)
7856|Double SIGSEGV!

I know this may be hard to fix, but I can build box86 from source with patches for testing, or collect additional logs or a backtrace.

q4a commented 3 years ago

Building a new version was very easy, so I tested current git master: same error. Run command:

$ BOX86_DLSYM_ERROR=1 BOX86_TRACE_FILE=trace3.txt /home/q/git/box86/build/box86 zoom

trace3.txt is attached; its last lines:

Call to dlopen("/home/q/soft/zoom/libturbojpeg.so"/0xb684d8c0, 2)
dlopen: New handle 0x1d (/home/q/soft/zoom/libturbojpeg.so), dlopened=1
Call to dlsym(0x1d, TJBUFSIZEYUV) :0x772714a0
Call to dlsym(0x1d, TJBUFSIZE) :0x772714e0
Call to dlsym(0x1d, tjInitDecompress) :0x772724e0
Call to dlsym(0x1d, tjInitCompress) :0x772722c0
Call to dlsym(0x1d, tjDecompressHeader2) :0x77271890
Call to dlsym(0x1d, tjDecompressToYUV) :0x77277c00
Call to dlsym(0x1d, tjCompressFromYUVPlanes) :0x77275130
Call to dlsym(0x1d, tjDestroy) :0x77277290
Call to dlsym(0x1d, tjGetErrorStr) :0x77270f30
10850|SIGSEGV @0x72d5aaa4 (???(0x72d5aaa4)) (x86pc=0x772a1966//home/q/soft/zoom/libturbojpeg.so:"???", esp=0x776a6a15), for accessing (nil) (code=1), db=0x72dbbc60(0x72d5aa68:0x72d5ab48/0x772a1957:0x772a1976/???)
10850|Double SIGSEGV!
10850|Double SIGSEGV!
10850|Double SIGSEGV!
----------------------- many times -----------------------
10850|Double SIGSEGV!
10850|Double SIGSEGV!
10850|Double SIGSEGV!

The line 10850|Double SIGSEGV! repeats endlessly; trace3.txt grew to more than 50 MB, so I trimmed it.
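A trace dominated by one repeated line can be shrunk without losing information by collapsing consecutive duplicates. The following is a minimal sketch, not a box86 feature; collapse_repeats is a hypothetical helper name, and it assumes the repeated lines are byte-for-byte identical, as they are in the excerpt above.

```shell
# collapse_repeats: read a trace on stdin and replace each run of identical
# consecutive lines with one copy plus a repeat count.
# (Hypothetical helper for trimming traces; not part of box86.)
collapse_repeats() {
    awk '
        # Same line as the previous one: just count it.
        NR > 1 && $0 == prev { count++; next }
        # A different line arrived: flush the previous run.
        NR > 1 { emit() }
        # Start a new run with the current line.
        { prev = $0; count = 1 }
        # Flush the last run, unless the input was empty.
        END { if (NR > 0) emit() }
        function emit() {
            if (count > 1) printf "%s  [repeated %d times]\n", prev, count
            else print prev
        }
    '
}

# Usage (file names are examples):
#   collapse_repeats < trace3.txt > trace3.short.txt
```

Only adjacent duplicates are merged, so the order of events in the trace is preserved.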

ptitSeb commented 3 years ago

That libturbojpeg is fully emulated, so it's not an issue with a wrapper. box86 now defaults to no log, so can you please redo the trace with BOX86_LOG=1 BOX86_DLSYM_ERROR=1 BOX86_TRACE_FILE=trace3.txt /home/q/git/box86/build/box86 zoom to get a bit more info? Which version of Zoom are you using? On my copy, I have this md5sum: c6a510f396d8e3a0bc1e05f89d0684d9 libturbojpeg.so.0.1.0

q4a commented 3 years ago

Here is the log from running BOX86_LOG=1 BOX86_DLSYM_ERROR=1 BOX86_TRACE_FILE=trace3.txt /home/q/git/box86/build/box86 zoom: trace3.txt

libturbojpeg.so looks the same:

$ file libturbojpeg.so
libturbojpeg.so: ELF 32-bit LSB shared object, Intel 80386, version 1 (SYSV), dynamically linked, BuildID[sha1]=d58e9fafde9ac5f753b26cf343b4bbbd9d9e6a75, not stripped
$ md5sum libturbojpeg.so
c6a510f396d8e3a0bc1e05f89d0684d9  libturbojpeg.so

ptitSeb commented 3 years ago

Ok, I'll see what I can find in that libturbojpeg.

ptitSeb commented 3 years ago

So I checked in the lib, and this x86pc makes no sense. I can understand why it crashes (because of this offset!), but I don't understand why it's there. Can you run with BOX86_LOG=2 instead of 1? This will make the trace significantly larger, but hopefully I will see which function of that lib it tries to use, so I can check why it crashes.

q4a commented 3 years ago

I tried running BOX86_LOG=2 BOX86_DLSYM_ERROR=1 BOX86_TRACE_FILE=trace4.txt /home/q/git/box86/build/box86 zoom a few times, but Zoom closes on the login screen, not on camera access. Anyway, here is the log. Maybe there are other log options?

https://github.com/q4a/box86/releases/download/trace4.txt.tar.xz/trace4.txt.tar.xz - 12 MB compressed; the uncompressed text is 304 MB.

ptitSeb commented 3 years ago

Yeah, that's too much logging: it slows things down too much and causes timeouts...

I have a solution to generate a detailed log only when libturbojpeg is loaded: you need to (temporarily) hack the box86 sources. Edit src/wrapped/wrappedlibdl.c and, at line 69, just after char* rfilename = (char*)filename; add this line:

if(strstr(rfilename, "libturbojpeg.so")) box86_log=3;

This will set the log level to max once libturbojpeg.so is requested for loading. So you can leave BOX86_LOG=1 with this change; the log will become detailed just at the right time.
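Since the interesting part of such a trace starts where libturbojpeg.so is first mentioned, it can help to read the file from that point onward. A minimal sketch of such a filter follows; trace_from is a hypothetical helper name, not a box86 tool, and it matches the text literally rather than as a regular expression.

```shell
# trace_from: print stdin starting at the first line that contains the
# given text, and everything after it.
# (Hypothetical helper for reading large traces; not part of box86.)
trace_from() {
    # index() does a plain substring match, so "." in the needle is literal.
    awk -v needle="$1" 'found || index($0, needle) { found = 1; print }'
}

# Usage (file name is an example):
#   trace_from 'libturbojpeg.so' < trace5.txt | less
```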

q4a commented 3 years ago

Looks like it helps :) I got the crash on camera access. Here is my trace5.txt and my console output (just in case):

$ BOX86_LOG=1 BOX86_DLSYM_ERROR=1 BOX86_TRACE_FILE=trace5.txt /home/q/git/box86/build/box86 zoom
Debug level is 1
BOX86 Trace redirected to "trace5.txt"
Box86 with Dynarec v0.1.7 3c166c43 built on Dec 14 2020 17:46:15
zoom started.
Client: Breakpad is using Single Client Mode! client fd = -1
[CZPClientLogMgr::LogClientEnvironment] [MacAddr: 2C:4D:54:43:33:FF][client: Linux][OS: Armbian 20.11 Focal][Hardware: CPU Core:4 Frenquency:1.8 G Memory size:2001MB CPU Brand:              Intel(R) Pentium(R) 4 CPU 1800MHz GPU Brand:][Req ID: ]
Linux Client Version is 5.3.469451.0927
QSG_RENDER_LOOP is 
XDG_CURRENT_DESKTOP = XFCE;   GDMSESSION = xfce
Graphics Card Info:: 
Zoom package arch is 32bit, runing OS arch is i386
Error: Send error, 22 Invalid argument
Error: Send error, 22 Invalid argument
Error: Send error, 22 Invalid argument
AppIconMgr::systemDesktopName log Desktop Name: xfce 
link image0 hasn't been detected!
Could not resolve property : pattern0
box86: pthread_mutex_lock.c:81: __pthread_mutex_lock: Assertion `mutex->__data.__owner == 0' failed.
Aborted

ptitSeb commented 3 years ago

Ok, I think I have enough information to try to understand what is going on, on my side. But for this issue, I'll go with a "PLAN B": I'll wrap libturbojpeg.so.0 in box86, so the native ARM version will be used, which will also help performance. I'll push that change soon; hopefully it will make Zoom work with your camera, without much performance loss.

ptitSeb commented 3 years ago

(you may need to sudo apt install libturbojpeg0 to have that lib)

q4a commented 3 years ago

I installed libturbojpeg:

$ file /usr/lib/arm-linux-gnueabihf/libturbojpeg.so.0.2.0 
/usr/lib/arm-linux-gnueabihf/libturbojpeg.so.0.2.0: ELF 32-bit LSB shared object, ARM, EABI5 version 1 (SYSV), dynamically linked, BuildID[sha1]=3274393099a864b13ba7b3f8b1353125b576db05, stripped
$ file /usr/lib/arm-linux-gnueabihf/libturbojpeg.so.0
/usr/lib/arm-linux-gnueabihf/libturbojpeg.so.0: symbolic link to libturbojpeg.so.0.2.0

I built the latest source (Box86 with Dynarec v0.1.7 bed0f24f built on Dec 14 2020 20:02:23) with the if(strstr(rfilename, "libturbojpeg.so")) box86_log=3; hack applied, and got this line:

Using emulated /home/q/soft/zoom/libturbojpeg.so

But I still get the same error: trace6.txt

ptitSeb commented 3 years ago

Oh, yes, because the program does dlopen("/home/q/soft/zoom/libturbojpeg.so", 2); and specifies the full path, so it loads exactly this lib. Maybe I can do a workaround: detect when "zoom" is being run and force loading a generic version from the system in that case.
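The distinction at play here mirrors the rule documented for dlopen(3): a name containing a slash is treated as a pathname and opened as given, while a bare name goes through the library search. The sketch below only illustrates that rule; dlopen_lookup_kind is a hypothetical helper, and box86's own lookup code is not shown in this thread.

```shell
# dlopen_lookup_kind: classify a library name the way dlopen(3) documents it.
#   "path"   - the name contains a slash, so that exact file is opened
#   "search" - a bare name, resolved through the library search path
# (Hypothetical helper for illustration only; not box86 code.)
dlopen_lookup_kind() {
    case "$1" in
        */*) echo "path" ;;
        *)   echo "search" ;;
    esac
}

# Usage:
#   dlopen_lookup_kind /home/q/soft/zoom/libturbojpeg.so   # prints: path
#   dlopen_lookup_kind libturbojpeg.so.0                   # prints: search
```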

q4a commented 3 years ago

Thanks a lot for your time and support. Best wishes to you and Box86!

ptitSeb commented 3 years ago

Thanks :)

Note that you'll have to remove the trace hack this time, before applying the next patch.

q4a commented 3 years ago

I'm no longer getting an error in libturbojpeg.so, just a Segmentation fault on webcam access:

$ BOX86_LOG=1 BOX86_DLSYM_ERROR=1 BOX86_TRACE_FILE=trace7.txt /home/q/git/box86/build/box86 zoom
Debug level is 1
BOX86 Trace redirected to "trace7.txt"
Box86 with Dynarec v0.1.7 2004006a built on Dec 15 2020 03:09:30
zoom started.
Client: Breakpad is using Single Client Mode! client fd = -1
[CZPClientLogMgr::LogClientEnvironment] [MacAddr: 2C:4D:54:43:33:FF][client: Linux][OS: Armbian 20.11 Focal][Hardware: CPU Core:4 Frenquency:1.8 G Memory size:2001MB CPU Brand:              Intel(R) Pentium(R) 4 CPU 1800MHz GPU Brand:][Req ID: ]
Linux Client Version is 5.3.469451.0927
QSG_RENDER_LOOP is 
XDG_CURRENT_DESKTOP = XFCE;   GDMSESSION = xfce
Graphics Card Info:: 
Zoom package arch is 32bit, runing OS arch is i386
Error: Send error, 22 Invalid argument
Error: Send error, 22 Invalid argument
Error: Send error, 22 Invalid argument
AppIconMgr::systemDesktopName log Desktop Name: xfce 
link image0 hasn't been detected!
Could not resolve property : pattern0
Segmentation fault

trace7.txt

ptitSeb commented 3 years ago

This last commit should fix the Segfault. Hopefully it works now.

q4a commented 3 years ago

With Box86 with Dynarec v0.1.7 367df089 built on Dec 15 2020 09:06:28, the webcam worked for 3-4 seconds and I saw my hand :) But then it hit a Segmentation fault: trace8.txt

ptitSeb commented 3 years ago

I don't see any typical box86 SIGSEGV message in the log. Is that normal?

q4a commented 3 years ago

I'm getting the Segmentation fault in the regular console output:

$ BOX86_LOG=1 BOX86_DLSYM_ERROR=1 BOX86_TRACE_FILE=trace8.txt /home/q/git/box86/build/box86 zoom
Debug level is 1
BOX86 Trace redirected to "trace8.txt"
Box86 with Dynarec v0.1.7 367df089 built on Dec 15 2020 09:06:28
zoom started.
Client: Breakpad is using Single Client Mode! client fd = -1
[CZPClientLogMgr::LogClientEnvironment] [MacAddr: 2C:4D:54:43:33:FF][client: Linux][OS: Armbian 20.11.3 Focal][Hardware: CPU Core:4 Frenquency:1.8 G Memory size:2001MB CPU Brand:              Intel(R) Pentium(R) 4 CPU 1800MHz GPU Brand:][Req ID: ]
Linux Client Version is 5.3.469451.0927
QSG_RENDER_LOOP is 
XDG_CURRENT_DESKTOP = XFCE;   GDMSESSION = xfce
Graphics Card Info:: 
Zoom package arch is 32bit, runing OS arch is i386
Error: Send error, 22 Invalid argument
Error: Send error, 22 Invalid argument
Error: Send error, 22 Invalid argument
AppIconMgr::systemDesktopName log Desktop Name: xfce 
link image0 hasn't been detected!
Could not resolve property : pattern0
Segmentation fault

I'll check a newer version of Zoom and test another webcam now.

ptitSeb commented 3 years ago

A Segfault like that looks like an infinite call loop, but I don't know where it could come from.

Can you try to run under gdb? Use:

gdb --args /home/q/git/box86/build/box86 zoom

(no need for the logs), then use r to run and let it crash. gdb should give you the current function / position where the crash occurs; I'm already interested in that. Then try bt to get the backtrace, although if it's an infinite call loop, it may not be very useful. (Use q to quit gdb.)
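The same r / bt sequence can also be run non-interactively, which makes the output easier to save and attach. This is a minimal sketch using gdb's standard -batch, -ex and --args options; gdb_backtrace and the DRY_RUN switch are hypothetical names introduced here. It only watches the process gdb starts, so a crash in a child process may not be caught.

```shell
# gdb_backtrace: run a command under gdb, then print a backtrace when it stops.
# Set DRY_RUN=1 to print the gdb command line instead of executing it.
# (Hypothetical wrapper; -batch, -ex and --args are standard gdb options.)
gdb_backtrace() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        printf 'gdb -batch -ex run -ex bt --args'
        printf ' %s' "$@"
        printf '\n'
        return 0
    fi
    # -batch exits gdb once the -ex commands have been executed.
    gdb -batch -ex run -ex bt --args "$@"
}

# Usage:
#   gdb_backtrace /home/q/git/box86/build/box86 zoom 2>&1 | tee gdb.txt
```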

q4a commented 3 years ago

Maybe I have two left hands, but gdb --args /home/q/git/box86/build/box86 ./zoom gives me nothing:

q@tinkerboard:~/soft/zoom$ gdb --args /home/q/git/box86/build/box86 ./zoom
GNU gdb (Ubuntu 9.2-0ubuntu1~20.04) 9.2
Copyright (C) 2020 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "arm-linux-gnueabihf".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /home/q/git/box86/build/box86...
(gdb) r
Starting program: /home/q/git/box86/build/box86 ./zoom

But I can attach gdb to the running box86.

I got a lot of SIGSEGVs in gdb, in different child processes. There are 2 different behaviors:

1 - When I use bt and then c on each SIGSEGV, I can't get the webcam working. Here is the log: box86-2.txt. Here is the screen recording: https://www.youtube.com/watch?v=XfXEwgwtIo8

2 - When I use only c, the webcam starts working, and after that I used bt + c on each SIGSEGV. Here is the log: box86-3.txt. Here is the screen recording: https://www.youtube.com/watch?v=F9ZMzVTgwAU

I'm not sure whether those logs help. Also, I can't download the latest version of Zoom ("Access Denied"): https://zoom.us/client/latest/zoom_i686.tar.xz

I posted on the Zoom Dev Forum and hope that helps: https://devforum.zoom.us/t/cant-download-zoom-i686-tar-xz/38421

ptitSeb commented 3 years ago

Ok, that seems to be enough. It looks like an issue with PulseAudio. I'll try to reproduce it on my side.

ptitSeb commented 3 years ago

Ok, it seems good now. Try on your side; for me, it all worked.

q4a commented 3 years ago

I tested 3a5c6751 with these webcams:
https://linux-hardware.org/index.php?id=usb:0458-707a
https://linux-hardware.org/index.php?id=usb:046d-08ae

Both are working now. Thanks a lot!