Popax21 / synaTudor

GNU Lesser General Public License v2.1

Exit Code 31 after suspend #6

Closed: idoybh closed this issue 2 years ago

idoybh commented 2 years ago

First of all, thank you for your work on this. It works flawlessly most of the time. The only problem I'm facing is that fprintd crashes after suspend with exit code 31:

fprintd[3191]: g_task_return_boolean: assertion 'G_IS_TASK (task)' failed
fprintd[3191]: Tudor host process died! Exit Code 31

Edit: Sometimes it also dies with:

fprintd[6835]: g_task_return_boolean: assertion 'G_IS_TASK (task)' failed
fprintd[6835]: Tudor host process died! Exit Code 159
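
A note on the two numbers, offered as one possible reading rather than something confirmed in this thread: shells conventionally report a process killed by signal N as exit status 128 + N, so 159 would correspond to signal 31, and on x86-64 Linux signal 31 is SIGSYS (the signal a seccomp filter typically raises on a disallowed syscall). Whether the bare "Exit Code 31" is the same signal reported without the offset is an assumption. The helper name decode_status below is made up for illustration.

```shell
#!/bin/sh
# decode_status is a hypothetical helper, not part of synaTudor or fprintd.
# Convention: a status above 128 means "terminated by signal (status - 128)".
decode_status() {
    status=$1
    if [ "$status" -gt 128 ]; then
        echo "signal $((status - 128))"
    else
        echo "exit $status"
    fi
}

decode_status 159   # prints "signal 31"
decode_status 31    # prints "exit 31"
```

The signal number to name mapping is platform specific, so the sketch only prints the number.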

Sometimes it also produces a coredump:

Process 9964 (fprintd) of user 0 dumped core.
Module linux-vdso.so.1 with build-id d3c210522f2d619f9e3c2d7196aaa6ec2037e14b
Module libgpg-error.so.0 with build-id 4738b8a9478177c202cccd64e0eb65d3dea2bfae
Module libatomic.so.1 with build-id 7c1907c77fdfa87da0a195b2bcb1db37802b590c
Module liblz4.so.1 with build-id e63600ab23b2f6997f42fac2fa56e1f02ce159a1
Module libzstd.so.1 with build-id ab54c2881f53ab314e134f3e08c76d504376dd5d
Module liblzma.so.5 with build-id 28b40c7af8098a66af6ee093b6986b91cad7694d
Module libgcrypt.so.20 with build-id 8bf3cb884124273640de797a3e77d86c98434ea4
Module libcap.so.2 with build-id 1f87347b85b55db2f75a2ecea5cb45d846dc7093
Module libudev.so.1 with build-id 2a20e10475325f65fa29687073270e417e20a984
Module libnspr4.so with build-id e54087b0e38cf8936f6c458e922589406343b05d
Module libplc4.so with build-id 2f3f90e73b1f4fc3424689615d6d32f7427bb783
Module libplds4.so with build-id 2b49ec2f43d2e5fc2e56f35bcd094968f9bf36e6
Module libnssutil3.so with build-id 36c34a5ab8de68047e7886fe56da7ccf6722f465
Module libusb-1.0.so.0 with build-id 98d3b69176fc359d2c574759c2064f83e5cdf8ed
Module libblkid.so.1 with build-id 140694a62d8d4d07c6c320a501f948dd1b389d73
Module libpthread.so.0 with build-id 95ae4f30a6f12ccbff645d30f8e1a3ee23ec7d36
Module ld-linux-x86-64.so.2 with build-id 0effd0e43efa4468d3c31871c93af0b7f3005673
Module libsystemd.so.0 with build-id 3e5d9eb54ba96616b0f90c9b897f04fd126527de
Module libgudev-1.0.so.0 with build-id 460465b63a086d945124662363191903c0002793
Module libnss3.so with build-id 87abb242025e6e243d6974b3a9da26166884533a
Module libpixman-1.so.0 with build-id d2170a3ac106c2a68597bf7910ab04b1cdd69c14
Module libm.so.6 with build-id 1b7296ef9fd806e47060788389293c824b09ad72
Module libgusb.so.2 with build-id 1697140c599d0977dc94f5777c7dba48d8c1b16d
Module libffi.so.8 with build-id f0a9586cf0f42d2b9971bd1065ca3a6b19f4a2c2
Module libmount.so.1 with build-id 4436aeea0cd8c01b5a77969e0531184f8b3513ce
Module libz.so.1 with build-id fefe3219a96d682ec98fcfb78866b8594298b5a2
Module libpcre.so.1 with build-id 845483dd0acba86de9f0313102bebbaf3ce52767
Module libc.so.6 with build-id 60df1df31f02a7b23da83e8ef923359885b81492
Module libgcc_s.so.1 with build-id 0e3de903950e35ae59a5de8c00b1817a4a71ca01
Module libpolkit-gobject-1.so.0 with build-id 42b81fbd4311562066a33daac95868a4b20105ae
Module libfprint-2.so.2 with build-id afac3f96db72d486626bc72992302a7914a81773
Module libgmodule-2.0.so.0 with build-id abd986222e2cf12fc7324cb0182dfc2c8f2269c0
Module libgobject-2.0.so.0 with build-id a7515bd8cd51064d187953c0f506a43958de31a6
Module libgio-2.0.so.0 with build-id 7a769ec24a9a705d04ee0297730032f70ed0835b
Module libglib-2.0.so.0 with build-id 1340f3a762b2293ebf6d725edf0eb14839f85317
Module fprintd with build-id 381105c0d1b405b659d94e00992e74bb37386a3e
Stack trace of thread 9964:
#0  0x00007fc9258f1df1 n/a (n/a + 0x0)
#1  0x00007fc9258f21af n/a (n/a + 0x0)
#2  0x00007fc922040ef5 n/a (libc.so.6 + 0x40ef5)
#3  0x00007fc922041070 exit (libc.so.6 + 0x41070)
#4  0x00007fc922029297 n/a (libc.so.6 + 0x29297)
#5  0x00007fc92202934a __libc_start_main (libc.so.6 + 0x2934a)
#6  0x000055abee77d595 _start (fprintd + 0x9595)

Stack trace of thread 9965:
#0  0x00007fc922105c3f __poll (libc.so.6 + 0x105c3f)
#1  0x00007fc925827f68 n/a (libglib-2.0.so.0 + 0xaaf68)
#2  0x00007fc9257cf392 g_main_context_iteration (libglib-2.0.so.0 + 0x52392)
#3  0x00007fc9257cf3e2 n/a (libglib-2.0.so.0 + 0x523e2)
#4  0x00007fc925801405 n/a (libglib-2.0.so.0 + 0x84405)
#5  0x00007fc92208c54d n/a (libc.so.6 + 0x8c54d)
#6  0x00007fc922111874 __clone (libc.so.6 + 0x111874)

Stack trace of thread 9966:
#0  0x00007fc922105c3f __poll (libc.so.6 + 0x105c3f)
#1  0x00007fc925827f68 n/a (libglib-2.0.so.0 + 0xaaf68)
#2  0x00007fc9257d11cf g_main_loop_run (libglib-2.0.so.0 + 0x541cf)
#3  0x00007fc9256baacc n/a (libgio-2.0.so.0 + 0x108acc)
#4  0x00007fc925801405 n/a (libglib-2.0.so.0 + 0x84405)
#5  0x00007fc92208c54d n/a (libc.so.6 + 0x8c54d)
#6  0x00007fc922111874 __clone (libc.so.6 + 0x111874)
ELF object binary architecture: AMD x86-64

relevant file in fprintd: https://gitlab.freedesktop.org/libfprint/fprintd/-/blob/master/src/device.c#L492

I'm on Arch Linux, KDE, X11, systemd. Kernel: 5.18.12-arch1-1. Compiled this on: fde866fb6b612b013b2ee7afc059f88125ee13d5. For libfprint I'm using libfprint-tod-git 1.94.3+tod1-1 from the AUR. Let me know if any additional information is needed. I'm not sure whether it's my DE's fault or the driver's. As a workaround, I've made a systemd service that restarts fprintd after resuming from suspend.
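
A workaround of this kind could look roughly like the following. This is a sketch, not the poster's actual unit: it assumes a systemd system-sleep hook script (an executable placed in /usr/lib/systemd/system-sleep/, which systemd calls with "pre" or "post" as its first argument) instead of a service, the file name restart-fprintd is hypothetical, and SYSTEMCTL is an override variable invented here to allow a dry run.

```shell
#!/bin/sh
# Hypothetical hook, e.g. /usr/lib/systemd/system-sleep/restart-fprintd
# systemd invokes it as: <script> pre|post suspend|hibernate|...
on_sleep_event() {
    phase=$1
    case "$phase" in
        post)
            # After resume: restart fprintd so it starts with a fresh driver host.
            ${SYSTEMCTL:-systemctl} restart fprintd.service
            ;;
    esac
}

on_sleep_event "$@"
```

Running it with SYSTEMCTL=echo and the arguments "post suspend" prints the command instead of executing it.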

Popax21 commented 2 years ago

Can you send the tudor-host-launcher's log? Also, you might want to try to run sudo coredumpctl debug tudor_host and then bt to see a backtrace if it SEGFAULTed (might want to turn off the UNMOUNTFS build option for that).
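
For reference, the two interactive steps above can also be run in one go. A sketch, assuming gdb is the debugger and a reasonably recent systemd whose coredumpctl supports --debugger-arguments; the function name collect_backtrace and the COREDUMPCTL override are made up for illustration:

```shell
#!/bin/sh
# collect_backtrace is a hypothetical wrapper. COREDUMPCTL is an invented
# override so the command can be dry-run (e.g. COREDUMPCTL=echo).
collect_backtrace() {
    exe=${1:-tudor_host}
    # -batch makes gdb exit after running the command given with -ex.
    ${COREDUMPCTL:-coredumpctl} debug "$exe" --debugger-arguments="-batch -ex bt"
}

# Usage: collect_backtrace tudor_host   (run as root to read system coredumps)
```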

idoybh commented 2 years ago

full since boot tudor-host-launcher log

The backtrace:

#0  0x00007fcf78105c3f in poll () from /usr/lib/libc.so.6
#1  0x00007fcf787f5757 in ?? () from /usr/lib/libusb-1.0.so.0
#2  0x00007fcf787f7438 in libusb_handle_events_timeout_completed () from /usr/lib/libusb-1.0.so.0
#3  0x00007fcf787f74b4 in libusb_handle_events () from /usr/lib/libusb-1.0.so.0
#4  0x000055c6cd005697 in usb_thread_func ()
#5  0x00007fcf7808c54d in ?? () from /usr/lib/libc.so.6
#6  0x00007fcf78111874 in clone () from /usr/lib/libc.so.6

idoybh commented 2 years ago

> might want to turn off the UNMOUNTFS build option for that

And indeed on a build with UNMOUNTFS disabled it seems to only crash with exit code 31 so far. No more core dumps.

Popax21 commented 2 years ago

> > might want to turn off the UNMOUNTFS build option for that

> And indeed on a build with UNMOUNTFS disabled it seems to only crash with exit code 31 so far. No more core dumps.

That only makes debugging easier, it doesn't change the program's behavior at all.

idoybh commented 2 years ago

> That only makes debugging easier, it doesn't change the program's behavior at all.

Oh, I see. Sorry. Maybe it was a fluke then.

Popax21 commented 2 years ago

Also, can you post the full coredumpctl output?

idoybh commented 2 years ago

> Also, can you post the full coredumpctl output?

Module linux-vdso.so.1 with build-id d3c210522f2d619f9e3c2d7196aaa6ec2037e14b
Module libgcc_s.so.1 with build-id 0e3de903950e35ae59a5de8c00b1817a4a71ca01
Module libatomic.so.1 with build-id 7c1907c77fdfa87da0a195b2bcb1db37802b590c
Module libudev.so.1 with build-id 2a20e10475325f65fa29687073270e417e20a984
Module ld-linux-x86-64.so.2 with build-id 0effd0e43efa4468d3c31871c93af0b7f3005673
Module libc.so.6 with build-id 60df1df31f02a7b23da83e8ef923359885b81492
Module libseccomp.so.2 with build-id 898097fd5a5d3469526739a834457be46bebedc4
Module libcap.so.2 with build-id 1f87347b85b55db2f75a2ecea5cb45d846dc7093
Module libusb-1.0.so.0 with build-id 98d3b69176fc359d2c574759c2064f83e5cdf8ed
Module libcrypto.so.1.1 with build-id 7981ea3d69f3c28e46ee312a815af96eab93775c
Module libtudor.so with build-id a5f875fe5c8fae3d1c1a391de8c8fe9c1edef6a6
Module tudor_host with build-id 63a86633e8a022420c585ac3c4460b64912f7199
Stack trace of thread 2329:
#0  0x00007fbc26105c3f __poll (libc.so.6 + 0x105c3f)
#1  0x00007fbc2687e757 n/a (libusb-1.0.so.0 + 0x10757)
#2  0x00007fbc26880438 libusb_handle_events_timeout_completed (libusb-1.0.so.0 + 0x12438)
#3  0x00007fbc268804b4 libusb_handle_events (libusb-1.0.so.0 + 0x124b4)
#4  0x00005633de740667 usb_thread_func (tudor_host + 0x2667)
#5  0x00007fbc2608c54d n/a (libc.so.6 + 0x8c54d)
#6  0x00007fbc26111874 __clone (libc.so.6 + 0x111874)

Stack trace of thread 2331:
#0  0x00007fbc26089119 n/a (libc.so.6 + 0x89119)
#1  0x00007fbc2608b920 pthread_cond_wait (libc.so.6 + 0x8b920)
#2  0x00007fbc268e562d evt_wait (libtudor.so + 0x1762d)
#3  0x00007fbc268e4913 win_wait_sync_obj (libtudor.so + 0x16913)
#4  0x00007fbc268e6164 WaitForSingleObject (libtudor.so + 0x18164)
#5  0x00007fbc262e5cf0 n/a (n/a + 0x0)
#6  0x00007fbc2608c54d n/a (libc.so.6 + 0x8c54d)
#7  0x00007fbc26111874 __clone (libc.so.6 + 0x111874)

Stack trace of thread 2328:
#0  0x00007fbc261135e7 recvmsg (libc.so.6 + 0x1135e7)
#1  0x00005633de745077 ipc_peek_msg (tudor_host + 0x7077)
#2  0x00005633de747393 run_handler_loop (tudor_host + 0x9393)
#3  0x00005633de741619 main (tudor_host + 0x3619)
#4  0x00007fbc26029290 n/a (libc.so.6 + 0x29290)
#5  0x00007fbc2602934a __libc_start_main (libc.so.6 + 0x2934a)
#6  0x00005633de740575 _start (tudor_host + 0x2575)

Stack trace of thread 2333:
#0  0x00007fbc26089119 n/a (libc.so.6 + 0x89119)
#1  0x00007fbc2608b920 pthread_cond_wait (libc.so.6 + 0x8b920)
#2  0x00007fbc268e7d07 ConnectNamedPipe (libtudor.so + 0x19d07)
#3  0x00007fbc262ae500 n/a (n/a + 0x0)
#4  0x00005633ded13894 n/a (n/a + 0x0)
#5  0xc1a5195688a5ec35 n/a (n/a + 0x0)
ELF object binary architecture: AMD x86-64

I can't reproduce the coredump reliably, though. I've rebooted; I'll suspend and resume a number of times and share the full logs.

idoybh commented 2 years ago

fprintd.log tudor.log

the last coredump:

Module linux-vdso.so.1 with build-id d3c210522f2d619f9e3c2d7196aaa6ec2037e14b
Module libgcc_s.so.1 with build-id 0e3de903950e35ae59a5de8c00b1817a4a71ca01
Module libatomic.so.1 with build-id 7c1907c77fdfa87da0a195b2bcb1db37802b590c
Module libudev.so.1 with build-id 2a20e10475325f65fa29687073270e417e20a984
Module ld-linux-x86-64.so.2 with build-id 0effd0e43efa4468d3c31871c93af0b7f3005673
Module libc.so.6 with build-id 60df1df31f02a7b23da83e8ef923359885b81492
Module libseccomp.so.2 with build-id 898097fd5a5d3469526739a834457be46bebedc4
Module libcap.so.2 with build-id 1f87347b85b55db2f75a2ecea5cb45d846dc7093
Module libusb-1.0.so.0 with build-id 98d3b69176fc359d2c574759c2064f83e5cdf8ed
Module libcrypto.so.1.1 with build-id 7981ea3d69f3c28e46ee312a815af96eab93775c
Module libtudor.so with build-id a5f875fe5c8fae3d1c1a391de8c8fe9c1edef6a6
Module tudor_host with build-id 63a86633e8a022420c585ac3c4460b64912f7199
Stack trace of thread 3995:
#0  0x00007fdff2505c3f __poll (libc.so.6 + 0x105c3f)
#1  0x00007fdff2ca8757 n/a (libusb-1.0.so.0 + 0x10757)
#2  0x00007fdff2caa438 libusb_handle_events_timeout_completed (libusb-1.0.so.0 + 0x12438)
#3  0x00007fdff2caa4b4 libusb_handle_events (libusb-1.0.so.0 + 0x124b4)
#4  0x000055a686d05667 usb_thread_func (tudor_host + 0x2667)
#5  0x00007fdff248c54d n/a (libc.so.6 + 0x8c54d)
#6  0x00007fdff2511874 __clone (libc.so.6 + 0x111874)

Stack trace of thread 3994:
#0  0x00007fdff25135e7 recvmsg (libc.so.6 + 0x1135e7)
#1  0x000055a686d0a077 ipc_peek_msg (tudor_host + 0x7077)
#2  0x000055a686d0c393 run_handler_loop (tudor_host + 0x9393)
#3  0x000055a686d06619 main (tudor_host + 0x3619)
#4  0x00007fdff2429290 n/a (libc.so.6 + 0x29290)
#5  0x00007fdff242934a __libc_start_main (libc.so.6 + 0x2934a)
#6  0x000055a686d05575 _start (tudor_host + 0x2575)

Stack trace of thread 3997:
#0  0x00007fdff2489119 n/a (libc.so.6 + 0x89119)
#1  0x00007fdff248b920 pthread_cond_wait (libc.so.6 + 0x8b920)
#2  0x00007fdff2d0f62d evt_wait (libtudor.so + 0x1762d)
#3  0x00007fdff2d0e913 win_wait_sync_obj (libtudor.so + 0x16913)
#4  0x00007fdff2d10164 WaitForSingleObject (libtudor.so + 0x18164)
#5  0x00007fdff26e5cf0 n/a (n/a + 0x0)
#6  0x00007fdff248c54d n/a (libc.so.6 + 0x8c54d)
#7  0x00007fdff2511874 __clone (libc.so.6 + 0x111874)

Stack trace of thread 3999:
#0  0x00007fdff2489119 n/a (libc.so.6 + 0x89119)
#1  0x00007fdff248b920 pthread_cond_wait (libc.so.6 + 0x8b920)
#2  0x00007fdff2d11d07 ConnectNamedPipe (libtudor.so + 0x19d07)
#3  0x00007fdff26ae500 n/a (n/a + 0x0)
#4  0x000055a688b8a894 n/a (n/a + 0x0)
#5  0xc1a5195688a5ec35 n/a (n/a + 0x0)
ELF object binary architecture: AMD x86-64

backtrace:

#0  0x00007fdff2505c3f in poll () from /usr/lib/libc.so.6
#1  0x00007fdff2ca8757 in ?? () from /usr/lib/libusb-1.0.so.0
#2  0x00007fdff2caa438 in libusb_handle_events_timeout_completed () from /usr/lib/libusb-1.0.so.0
#3  0x00007fdff2caa4b4 in libusb_handle_events () from /usr/lib/libusb-1.0.so.0
#4  0x000055a686d05667 in usb_thread_func ()
#5  0x00007fdff248c54d in ?? () from /usr/lib/libc.so.6
#6  0x00007fdff2511874 in clone () from /usr/lib/libc.so.6

Popax21 commented 2 years ago

Sorry for not clarifying, but there should be a small bit at the start containing info such as what caused the core dump.

idoybh commented 2 years ago

> Sorry for not clarifying, but there should be a small bit at the start containing info such as what caused the core dump.

Do you mean in coredumpctl or systemd logs?

Popax21 commented 2 years ago

> > Sorry for not clarifying, but there should be a small bit at the start containing info such as what caused the core dump.

> Do you mean in coredumpctl or systemd logs?

In coredumpctl.
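
The header in question is what coredumpctl info prints before the module list. A small sketch for pulling out just those lines, assuming the usual right-aligned "Field: value" layout of that output; the function name crash_summary and the choice of fields are illustrative:

```shell
#!/bin/sh
# crash_summary is a hypothetical filter. It reads `coredumpctl info` output on
# stdin and keeps only a few header fields, including the terminating signal.
crash_summary() {
    grep -E '^[[:space:]]*(PID|Signal|Command Line|Executable):'
}

# Usage: coredumpctl info tudor_host | crash_summary
```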

idoybh commented 2 years ago

> In coredumpctl.

here is the full output: https://katb.in/dikiluqesiv

idoybh commented 2 years ago

After recompiling today I get:

systemd[1]: Starting Tudor host launcher DBus service...
systemd[1]: Started Tudor host launcher DBus service.
tudor_host_launcher[5013]: [INF] Activated sandbox
tudor_host_launcher[5013]: [INF] Received init message - USB device 1-2
tudor_host_launcher[5013]: [INF] Initialized libcrypto
tudor_host_launcher[5013]: [INF] Initialized libusb
tudor_host_launcher[5013]: [WRN] PE file contains unsupported resource data directory!
tudor_host_launcher[5013]: [WRN] PE file contains unsupported exception data directory!
tudor_host_launcher[5013]: [INF] Loaded driver DLL 'synaFpAdapter104.dll' [186656 bytes]
tudor_host_launcher[5013]: [WRN] PE file contains unsupported resource data directory!
tudor_host_launcher[5013]: [WRN] PE file contains unsupported exception data directory!
tudor_host_launcher[5013]: [WRN] Data directory 4 has invalid bounds! [end 0x17ebe0 > image end 0x17e000]
tudor_host_launcher[5013]: [INF] Loaded driver DLL 'synaWudfBioUsb104.dll' [1567712 bytes]
tudor_host_launcher[5013]: [INF] Initializing driver DLL 'synaFpAdapter104.dll'...
tudor_host_launcher[5013]: [INF] Initializing driver DLL 'synaWudfBioUsb104.dll'...
tudor_host_launcher[5013]: [INF] Initialized tudor driver
tudor_host_launcher[5013]: [INF] Opened USB device
tudor_host_launcher[5013]: [WRN] GetModuleHandleExW called with unsupported flag GET_MODULE_HANDLE_EX_FLAG_FROM_ADDRESS! [addr=0x7f907c0ed710]
tudor_host_launcher[5013]: [INF] Getting pairing data for sensor '1D5B87FE505E0000'...
tudor_host_laun[4911]: g_input_stream_read_all: assertion 'buffer != NULL' failed
tudor_host_laun[4911]: g_dbus_method_invocation_return_gerror: assertion 'error != NULL' failed

Should I open a new issue? I saw that someone previously had a similar log. Could it be that the DLLs changed remotely? (I noticed they are downloaded by a shell script at build time.)

Popax21 commented 2 years ago

Just pushed a few more fixes, can you try again?

idoybh commented 2 years ago

> Just pushed a few more fixes, can you try again?

Recompiled; now I get: fprintd.log tudor.log coredump.log

I guess it crashes at this free?

idoybh commented 2 years ago

Yes, probably (more like certainly) not a proper solution ;) but commenting out this free fixes the thing. Will update for the originally reported bug soon.

EDIT: This was definitely something wrong in my env. I reinstalled the kernel and it's fine now without commenting out the free.

idoybh commented 2 years ago

Yes. I still have the exit code 31 error after resuming from suspend. Here are my logs again: fprintd.log tudor.log coredump.log. Please ignore the earlier errors in them, which were probably logged before I commented out the free.

Popax21 commented 2 years ago

I just pushed a fix for the invalid free. Also, could you execute x/40i $rip-0x20 in the GDB shell that opens when you run coredumpctl debug tudor_host after the host coredumps?
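
A non-interactive way to capture the same disassembly, sketched under a few assumptions: gdb is installed, the path to the tudor_host binary is passed in by the caller (its install location is not stated in this thread), and the function name, the default core file path, and the COREDUMPCTL and GDB overrides are invented for illustration:

```shell
#!/bin/sh
# disassemble_at_crash is a hypothetical helper, not part of synaTudor.
disassemble_at_crash() {
    exe_path=$1                            # path to the tudor_host binary
    core_file=${2:-/tmp/tudor_host.core}   # where to write the extracted core
    # Extract the most recent matching core dump to a file.
    ${COREDUMPCTL:-coredumpctl} dump tudor_host --output="$core_file" || return 1
    # Print 40 instructions starting 0x20 bytes before the saved $rip.
    ${GDB:-gdb} -batch -ex 'x/40i $rip-0x20' "$exe_path" "$core_file"
}

# Usage: disassemble_at_crash /path/to/tudor_host
```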

idoybh commented 2 years ago

> I just pushed a fix for the invalid free. Also, could you execute x/40i $rip-0x20 in the GDB shell that opens when you run coredumpctl debug tudor_host after the host coredumps?

There you go. Recompiled and reproduced just now: fprintd.log tudor.log coredump.log

Popax21 commented 2 years ago

Thanks. I experimented with explicitly adding some code which handles suspending, can you try installing the experimental suspend branch? Note that actions right after waking up from suspend will hang for some time while the sensor is reinitialized (sadly there's no GUI to indicate that, and it might take up to 10-20 seconds).

idoybh commented 2 years ago

> Thanks. I experimented with explicitly adding some code which handles suspending, can you try installing the experimental suspend branch? Note that actions right after waking up from suspend will hang for some time while the sensor is reinitialized (sadly there's no GUI to indicate that, and it might take up to 10-20 seconds).

Yes. Seems like it's not crashing anymore. Also, it takes only about 5 s to init here (not bad at all, considering that my previous solution of restarting fprintd on resume took longer). Side note: is the "put your finger on the sensor" string I see provided by the PAM module? I wonder if the driver could somehow tell it that it's not ready yet.

If you're happy with your own solution, you can close this issue imo. Thank you so much for this driver and everything. Truly a life saver.

Popax21 commented 2 years ago

> > Thanks. I experimented with explicitly adding some code which handles suspending, can you try installing the experimental suspend branch? Note that actions right after waking up from suspend will hang for some time while the sensor is reinitialized (sadly there's no GUI to indicate that, and it might take up to 10-20 seconds).

> Yes. Seems like it's not crashing anymore. Also, it takes only about 5 s to init here (not bad at all, considering that my previous solution of restarting fprintd on resume took longer). Side note: is the "put your finger on the sensor" string I see provided by the PAM module? I wonder if the driver could somehow tell it that it's not ready yet.

> If you're happy with your own solution, you can close this issue imo. Thank you so much for this driver and everything. Truly a life saver.

I tried to make some architectural changes which should now result in that string only appearing once the driver is ready.

idoybh commented 2 years ago

> I tried to make some architectural changes which should now result in that string only appearing once the driver is ready.

Yes. These seem to work just right as well. As soon as the string appears the sensor works.

Popax21 commented 2 years ago

Great. I assume this issue can be closed now?

idoybh commented 2 years ago

> Great. I assume this issue can be closed now?

Yeah. I'll report back if anything else happens or if it fails again some day. Thanks again.