spesmilo / electrum

Electrum Bitcoin Wallet
https://electrum.org
MIT License

Crash on exit (PyQt, Qt, abnormal termination, quit, segfault, SIGSEGV) #6889

Open gits7r opened 3 years ago

gits7r commented 3 years ago

On 4.0.9 (and also in previous versions), closing Electrum normally (both the standalone and the installed version) on this OS results in an abnormal crash.

OS: Windows 7 Ultimate 32-bit (fully updated). It's quite easy to reproduce: start Electrum normally, leave it running for a few minutes, then close it with the close button. This dialog pops up immediately (in 8/10 cases):

[Screenshot: ele2]

Here is the full dump:

Problem Event Name:  BEX
  Application Name: electrum-4.0.9.exe
  Application Version:  0.0.0.0
  Application Timestamp:    00000000
  Fault Module Name:    StackHash_0a9e
  Fault Module Version: 0.0.0.0
  Fault Module Timestamp:   00000000
  Exception Offset: 1e175f42
  Exception Code:   c0000005
  Exception Data:   00000008
  OS Version:   6.1.7601.2.1.0.256.1
  Locale ID:    1033
  Additional Information 1: 0a9e
  Additional Information 2: 0a9e372d3b4ad19135b953a78882e789
  Additional Information 3: 0a9e
  Additional Information 4: 0a9e372d3b4ad19135b953a78882e789

Read our privacy statement online:
  http://go.microsoft.com/fwlink/?linkid=104288&clcid=0x0409

If the online privacy statement is not available, please read our privacy statement offline:
  C:\Windows\system32\en-US\erofflps.txt

shsmith commented 3 years ago

I see the same crash on Windows 10 (64-bit) with 32 GB of RAM, so it is definitely not due to "low RAM". Notice the StackHash in the details: that indicates a stack-based buffer overflow occurred at some point since the process started.

Source: electrum-4.0.9.exe
Summary: Stopped working
Date: 1/11/2021 8:53 AM
Status: Report sent

Description
  Faulting Application Path: C:\Program Files (x86)\Electrum\electrum-4.0.9.exe

Problem signature
  Problem Event Name:   BEX
  Application Name: electrum-4.0.9.exe
  Application Version:  0.0.0.0
  Application Timestamp:    00000000
  Fault Module Name:    StackHash_2beb
  Fault Module Version: 0.0.0.0
  Fault Module Timestamp:   00000000
  Exception Offset: PCH_B1_FROM_ntdll+0x00071BDC
  Exception Code:   c0000005
  Exception Data:   00000008
  OS Version:   10.0.19042.2.0.0.256.48
  Locale ID:    1033
  Additional Information 1: 2beb
  Additional Information 2: 2beba6fb4680d73a8c78ca7c24ccdb46
  Additional Information 3: f00e
  Additional Information 4: f00e6a5b10db5b51c8340a277c5e9536

Extra information about the problem
  Bucket ID: 7802431f49a728c3fd4ad31246fbcfa6 (2110731450830278566)

random652 commented 3 years ago

42 days later, and still no reply, no fix, no update. Splendid application, this Electrum.

If this is truly due to a stack buffer overflow, that should be concerning to Electrum users. https://en.wikipedia.org/wiki/Stack_buffer_overflow#Exploiting_stack_buffer_overflows

Developers with Github write access: Thomas Voegtlin ecdsa https://github.com/ecdsa

Any ideas? You seem prepared to read the posts, mark them as "closed", and spend the time to weed out duplicate questions, yet the actual problem goes completely unresolved.

SomberNight commented 3 years ago

@random652 You are more than welcome to help fix this issue. Electrum is a FLOSS project with limited resources and contributors spend time on the issues/features they care about.

Besides, despite what you are implying, I am very sceptical of there being security implications of this issue under any reasonable threat model. Please, explain why you think otherwise.

btw I tried to reproduce this a while ago and while it happened a few times it was actually very rare, much more so than OP mentions ("8/10"). It was too rare to be able to reliably narrow down the cause without spending significant amounts of time.

SomberNight commented 3 years ago

Looks like this is not Windows specific (https://github.com/spesmilo/electrum/issues/7012)

SomberNight commented 3 years ago

Any hints how to reproduce more reliably? E.g. does it happen more often with

gits7r commented 3 years ago
ecdsa commented 3 years ago

> Looks like this is not Windows specific (#7012)

I too occasionally see crashes on exit with the Qt GUI, on Linux. I have never found any correlation with a particular usage pattern, except maybe that it tends to occur more if the UI has been open for a long time.

af7567 commented 3 years ago

> I too occasionally see crashes on exit with the Qt GUI, on Linux. I have never found any correlation with a particular usage pattern, except maybe that it tends to occur more if the UI has been open for a long time.

I know this is going to confuse things even more, but for me it actually seems to happen more often if I quit Electrum after it has only been open for a few seconds. I am still closing it after the initial sync so I don't think it is because it's in the middle of connecting to servers though.

SomberNight commented 3 years ago

I have a trace from gdb, on Ubuntu 20.04. PyQt-related packages installed from apt. Unfortunately I am not sure if it is useful...

I | logging | Python version: 3.8.5 (default, Jan 27 2021, 15:41:15) 
[GCC 9.3.0]. On platform: Linux-5.8.0-48-generic-x86_64-with-glibc2.29
I | gui.qt.ElectrumGui | Qt GUI starting up... Qt=5.12.8, PyQt=5.14.1
python3-pyqt5 is already the newest version (5.14.1+dfsg-3build1).
libqt5core5a is already the newest version (5.12.8+dfsg-0ubuntu1).
python3-sip is already the newest version (4.19.21+dfsg-1build1).
I | daemon.Daemon | removing lockfile
I | daemon.Daemon | stopped
[Thread 0x7ffff475f700 (LWP 6820) exited]
I/p | plugin.Plugins | stopped
[Thread 0x7ffff3f5e700 (LWP 6821) exited]

Thread 1 "python3" received signal SIGSEGV, Segmentation fault.
0x00007ffff11c9192 in sip_api_get_address (w=0x7fffd80c6c10) at ./siplib/siplib.c:9141
9141    ./siplib/siplib.c: No such file or directory.
(gdb) bt
#0  0x00007ffff11c9192 in sip_api_get_address (w=0x7fffd80c6c10) at ./siplib/siplib.c:9141
#1  0x00007ffff108c355 in cleanup_qobject(sipSimpleWrapper*, void*)
    (sw=0x7fffd80c6c10, closure=0x142e9f0) at ../../qpy/QtCore/qpycore_public_api.cpp:41
#2  0x00007ffff11c9b60 in sip_api_visit_wrappers
    (visitor=0x7ffff108c330 <cleanup_qobject(sipSimpleWrapper*, void*)>, closure=0x142e9f0)
    at ./siplib/siplib.c:14290
#3  0x00007ffff108c0a0 in cleanup_on_exit(PyObject*, PyObject*) ()
    at ../../qpy/QtCore/qpycore_init.cpp:37
#4  0x00000000005c38e6 in cfunction_vectorcall_NOARGS
    (func=<built-in function _qtcore_cleanup>, args=<optimised out>, nargsf=<optimised out>, kwnames=<optimised out>) at ../Objects/methodobject.c:459
#5  0x00000000005f2b87 in PyVectorcall_Call
    (kwargs=<optimised out>, tuple=<optimised out>, callable=<built-in function _qtcore_cleanup>)
    at ../Objects/dictobject.c:1753
#6  PyObject_Call
    (callable=<built-in function _qtcore_cleanup>, args=<optimised out>, kwargs=<optimised out>)
    at ../Objects/call.c:227
#7  0x0000000000651884 in atexit_callfuncs (module=<optimised out>) at ../Modules/atexitmodule.c:87
#8  0x000000000067eab8 in call_py_exitfuncs (istate=0x961590, istate=0x961590)
    at ../Python/pylifecycle.c:2236
#9  Py_FinalizeEx () at ../Python/pylifecycle.c:1183
#10 0x000000000067cbcc in Py_Exit (sts=0) at ../Python/pylifecycle.c:2295
#11 0x000000000067cbfb in handle_system_exit () at ../Python/pythonrun.c:658
#12 0x000000000067ce26 in _PyErr_PrintEx (set_sys_last_vars=1, tstate=0x962210)
    at ../Python/pythonrun.c:763
#13 PyErr_PrintEx (set_sys_last_vars=1) at ../Python/pythonrun.c:763
#14 0x000000000067da9c in PyErr_Print () at ../Python/pythonrun.c:769
#15 PyRun_SimpleFileExFlags
    (fp=<optimised out>, filename=<optimised out>, closeit=<optimised out>, flags=0x7fffffffdea8)
    at ../Python/pythonrun.c:434
#16 0x00000000006b6132 in pymain_run_file (cf=0x7fffffffdea8, config=0x961640) at ../Modules/main.c:381
#17 pymain_run_python (exitcode=0x7fffffffdea0) at ../Modules/main.c:606
#18 Py_RunMain () at ../Modules/main.c:685
#19 0x00000000006b64bd in Py_BytesMain (argc=<optimised out>, argv=<optimised out>)
    at ../Modules/main.c:739
#20 0x00007ffff7de40b3 in __libc_start_main (main=
    0x4eec80 <main>, argc=4, argv=0x7fffffffe088, init=<optimised out>, fini=<optimised out>, rtld_fini=<optimised out>, stack_end=0x7fffffffe078) at ../csu/libc-start.c:308
#21 0x00000000005f927e in _start () at ../Objects/bytesobject.c:2560
(gdb) py-bt
Traceback (most recent call first):
  <built-in function _qtcore_cleanup>
(gdb) 
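
Frames #3 to #7 of the backtrace show the segfault happening inside `_qtcore_cleanup`, which is called as an `atexit` callback during `Py_FinalizeEx`. Python runs `atexit` callbacks in reverse order of registration, so a handler registered after PyQt's hook (presumably set up when `QtCore` is imported) should run before it. Below is a minimal stdlib-only sketch of that ordering. The handler names are hypothetical stand-ins, with `library_cleanup` merely playing the role of PyQt's hook, and whether doing application teardown at this point actually avoids the crash is untested.

```python
import subprocess
import sys

# Script run in a child interpreter so that its atexit handlers really fire.
CHILD = '''
import atexit

def library_cleanup():      # stand-in for a hook registered early by a library
    print("library cleanup")

def app_teardown():         # stand-in for application code registered later
    print("app teardown")

atexit.register(library_cleanup)
atexit.register(app_teardown)
'''


def exit_order():
    """Return the lines printed by the child's atexit handlers, in order."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.splitlines()


if __name__ == "__main__":
    # The handler registered last runs first.
    print(exit_order())
```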

Similar traces:

SomberNight commented 3 years ago

Lots of potentially useful info at https://github.com/borgbase/vorta/issues/456 ...

Also see PyQt docs "crashes-on-exit":

Crashes On Exit

When the Python interpreter leaves a scope (for example when it returns from a function) it will potentially garbage collect all objects local to that scope. The order in which it is done is, in effect, random. Theoretically this can cause problems because it may mean that the C++ destructors of any wrapped Qt instances are called in an order that Qt isn’t expecting and may result in a crash. However, in practice, this is only likely to be a problem when the application is terminating.

As a way of mitigating this possibility PyQt5 ensures that the C++ destructors of any QObject instances owned by Python are invoked before the destructor of any QCoreApplication instance is invoked. Note however that the order in which the QObject destructors are invoked is still random.


btw the segfault seems somewhat more likely on an Ubuntu 20.04 VM than on a Win10-20H2 VM, and also maybe more likely when running python in dev mode, so e.g.:

$ python3 -X dev -X tracemalloc ./run_electrum -v --testnet
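
Python's development mode (`-X dev`) enables, among other checks, the `faulthandler` module, which dumps the Python-level traceback of each thread when the process receives a fatal signal such as SIGSEGV. The same handler can be switched on from code so that crashes in regular runs leave a trace as well. A minimal sketch, assuming a hypothetical helper name and log path (neither is part of Electrum):

```python
import faulthandler
import os
import tempfile


def enable_crash_log(path):
    """Dump Python tracebacks of all threads to `path` on a fatal signal."""
    log_file = open(path, "a")
    faulthandler.enable(file=log_file, all_threads=True)
    # faulthandler keeps using the file descriptor, so the caller must keep
    # the returned file object open for as long as the handler is enabled.
    return log_file


if __name__ == "__main__":
    # Hypothetical location; any writable path works.
    log_path = os.path.join(tempfile.gettempdir(), "electrum_crash.log")
    crash_log = enable_crash_log(log_path)
    print("faulthandler enabled:", faulthandler.is_enabled())
```

This only records Python frames. C-level frames such as the sip and PyQt ones in the trace above still need a native debugger like gdb.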

SomberNight commented 3 years ago

I've made some fixes in these commits:
https://github.com/spesmilo/electrum/commit/34e64ef152cc433722f460fdb203ce5135be9f6d
https://github.com/spesmilo/electrum/commit/be43632cc47eb8cc35c2a39a21bc2089e5ce8ca1
https://github.com/spesmilo/electrum/commit/981381346f7dc095f6c0df4150ddc936c9d2884f

And now I can no longer reproduce the segfault :) I think it might either be fixed or have become extremely unlikely.

SomberNight commented 3 years ago

Ah well I just got a segfault on Windows; so the issue is still not fixed. It has become rarer though, but that also means it is harder to reproduce when debugging.

ecdsa commented 3 years ago

I still get them on Linux too

buhtz commented 2 weeks ago

Just FYI: at the Back In Time project we have had similar problems for years and still don't really know what is going on.

Today somebody discovered a new hot trail in bit-team/backintime#1095. In our case an EventFilter is involved. If we explicitly call removeEventFilter() on it in the closeEvent() handler of the main window, we can no longer reproduce the segfault. It might not be a final solution, but maybe it will help others or lead the PyQt/Qt core devs to a more general solution.
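
The workaround described above can be sketched as a small mixin that remembers every event filter it installs and removes them again in `closeEvent()`. This is only an illustration of the pattern, not the Back In Time or Electrum code: `EventFilterCleanupMixin`, `install_tracked_filter` and `_tracked` are hypothetical names, while `installEventFilter()`, `removeEventFilter()` and `closeEvent()` are the regular Qt methods.

```python
class EventFilterCleanupMixin:
    """Hypothetical mixin; list it before the Qt base class, for example
    class MainWindow(EventFilterCleanupMixin, QMainWindow)."""

    def _tracked(self):
        # Create the bookkeeping list lazily so no __init__ is required.
        if not hasattr(self, "_tracked_filters"):
            self._tracked_filters = []
        return self._tracked_filters

    def install_tracked_filter(self, target, event_filter):
        # Install the filter on the target and remember the pair.
        target.installEventFilter(event_filter)
        self._tracked().append((target, event_filter))

    def closeEvent(self, event):
        # Detach filters in reverse order of installation, then let the
        # base class handle the close event as usual.
        for target, event_filter in reversed(self._tracked()):
            target.removeEventFilter(event_filter)
        self._tracked().clear()
        super().closeEvent(event)
```

Because the mixin only relies on those three method names, it can be exercised with plain stand-in objects without a running Qt application. Whether removing the filters addresses the crash in Electrum is an open question.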