status-im / status-desktop

Status Desktop client made in Nim & QML
https://status.app
Mozilla Public License 2.0

Application randomly crashes when adding / editing or deleting account in wallet screen #16645

Open anastasiyaig opened 4 weeks ago

anastasiyaig commented 4 weeks ago

Bug Report

Description

It looks like the application shuts itself down while tests are adding / editing / removing accounts. It happens randomly. When it happens, the test fails, and the retry cannot proceed either, because the AppImage appears to be unresponsive for some time. Squish opens the app again and resizes the window (which means it is attached to the application), but then it can't click anywhere and fails.

Steps to reproduce

I can't seem to reproduce it manually, but the autotests replicate it easily. Tests in PRs are failing with a meaningless "Acknowledge checkbox was not checked" error, which is a result of the problem: the app is dead on the first test attempt, and then it can't continue for some time. After some attempts, the tests can continue running.

Log when this problem occurs:

data_attachments_88c7414071434e0a.txt

I would appreciate any help in debugging this issue, as it breaks tests a lot. Also, maybe we can add some extra logging / output for Nim calls.

I was also able to reproduce it on Mac, and it looks like the app does indeed crash:

-------------------------------------
Translated Report (Full Report Below)
-------------------------------------

Process:               nim_status_client [23369]
Path:                  /Users/USER/*/nim_status_client
Identifier:            nim_status_client
Version:               ???
Code Type:             ARM-64 (Native)
Parent Process:        Python [22955]
Responsible:           pycharm [6825]
User ID:               501

Date/Time:             2024-10-29 15:55:39.1620 +0300
OS Version:            macOS 14.6.1 (23G93)
Report Version:        12
Anonymous UUID:        9CDEBC3C-B14D-FE06-06AD-2B9BF95BB4F1

Sleep/Wake UUID:       F6BDC916-D8E2-4CBB-B465-DF382987E4B8

Time Awake Since Boot: 1300000 seconds
Time Since Wake:       107619 seconds

System Integrity Protection: enabled

Crashed Thread:        0  CrBrowserMain  Dispatch queue: com.apple.main-thread

Exception Type:        EXC_BAD_ACCESS (SIGABRT)
Exception Codes:       KERN_INVALID_ADDRESS at 0x70617773696e7522 -> 0x00007773696e7522 (possible pointer authentication failure)
Exception Codes:       0x0000000000000001, 0x70617773696e7522

Termination Reason:    Namespace SIGNAL, Code 6 Abort trap: 6
Terminating Process:   nim_status_client [23369]

VM Region Info: 0x7773696e7522 is not in any region.  Bytes after previous region: 25783920653603  
      REGION TYPE                    START - END         [ VSIZE] PRT/MAX SHRMOD  REGION DETAIL
      MALLOC_NANO              600006508000-600020000000 [411.0M] rw-/rwx SM=SHM  
--->  
      UNUSED SPACE AT END

Application Specific Information:
abort() called

Thread 0 Crashed:: CrBrowserMain Dispatch queue: com.apple.main-thread
0   libsystem_kernel.dylib                 0x18f08d5f0 __pthread_kill + 8
1   libsystem_pthread.dylib                0x18f0c5c20 pthread_kill + 288
2   libsystem_c.dylib                      0x18efd2a30 abort + 180
3   libsquishhook.dylib                    0x16f81c654 crashHandler(int) + 108
4   libsystem_platform.dylib               0x18f0f6584 _sigtramp + 56
5   nim_status_client                      0x102ef66fc rawNewObj__system_6271 + 24 (gc.nim:458) [inlined]
6   nim_status_client                      0x102ef66fc newObjNoInit + 24 (gc.nim:482) [inlined]
7   nim_status_client                      0x102ef66fc rawNewStringNoInit + 140 (sysstr.nim:48) [inlined]
8   nim_status_client                      0x102ef66fc copyString + 180 (sysstr.nim:110)
9   nim_status_client                      0x102ef66fc rawNewObj__system_6271 + 24 (gc.nim:458) [inlined]
10  nim_status_client                      0x102ef66fc newObjNoInit + 24 (gc.nim:482) [inlined]
11  nim_status_client                      0x102ef66fc rawNewStringNoInit + 140 (sysstr.nim:48) [inlined]
12  nim_status_client                      0x102ef66fc copyString + 180 (sysstr.nim:110)
13  nim_status_client                      0x10312761c toChatDto__app95serviceZserviceZchatZdtoZchat_2166 + 1248 (json_utils.nim:40)
14  nim_status_client                      0x103129e0c toChatDto__app95serviceZserviceZchatZdtoZchat_2484 + 40 (chat.nim:311)
15  nim_status_client                      0x103131f2c toCommunityDto__app95serviceZserviceZcommunityZdtoZcommunity_3966 + 5680 (community.nim:488)
16  nim_status_client                      0x103166d54 fromEvent__appZcoreZsignalsZsignals95manager_1111 + 7732 (messages.nim:98)
17  nim_status_client                      0x1031783f8 decode__appZcoreZsignalsZsignals95manager_39 + 796 (signals_manager.nim:77)
18  nim_status_client                      0x103178a9c processSignal__appZcoreZsignalsZsignals95manager_70 + 1024 (signals_manager.nim:46)
19  nim_status_client                      0x102f683e4 qobjectCallback + 444 (qobject.nim:52)
20  nim_status_client                      0x10386d170 DOS::DosQObjectImpl::executeSlot(QString const&, std::__1::vector<QVariant, std::__1::allocator<QVariant>> const&) + 532
21  nim_status_client                      0x10386cd24 DOS::DosQObjectImpl::executeSlot(QMetaMethod const&, void**, int) + 372
22  nim_status_client                      0x10386c5f0 DOS::DosQObjectImpl::executeSlot(int, void**) + 100
23  nim_status_client                      0x10386c520 DOS::DosQObjectImpl::qt_metacall(QMetaObject::Call, int, void**) + 116
24  QtCore                                 0x10703e5ac QObject::event(QEvent*) + 596
25  QtCore                                 0x107017254 QCoreApplicationPrivate::notify_helper(QObject*, QEvent*) + 404
26  QtCore                                 0x107016d94 QCoreApplication::notifyInternal2(QObject*, QEvent*) + 292
27  QtCore                                 0x107017f80 QCoreApplicationPrivate::sendPostedEvents(QObject*, int, QThreadData*) + 496
28  libqcocoa.dylib                        0x140a3b41c 0x140a00000 + 242716
29  libqcocoa.dylib                        0x140a3bccc 0x140a00000 + 244940
30  CoreFoundation                         0x18f1a54d8 __CFRUNLOOP_IS_CALLING_OUT_TO_A_SOURCE0_PERFORM_FUNCTION__ + 28
31  CoreFoundation                         0x18f1a546c __CFRunLoopDoSource0 + 176
32  CoreFoundation                         0x18f1a51dc __CFRunLoopDoSources0 + 244
33  CoreFoundation                         0x18f1a3dc8 __CFRunLoopRun + 828
34  CoreFoundation                         0x18f1a3434 CFRunLoopRunSpecific + 608
35  HIToolbox                              0x19994d19c RunCurrentEventLoopInMode + 292
36  HIToolbox                              0x19994cfd8 ReceiveNextEventCommon + 648
37  HIToolbox                              0x19994cd30 _BlockUntilNextEventMatchingListInModeWithFilter + 76
38  AppKit                                 0x192a02cc8 _DPSNextEvent + 660
39  AppKit                                 0x1931f94d0 -[NSApplication(NSEventRouting) _nextEventMatchingEventMask:untilDate:inMode:dequeue:] + 700
40  AppKit                                 0x1929f5ffc -[NSApplication run] + 476
41  libqcocoa.dylib                        0x140a3a9c8 0x140a00000 + 240072
42  QtCore                                 0x10701333c QEventLoop::exec(QFlags<QEventLoop::ProcessEventsFlag>) + 544
43  QtCore                                 0x107017484 QCoreApplication::exec() + 132
44  nim_status_client                      0x1035d4928 mainProc__nim95status95client_1515 + 2448 (nim_status_client.nim:252)
45  nim_status_client                      0x10364d7d8 NimMain + 32 (io_interface.nim:874) [inlined]
46  nim_status_client                      0x10364d7d8 main + 68 (io_interface.nim:881)
47  dyld   

crash report.txt


Additional Information

anastasiyaig commented 4 weeks ago

@jrainville @alexjba any help in finding the place where this crash is actually happening is very welcome. It happens more often on the wallet screen (adding or editing accounts), and it also happens (a bit more rarely) in tests that create communities / community channels.

jrainville commented 4 weeks ago

This issue is very weird.

The crash happens when parsing the chats of the curated community, specifically the Status community. However, it only crashes if a wallet account was added prior to that. If I just open a new account and let the curated communities load while not doing anything, it doesn't crash.

It also doesn't crash on the same chat each time, so it doesn't seem related to any specific property. Sometimes it crashes when returning the ChatDto, sometimes when just manipulating the chatsObj.

It feels very similar to the other crash we had with spectating a community after leaving, where it crashed when returning. https://github.com/status-im/status-desktop/issues/15848 Looking at the stack trace, I see mentions of the curated communities, so it might be the same indeed.

I tried putting a try/catch around it and it doesn't catch it, it crashes anyway.

I commented out the whole chat parsing part and that worked. I was wondering if maybe it would crash in a later parsing section if the issue was just a delayed panic, but no, it seems to be ok. However, we need that code block, so we can't just remove it.

I'm a bit at a loss for solutions. If it failed on a missing condition, it would be easy, but it crashes in a different spot each time.

Can a property be garbage collected during the run time of a function? Since in both cases it seemed to happen after the DB was written to, could that be a clue? I don't see the connection, since the DB is in Go and the Nim code doesn't even access it, but it's the only thing linking those events.
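One detail in the crash report may bear on this question. The faulting address in the report, 0x70617773696e7522, consists entirely of printable ASCII bytes. The sketch below decodes it as little-endian bytes, which is an assumption based on the ARM-64 code type shown in the report; `decode_fault_address` is a hypothetical helper written for this illustration, not part of the project.

```python
def decode_fault_address(address: int, width: int = 8) -> str:
    """Render a pointer-sized integer as text, reading its bytes in
    little-endian order (the in-memory order assumed for this ARM-64 crash)."""
    raw = address.to_bytes(width, byteorder="little")
    # Replace non-printable bytes with "." so the result is always readable.
    return "".join(chr(b) if 32 <= b < 127 else "." for b in raw)


# Faulting address taken from the "Exception Codes" line of the crash report.
fault_address = 0x70617773696E7522
print(decode_fault_address(fault_address))  # prints: "uniswap
```

The bytes read as a double quote followed by `uniswap`, which looks like the start of a JSON string value. That would be consistent with string data having been written over a pointer the allocator later followed in `rawNewObj`, i.e. heap corruption rather than a failure tied to one specific chat. This is only a hint from a single report, not a confirmed root cause.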

anastasiyaig commented 4 weeks ago

Maybe @igor-sirotin can look at it? Or @osmaczko

I also don't know how likely real users are to see this issue, but for tests it really seems to be a blocker.

jrainville commented 4 weeks ago

> i also dont know how really possible to see this issue for real users but for tests its really a blocker it seems

Indeed, it seems to be very much an edge case, since in this particular scenario you need to have a new account and create a new wallet account, all before the Status community loads.

Though we have historical proof that it happened before. Also, if it makes the tests flaky, it would be good to fix it.

jrainville commented 4 weeks ago

For posterity, this is where the crash happens, though like I said, it happens in different spots inside that code block: https://github.com/status-im/status-desktop/blob/ca1182e3f9f20f8ff345af42263063b86ecb9108/src/app_service/service/community/dto/community.nim#L486-L489