status-im / status-desktop

Status Desktop client made in Nim & QML
https://status.app
Mozilla Public License 2.0

Can't launch Status app on Intel-based Mac #15134

Open qoqobolo opened 2 weeks ago

qoqobolo commented 2 weeks ago

Bug Report

Description

Starting with RC 10 https://github.com/status-im/status-desktop/releases/tag/2.29.0-rc.10 I started experiencing a crash when launching the Status app. The app crashes even before opening the login screen.

https://github.com/status-im/status-desktop/assets/67952253/3e6ff40f-3d48-4839-aef0-c9b66afdc1f6

Logs: desktop_rc11.log

I first faced this when I tried to open a build with my main user (community status member). Then I tried to install yesterday’s nightly and run it with another test user, who also had contacts and community. The app crashed in both cases.

Then I tried deleting the data folder and launching the application. The crash repeated on the first 5 attempts, but then I was able to launch the app (without changing anything) and create a user. After that, I could no longer log in with that user.

Also, after creating another new user, I tried to open the QR code to sync with a mobile device, but I got a crash when opening the QR code. Here are the logs for this case:

DBG 2024-06-10 16:40:34.718+02:00 NewBE_callPrivateRPC                       topics="rpc" tid=149156 file=core.nim:27 rpc_method=wakuext_getOurInstallations
DBG 2024-06-10 16:40:37.262+02:00 NewBE_callPrivateRPC                       topics="rpc" tid=149027 file=core.nim:27 rpc_method=wakuext_speedupArchivesImport
DBG 2024-06-10 16:40:39.758+02:00 NewBE_callPrivateRPC                       topics="rpc" tid=149027 file=core.nim:27 rpc_method=wakuext_slowdownArchivesImport
DBG 2024-06-10 16:40:39.772+02:00 NewBE_callPrivateRPC                       topics="rpc" tid=149027 file=core.nim:27 rpc_method=accounts_verifyPassword
SIGSEGV: Illegal storage access. (Attempt to read from nil?)
zsh: segmentation fault  ./nim_status_client --datadir=~/Downloads/master/custom --LOG_LEVEL=DEBUG

My macOS updated yesterday to Sonoma 14.5, maybe this could be the reason? Or I am doing something wrong. We couldn't reproduce this on any other Mac.

Additional Information

anastasiyaig commented 2 weeks ago

we need to get someone with Intel chip on mac to debug what is going on @jrainville @iurimatias

MishkaRogachev commented 2 weeks ago

Tried to reproduce on M1 Pro:

saledjenic commented 2 weeks ago

I was able to reproduce this crash on the app start (got exactly the same stacktrace). It has nothing to do with the specific Status data folder; the app crashes because of the changed Go version. It has something to do with the CGo bindings.

rc2.29 commit 441a802948d5a881b38de98dbf63408a1b893e6a (which is one before we changed Go version to 1.21) works fine.

Also if I change just the local Go version to 1.21 and try the same commit, getting this:

# github.com/anacrolix/go-libutp
vendor/github.com/anacrolix/go-libutp/callbacks.go:16:10: cannot define new methods on non-local type *C.utp_callback_arguments
vendor/github.com/anacrolix/go-libutp/callbacks.go:24:10: cannot define new methods on non-local type *C.utp_callback_arguments
vendor/github.com/anacrolix/go-libutp/callbacks.go:28:10: cannot define new methods on non-local type *C.utp_callback_arguments
vendor/github.com/anacrolix/go-libutp/callbacks.go:32:10: cannot define new methods on non-local type *C.utp_callback_arguments
vendor/github.com/anacrolix/go-libutp/callbacks.go:36:10: cannot define new methods on non-local type *C.utp_callback_arguments
vendor/github.com/anacrolix/go-libutp/utp.go:29:12: cannot define new methods on non-local type *C.utp_context
vendor/github.com/anacrolix/go-libutp/utp.go:40:12: cannot define new methods on non-local type *C.utp_context
make[1]: *** [statusgo-shared-library] Error 1
make: *** [vendor/status-go/build/bin/libstatus.dylib] Error 2

At this moment I am not quite sure how to make it work on Intel Macs. Maybe somebody more experienced in this field can help; otherwise I will most likely need much more time to figure it out.

igor-sirotin commented 2 weeks ago

I was able to reproduce this crash on the app start (got exactly the same stacktrace)

@saledjenic just to make it clear, this is on Intel mac only, right?

igor-sirotin commented 2 weeks ago

Also if I change just the local Go version to 1.21 and try the same commit, getting this

You'll get these anacrolix/go-libutp errors if you try to compile it with Go 1.21 without our code change. Andrea changed some build args for it to work.

igor-sirotin commented 2 weeks ago

cc @siddarthkay it seems that we have a problem running status-desktop on Intel macOS after upgrading Go to 1.21

saledjenic commented 2 weeks ago

@saledjenic just to make it clear, this is on Intel mac only, right?

Correct.

You'll get these anacrolix/go-libutp errors if you try to compile it with Go 1.21 without our code change. Andrea changed some build args for it to work.

This is what I see when I try to run the rc2.29 commit 441a802948d5a881b38de98dbf63408a1b893e6a with Go 1.21, without any additional changes. It runs fine with Go 1.20.

The commit after it introduces the official changes that we made to switch to Go 1.21. When I run that commit, I see the same stacktrace as in desktop_rc11.log posted in the description of this issue.

siddarthkay commented 2 weeks ago

problem running status-desktop on Intel macOS after upgrading Go to 1.21

IMG_4895

jrainville commented 2 weeks ago

@siddarthkay do you think it's possible to fix it or we should revert the upgrade to 1.21 for now?

siddarthkay commented 2 weeks ago

@jrainville : I plan to investigate tomorrow and find the root cause of the crash. If it is indeed related to the Go upgrade, then the proper fix would most likely be additional build flags. I'll keep this thread posted.

siddarthkay commented 1 week ago

The core error message worth investigating is this :

fatal error: bad sweepgen in refill

There is indeed an issue in go repo for this : -> https://github.com/golang/go/issues/64492

Other possibly related issues : -> https://github.com/golang/go/issues/38692#issuecomment-1879651329

Samyoul commented 1 week ago

@siddarthkay Your magic binary worked for me

Samyoul commented 1 week ago

Hmmmm, @siddarthkay :(

So I've tried to open the Desktop app again today and it is crashing before even loading. I haven't changed anything.

-------------------------------------
Translated Report (Full Report Below)
-------------------------------------

Process:               nim_status_client [27134]
Path:                  /Applications/Status.app/Contents/MacOS/nim_status_client
Identifier:            im.Status.NimStatusClient
Version:               1.0.0 (???)
Code Type:             X86-64 (Native)
Parent Process:        launchd [1]
User ID:               501

Date/Time:             2024-06-17 12:46:04.4758 +0100
OS Version:            macOS 13.6.7 (22G720)
Report Version:        12
Bridge OS Version:     8.5 (21P5077)
Anonymous UUID:        5A658639-5B56-59EC-5890-9183FBDEB115

Sleep/Wake UUID:       D77DCF29-7438-4E35-BE62-8F1427CA292E

Time Awake Since Boot: 3900 seconds
Time Since Wake:       143 seconds

System Integrity Protection: enabled

Crashed Thread:        0  Dispatch queue: com.apple.main-thread

Exception Type:        EXC_BAD_ACCESS (SIGSEGV)
Exception Codes:       KERN_INVALID_ADDRESS at 0x0000000000000000
Exception Codes:       0x0000000000000001, 0x0000000000000000

VM Region Info: 0 is not in any region.  Bytes before following region: 4500475904
      REGION TYPE                    START - END         [ VSIZE] PRT/MAX SHRMOD  REGION DETAIL
      UNUSED SPACE AT START
--->  
      __TEXT                      10c3fd000-10d281000    [ 14.5M] r-x/r-x SM=COW  ...status_client

Thread 0 Crashed::  Dispatch queue: com.apple.main-thread
0   libkeycard.dylib                       0x10f5f3a52 0x10f5dc000 + 96850
1   libkeycard.dylib                       0x10f5f3e28 0x10f5dc000 + 97832
2   libkeycard.dylib                       0x10f5ec2df 0x10f5dc000 + 66271
3   libkeycard.dylib                       0x10f5ec6a5 0x10f5dc000 + 67237
4   libkeycard.dylib                       0x10f7490e8 crosscall2 + 1025512
5   libkeycard.dylib                       0x10f74115c crosscall2 + 992860
6   libkeycard.dylib                       0x10f749f59 crosscall2 + 1029209
7   libkeycard.dylib                       0x10f74a25e crosscall2 + 1029982
8   libkeycard.dylib                       0x10f5e3b13 0x10f5dc000 + 31507
9   libkeycard.dylib                       0x10f5e37c5 0x10f5dc000 + 30661
10  libkeycard.dylib                       0x10f646c49 _cgo_topofstack + 9289
11  libkeycard.dylib                       0x10f6445ec 0x10f5dc000 + 427500
12  libkeycard.dylib                       0x10f64eb3d crosscall2 + 61
13  libkeycard.dylib                       0x10f74a77a KeycardInitFlow + 58
14  nim_status_client                      0x10c5d1cdd 0x10c3fd000 + 1920221
15  nim_status_client                      0x10c7674de 0x10c3fd000 + 3581150
16  nim_status_client                      0x10cb66f72 0x10c3fd000 + 7774066
17  nim_status_client                      0x10cb6ab11 0x10c3fd000 + 7789329
18  nim_status_client                      0x10cbd97bb main + 59
19  dyld                                0x7ff81b8ea41f start + 1903
...
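
For the frames that the report leaves unsymbolicated, each line carries the absolute address, the image load base, and the decimal offset between the two. A minimal sketch of reading such a line, assuming the six-column layout shown in the report above (the `Frame` type and `parseFrame` helper are illustrative names, not part of status-desktop):

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// Frame holds the fields of one unsymbolicated crash-report frame, e.g.
// "0   libkeycard.dylib   0x10f5f3a52 0x10f5dc000 + 96850".
type Frame struct {
	Image    string
	Address  uint64 // absolute address of the frame
	LoadBase uint64 // address the image was loaded at
	Offset   uint64 // Address - LoadBase, as printed in the report
}

// parseFrame splits a frame line into its columns and checks that the
// printed offset really equals address minus load base.
// Assumption: the image name contains no spaces.
func parseFrame(line string) (Frame, error) {
	cols := strings.Fields(line)
	if len(cols) != 6 || cols[4] != "+" {
		return Frame{}, fmt.Errorf("unexpected frame layout: %q", line)
	}
	addr, err := strconv.ParseUint(cols[2], 0, 64) // base 0 accepts the 0x prefix
	if err != nil {
		return Frame{}, fmt.Errorf("address: %w", err)
	}
	base, err := strconv.ParseUint(cols[3], 0, 64)
	if err != nil {
		// Frames already resolved to a symbol (e.g. "crosscall2 + 61") end up here.
		return Frame{}, fmt.Errorf("load base: %w", err)
	}
	off, err := strconv.ParseUint(cols[5], 10, 64)
	if err != nil {
		return Frame{}, fmt.Errorf("offset: %w", err)
	}
	if addr-base != off {
		return Frame{}, fmt.Errorf("offset mismatch: %#x - %#x != %d", addr, base, off)
	}
	return Frame{Image: cols[1], Address: addr, LoadBase: base, Offset: off}, nil
}

func main() {
	f, err := parseFrame("0   libkeycard.dylib   0x10f5f3a52 0x10f5dc000 + 96850")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%s base=%#x addr=%#x offset=%#x\n", f.Image, f.LoadBase, f.Address, f.Offset)
}
```

The load base and address pair is what a symbolication tool such as `atos` takes (load address via `-l`), given a matching build of the library with symbols.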

siddarthkay commented 1 week ago

@Samyoul : Thanks for the report, this was also reported on the PR here https://github.com/status-im/status-desktop/pull/15194#issuecomment-2173000727

siddarthkay commented 2 days ago

Now that I have access to an x86_64 macOS host, I've investigated a bit further.

The crash happens in this function in status-go : MultiAccountGenerateAndDeriveAddresses which exists here -> https://github.com/status-im/status-go/blob/1d1d6e3276b4890f553cb637164181230f7133d5/account/generator/generator.go#L163

stacktrace ->

fatal error: bad sweepgen in refill

goroutine 17 [running, locked to thread]:
runtime.throw({0x114969872?, 0x1c000c49d08?})
/usr/local/go/src/runtime/panic.go:1077 +0x5c fp=0x1c000c49cc0 sp=0x1c000c49c90 pc=0x112b9433c
runtime.(*mcache).refill(0x10be115b8, 0x80?)
/usr/local/go/src/runtime/mcache.go:157 +0x20d fp=0x1c000c49d00 sp=0x1c000c49cc0 pc=0x112b7270d
runtime.(*mcache).nextFree(0x10be115b8, 0x1b)
/usr/local/go/src/runtime/malloc.go:929 +0x85 fp=0x1c000c49d48 sp=0x1c000c49d00 pc=0x112b69365
runtime.mallocgc(0xa4, 0x0, 0x0)
/usr/local/go/src/runtime/malloc.go:1116 +0x448 fp=0x1c000c49db0 sp=0x1c000c49d48 pc=0x112b69928
runtime.rawstring(...)
/usr/local/go/src/runtime/string.go:267
runtime.gostring(0x0?)
/usr/local/go/src/runtime/string.go:323 +0x3e fp=0x1c000c49df0 sp=0x1c000c49db0 pc=0x112bc745e
main._Cfunc_GoString(...)
_cgo_gotypes.go:72
main.MultiAccountGenerateAndDeriveAddresses(0x0?)
/Users/jenkins/workspace/rs_macos_x86_64_package_PR-15194/vendor/status-go/build/bin/statusgo-lib/main.go:44 +0x14 fp=0x1c000c49e10 sp=0x1c000c49df0 pc=0x114784514
_cgoexp_2cc8b9dd78ee_MultiAccountGenerateAndDeriveAddresses(0x7ff7b58acc90)
_cgo_gotypes.go:173 +0x1e fp=0x1c000c49e30 sp=0x1c000c49e10 pc=0x114786d1e
runtime.cgocallbackg1(0x114786d00, 0x1c000c49fe0?, 0x0)
/usr/local/go/src/runtime/cgocall.go:399 +0x2b3 fp=0x1c000c49f00 sp=0x1c000c49e30 pc=0x112b61053
runtime.cgocallbackg(0x0?, 0x0?, 0x0?)
/usr/local/go/src/runtime/cgocall.go:315 +0x125 fp=0x1c000c49f90 sp=0x1c000c49f00 pc=0x112b60d05
runtime.cgocallbackg(0x114786d00, 0x7ff7b58acc90, 0x0)
<autogenerated>:1 +0x29 fp=0x1c000c49fb8 sp=0x1c000c49f90 pc=0x112bcd9c9
runtime.cgocallback(0x0, 0x0, 0x0)
/usr/local/go/src/runtime/asm_amd64.s:1035 +0xcc fp=0x1c000c49fe0 sp=0x1c000c49fb8 pc=0x112bcafac
runtime.goexit()
/usr/local/go/src/runtime/asm_amd64.s:1650 +0x1 fp=0x1c000c49fe8 sp=0x1c000c49fe0 pc=0x112bcb1e1
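
One way to read a dump like this is to skip the Go runtime frames and the cgo-generated stubs and take the first remaining function, which here is main.MultiAccountGenerateAndDeriveAddresses. A rough sketch of that filtering, assuming the traceback layout shown above (the `firstAppFrame` helper is a hypothetical name used for illustration):

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// firstAppFrame scans a Go runtime crash dump and returns the fatal error
// message plus the first function that is neither in package runtime nor a
// cgo-generated stub (_Cfunc_*, _cgo*).
func firstAppFrame(trace string) (fatal, fn string) {
	sc := bufio.NewScanner(strings.NewReader(trace))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if strings.HasPrefix(line, "fatal error: ") {
			fatal = strings.TrimPrefix(line, "fatal error: ")
			continue
		}
		// Function lines end with an argument list; file lines do not.
		open := strings.LastIndex(line, "(")
		if open <= 0 || !strings.HasSuffix(line, ")") {
			continue
		}
		name := line[:open]
		if strings.HasPrefix(name, "runtime.") ||
			strings.HasPrefix(name, "_cgo") ||
			strings.Contains(name, "_Cfunc_") {
			continue
		}
		if fn == "" {
			fn = name
		}
	}
	return fatal, fn
}

func main() {
	trace := `fatal error: bad sweepgen in refill

goroutine 17 [running, locked to thread]:
runtime.throw({0x114969872?, 0x1c000c49d08?})
runtime.gostring(0x0?)
main._Cfunc_GoString(...)
_cgo_gotypes.go:72
main.MultiAccountGenerateAndDeriveAddresses(0x0?)
`
	fatal, fn := firstAppFrame(trace)
	fmt.Printf("fatal=%q first app frame=%s\n", fatal, fn)
}
```

The frames above the application function (runtime.gostring, runtime.mallocgc, runtime.(*mcache).refill) show that the failure surfaces while the runtime allocates the Go string for the C argument.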

Next steps would be to narrow down which line of code triggers this crash by pointing to a branch in status-go and decorating it with logs.
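
A possible shape for that log decoration, as a sketch only: a small helper that writes a marker with the caller's file and line straight to stderr, so the last marker printed before the crash brackets the failing statement. The `checkpoint` name and the labels are hypothetical and not taken from status-go.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"runtime"
)

// checkpoint writes "CHECKPOINT file:line label" to stderr and returns the
// same text. Writes to os.Stderr are not buffered by Go, so the marker is
// emitted before the next statement runs.
func checkpoint(label string) string {
	// Caller(1) reports the location of the code that called checkpoint.
	_, file, line, ok := runtime.Caller(1)
	if !ok {
		file, line = "unknown", 0
	}
	msg := fmt.Sprintf("CHECKPOINT %s:%d %s", filepath.Base(file), line, label)
	fmt.Fprintln(os.Stderr, msg)
	return msg
}

func main() {
	// Hypothetical placement around the statement under suspicion.
	checkpoint("before converting the C string")
	// ... statement under investigation would go here ...
	checkpoint("after converting the C string")
}
```

Because the crash shows up inside an allocation, a helper that allocates (as `fmt.Sprintf` does here) may itself shift where the failure appears, so the markers are a hint rather than proof of the faulty line.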