status-im / status-mobile

a free (libre) open source, mobile OS for Ethereum
https://status.app
Mozilla Public License 2.0

Can't install e2e apk on Android 12 and higher #17265

Closed yevh-berdnyk closed 10 months ago

yevh-berdnyk commented 1 year ago

Bug Report

Problem

To keep our e2e tests up to date, we need to use a relevant Android version. SauceLabs emulators are x86_64 only, and it is currently not possible to install and use the e2e application on such emulators with Android 12 or higher.

Expected behavior

The e2e apk is installed on Android 12

Actual behavior

Screenshot 2023-09-12 at 17 21 13

Reproduction

  1. Create an Android emulator with x86 or x86_64 ABI and API level 31 or more
  2. Try installing the e2e application

Additional Information

churik commented 1 year ago

@yakimant @siddarthkay

Maybe you have an idea of how we can get a compatible build to try out?

Thank you in advance for your help!

siddarthkay commented 1 year ago

Hi @churik, could you check whether installing a build from https://github.com/status-im/status-mobile/pull/17241 works? We've bumped the SDK versions in that upgrade. ref: https://ci.status.im/job/status-mobile/job/prs/job/android-e2e/job/PR-17241/12/artifact/result/StatusIm-Mobile-230914-143157-1b2bed-pr17241-x86.apk

yevh-berdnyk commented 1 year ago

Hi @siddarthkay, I've just tried installing this apk on a local Android 12 (x86_64) emulator and I get the same issue: no matching ABIs. The result is the same with the latest apk from the mentioned PR.
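For context on the "no matching ABIs" failure: an APK is a zip archive whose native libraries live under lib/<abi>/, and installation is rejected when none of those ABI directories is supported by the device. Below is a minimal Python sketch of that check. The in-memory archive and the device ABI list are made-up stand-ins, not the real CI artifact or emulator output; on a real device the list could be read with `adb shell getprop ro.product.cpu.abilist`.

```python
import io
import zipfile


def apk_abis(apk_file):
    """Return the ABI directory names found under lib/ in an APK (zip) file."""
    with zipfile.ZipFile(apk_file) as apk:
        return sorted(
            {
                name.split("/")[1]
                for name in apk.namelist()
                if name.startswith("lib/") and name.count("/") >= 2
            }
        )


def has_matching_abi(apk_file, device_abis):
    """True if the APK has no native code or shares at least one ABI with the device."""
    abis = apk_abis(apk_file)
    return not abis or any(abi in device_abis for abi in abis)


# Hypothetical stand-in for an x86-only e2e build (not the real artifact).
fake_apk = io.BytesIO()
with zipfile.ZipFile(fake_apk, "w") as archive:
    archive.writestr("lib/x86/libgojni.so", b"")

# Assumed ABI list for an Android 12 x86_64 emulator image.
device_abis = ["x86_64", "arm64-v8a"]

print(apk_abis(fake_apk))                       # ['x86']
print(has_matching_abi(fake_apk, device_abis))  # False
```

Running the same check against a downloaded apk (pass its path instead of the in-memory buffer) shows which ABIs a given CI build actually contains.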

yakimant commented 1 year ago

Maybe we need to switch to x86_64 for e2e.

As far as I can see, system images starting with android-31 (Android 12) are no longer available for x86, so they probably expect binaries to be x86_64 too:

❯ sdkmanager --list | grep system-images | grep "google_apis;"| grep x86 | awk '{print $1}' | sort | uniq
system-images;android-10;google_apis;x86
system-images;android-15;google_apis;x86
system-images;android-16;google_apis;x86
system-images;android-17;google_apis;x86
system-images;android-18;google_apis;x86
system-images;android-19;google_apis;x86
system-images;android-21;google_apis;x86
system-images;android-21;google_apis;x86_64
system-images;android-22;google_apis;x86
system-images;android-22;google_apis;x86_64
system-images;android-23;google_apis;x86
system-images;android-23;google_apis;x86_64
system-images;android-24;google_apis;x86
system-images;android-24;google_apis;x86_64
system-images;android-25;google_apis;x86
system-images;android-25;google_apis;x86_64
system-images;android-26;google_apis;x86
system-images;android-26;google_apis;x86_64
system-images;android-27;google_apis;x86
system-images;android-28;google_apis;x86
system-images;android-28;google_apis;x86_64
system-images;android-29;google_apis;x86
system-images;android-29;google_apis;x86_64
system-images;android-30;google_apis;x86
system-images;android-30;google_apis;x86_64
system-images;android-31;google_apis;x86_64
system-images;android-32;google_apis;x86_64
system-images;android-33;google_apis;x86_64
system-images;android-34;google_apis;x86_64

Let me create a build for x86_64 to test.
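The listing above can also be checked mechanically. This is a small Python sketch that groups the `system-images;android-N;google_apis;ABI` lines by API level and reports the levels that ship only x86_64. The sample input is a hand-picked subset of the listing, and the function names are illustrative only.

```python
from collections import defaultdict


def abis_by_api_level(lines):
    """Map API level -> set of ABIs, from 'system-images;android-N;tag;abi' lines."""
    levels = defaultdict(set)
    for line in lines:
        parts = line.strip().split(";")
        if len(parts) != 4 or parts[0] != "system-images":
            continue
        suffix = parts[1].rpartition("-")[2]
        if not suffix.isdigit():
            continue  # skip preview images that use a codename instead of a number
        levels[int(suffix)].add(parts[3])
    return dict(levels)


def x86_64_only_levels(lines):
    """Return the sorted API levels for which only an x86_64 image is listed."""
    return sorted(
        level
        for level, abis in abis_by_api_level(lines).items()
        if abis == {"x86_64"}
    )


sample = [
    "system-images;android-27;google_apis;x86",
    "system-images;android-30;google_apis;x86",
    "system-images;android-30;google_apis;x86_64",
    "system-images;android-31;google_apis;x86_64",
    "system-images;android-34;google_apis;x86_64",
]

print(x86_64_only_levels(sample))  # [31, 34]
```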

yakimant commented 1 year ago

@churik, can you please try this one: https://ci.status.im/job/status-mobile/job/prs/job/android-e2e/job/PR-17335/1/artifact/result/StatusIm-Mobile-230919-125807-11897b-pr17335-x86_64.apk

I am on an ARM CPU, so x86_64 doesn't run without additional setup, unfortunately.

yakimant commented 1 year ago

Related pages:

siddarthkay commented 1 year ago

Hi @yakimant, same thing for me back when I tried: x86_64 doesn't run without additional setup, unfortunately. I will try on my Linux machine tomorrow.

yakimant commented 1 year ago

x86 & x86_64 combined: https://ci.status.im/job/status-mobile/job/prs/job/android-e2e/job/PR-17335/2/artifact/result/StatusIm-Mobile-230919-135732-11897b-pr17335-universal.apk

yakimant commented 1 year ago

This can be tested locally too:

  1. Run an x86_64 emulator (check with avdmanager list avd).
  2. make run-android should build for x86_64 (see the first lines of the log). If not, you can also run ANDROID_ABI_INCLUDE=x86_64 make run-android to enforce the build architecture.

@siddarthkay, can you please try to reproduce it locally?

churik commented 1 year ago

So, with builds from https://github.com/status-im/status-mobile/issues/17265#issuecomment-1725478295 I'm able to install the apk, but the app fails after creating the profile (reproduced on both Android 12 and Android 10).

Logcat: logcat (26).log

It is a blocker for e2e. Can you please assist, @siddarthkay?

siddarthkay commented 1 year ago

Sure I will @churik ! I'm afk today but will be able to take a look first thing tomorrow 👍🏻

siddarthkay commented 1 year ago

From a first look at the logcat, this seems to be the relevant part of the crash log (keeping it very verbose for now):

--------- beginning of crash
09-19 13:22:02.386  6214  6327 F libc    : Fatal signal 31 (SIGSYS), code 1 (SYS_SECCOMP) in tid 6327 (create_react_co), pid 6214 (tus.ethereum.pr)
09-19 13:22:02.434   763   939 D EGL_emulation: app_time_stats: avg=15518.90ms min=15518.90ms max=15518.90ms count=1
09-19 13:22:02.450  6739  6739 I crash_dump64: obtaining output fd from tombstoned, type: kDebuggerdTombstoneProto
09-19 13:22:02.451   294   294 I tombstoned: received crash request for pid 6327
09-19 13:22:02.452  6739  6739 I crash_dump64: performing dump of process 6214 (target tid = 6327)
09-19 13:22:02.459  6739  6739 E DEBUG   : failed to read /proc/uptime: Permission denied
09-19 13:22:02.517  6739  6739 W unwind  : Failed to initialize DEX file support: dlopen failed: library "libdexfile.so" not found
09-19 13:22:02.552  6142  6188 I appium  : channel read: POST /session/eaf577e7-fbe0-445c-84cc-2c2ae7edc0a5/element
09-19 13:22:02.552  6142  6188 I appium  : FindElement command
09-19 13:22:02.553  6142  6188 I appium  : method: 'xpath', selector: '//*[@text="Maybe later"]'
09-19 13:22:02.554  6142  6188 I appium  : Waiting up to 10000ms for the device to idle
09-19 13:22:02.754  6739  6739 F DEBUG   : *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
09-19 13:22:02.754  6739  6739 F DEBUG   : Build fingerprint: 'google/sdk_gphone64_x86_64/emulator64_x86_64_arm64:12/SE1A.220826.008/10564458:userdebug/dev-keys'
09-19 13:22:02.754  6739  6739 F DEBUG   : Revision: '0'
09-19 13:22:02.754  6739  6739 F DEBUG   : ABI: 'x86_64'
09-19 13:22:02.754  6739  6739 F DEBUG   : Timestamp: 2023-09-19 13:22:02.459559125+0000
09-19 13:22:02.754  6739  6739 F DEBUG   : Process uptime: 0s
09-19 13:22:02.754  6739  6739 F DEBUG   : Cmdline: im.status.ethereum.pr
09-19 13:22:02.754  6739  6739 F DEBUG   : pid: 6214, tid: 6327, name: create_react_co  >>> im.status.ethereum.pr <<<
09-19 13:22:02.754  6739  6739 F DEBUG   : uid: 10149
09-19 13:22:02.754  6739  6739 F DEBUG   : signal 31 (SIGSYS), code 1 (SYS_SECCOMP), fault addr --------
09-19 13:22:02.754  6739  6739 F DEBUG   : Cause: seccomp prevented call to disallowed x86_64 system call 232
09-19 13:22:02.754  6739  6739 F DEBUG   :     rax 00000000000000e8  rbx 00000000000000c0  rcx 000077f062a8f48e  rdx 0000000000000001
09-19 13:22:02.754  6739  6739 F DEBUG   :     r8  0000000000000000  r9  0000000000000000  r10 ffffffffffffffff  r11 0000000000000202
09-19 13:22:02.754  6739  6739 F DEBUG   :     r12 0000000000000000  r13 0000000000000001  r14 000000c000603520  r15 0000000000000001
09-19 13:22:02.754  6739  6739 F DEBUG   :     rdi 00000000000000c0  rsi 000000c000a60f84
09-19 13:22:02.754  6739  6739 F DEBUG   :     rbp 000000c000a60e08  rsp 000000c000a60dc8  rip 000077f062a8f48e
09-19 13:22:02.754  6739  6739 F DEBUG   : backtrace:
09-19 13:22:02.754  6739  6739 F DEBUG   :       #00 pc 00000000012ef48e  /data/app/~~T7K40a971VhyO8hwRX-9Ow==/im.status.ethereum.pr-WWr32Mrjmjy4CYqwDI9a3Q==/lib/x86_64/libgojni.so
09-19 13:22:02.780   294   294 E tombstoned: Tombstone written to: tombstone_01
09-19 13:22:02.780   588  6744 I am_crash: [588,0,im.status.ethereum.pr,951598660,Native crash,Bad system call,unknown,0]
09-19 13:22:02.781   588  6745 I DropBoxManagerService: add tag=data_app_native_crash isTagEnabled=true flags=0x2
09-19 13:22:02.782   588  6744 W ActivityTaskManager:   Force finishing activity im.status.ethereum.pr/im.status.ethereum.MainActivity
09-19 13:22:02.782   588  6744 I wm_finish_activity: [0,153421236,20,im.status.ethereum.pr/im.status.ethereum.MainActivity,force-crash]

Maybe something to do with method: 'xpath', selector: '//*[@text="Maybe later"]'. Investigating further.
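When a logcat dump is this verbose, the line that matters most is the debuggerd "Cause:" entry. Here is a minimal Python sketch that pulls the blocked syscall out of a logcat text. The regular expression is written against the exact wording seen in the log above, so treat it as an assumption that other Android versions phrase the message the same way.

```python
import re

# Pattern based on the "Cause:" line in the crash log above.
SECCOMP_CAUSE = re.compile(
    r"seccomp prevented call to disallowed (\S+) system call (\d+)"
)


def find_blocked_syscalls(logcat_text):
    """Return (architecture, syscall number) pairs reported as blocked by seccomp."""
    return [
        (match.group(1), int(match.group(2)))
        for match in SECCOMP_CAUSE.finditer(logcat_text)
    ]


sample = """\
09-19 13:22:02.754  6739  6739 F DEBUG   : signal 31 (SIGSYS), code 1 (SYS_SECCOMP), fault addr --------
09-19 13:22:02.754  6739  6739 F DEBUG   : Cause: seccomp prevented call to disallowed x86_64 system call 232
09-19 13:22:02.754  6739  6739 F DEBUG   : backtrace:
"""

print(find_blocked_syscalls(sample))  # [('x86_64', 232)]
```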

yakimant commented 1 year ago

I pinned the builds so that the artifacts are not cleaned up, in case you need them:

siddarthkay commented 1 year ago

2. run ANDROID_ABI_INCLUDE=x86_64 make run-android

This works for me on Linux and I was able to run the app. The builds do get generated on Mac; however, I cannot get an x86_64 emulator to work on an ARM Mac.

I was able to reproduce the crash and it seems to be some very weird error.

2023-09-24 22:13:57.147 10125-10295 libc    im.status.ethereum.debug    A  Fatal signal 31 (SIGSYS), code 1 (SYS_SECCOMP), syscall 232 in tid 10295 (create_react_co), pid 10125 (.ethereum.debug)

Flipper also saw this crash on x86_64 emulators, and it seems they fixed it by bumping ndkVersion, ref: https://github.com/facebook/flipper/issues/482

Maybe that can also help us. Investigating further.

siddarthkay commented 1 year ago

maybe something to do with method: 'xpath', selector: '//*[@text="Maybe later"]'

Update on this statement: the issue has nothing to do with the e2e code; the app itself crashes on x86_64 emulators.

siddarthkay commented 1 year ago

The error we are interested in is documented here: https://source.android.com/docs/core/tests/debug/native-crash#seccomp

We can probably ask Android to go easy on the illegal syscall for a bit in debug environments like this: adb shell setenforce 0 && adb stop && adb start

siddarthkay commented 1 year ago

Another error which seems highly suspicious is this one:

09-25 07:03:25.267 14380 14544 E GoLog : [0925/070325.267701:ERROR:elf_dynamic_array_reader.h(64)] tag not found

It seems to be a known error when apps crash in a Linux environment, ref vscode: https://github.com/microsoft/vscode/issues/180648

siddarthkay commented 1 year ago

adb shell setenforce 0 && adb stop && adb start

This did not help much, and even if it had, it would not be of any use since e2e builds are not debug variants. I have a hunch that this crash is caused by one of these 2 libraries:

But I don't have proof yet. The way forward should be to comment out the cljs code right after the set password screen and try to narrow down the function call which triggers this crash.

jakubgs commented 1 year ago

The crash starts with:

F libc    : Fatal signal 31 (SIGSYS), code 1 (SYS_SECCOMP) in tid 6327 (create_react_co), pid 6214 (tus.ethereum.pr)

According to this post:

Android 8 O (SDK 26) limits which system calls are allowed for security reasons by enabling a feature called secure computing in the Linux kernel.

This means only whitelisted calls can be executed and that any other call will result in signal 31 (SIGSYS), code 1 (SYS_SECCOMP), like you are experiencing. You will need to examine the stack trace of this signal to find out which system call was not allowed (which was not listed completely in your question).

You can find a list of allowed calls here. Any other call is not allowed.

https://stackoverflow.com/questions/45313486/fatal-signal-31-error-when-upgrading-android-app-from-api-24-to-26

jakubgs commented 1 year ago

And based on this line:

Cause: seccomp prevented call to disallowed x86_64 system call 232

It seems pretty obvious that the issue indeed is the Seccomp filter: https://android-developers.googleblog.com/2017/07/seccomp-filter-in-android-o.html

jakubgs commented 1 year ago

I thought we already use API level 31: https://github.com/status-im/status-mobile/blob/2df7a7cf6d46c8d1add73b8965ce8b04e6f7d014/android/gradle.properties#L26-L27

And I saw E2E tests run and mostly succeed, so I'm confused. When does this issue manifest?

siddarthkay commented 1 year ago

Noting down the flow of what happens when we press the confirm password button.

It does not wait for the entire 7 seconds; the crash generally happens within the first 2 seconds, almost immediately after we see the Generating Keys screen.

The crash could be due to the following :

At this moment I doubt everything and will try to negate assumptions one by one until I can get the crash to stop.

The app does not crash instantly on launch, which means the x86_64 emulator does not like whatever happens on the Generating Keys screen.

siddarthkay commented 1 year ago

Removing the function call to createAccountAndLogin caused the crash to go away. The crash is indeed caused by a panic on the status-go side. Trying to narrow down the why so that we can tackle this further. ref: https://github.com/status-im/status-go/compare/develop...login-crash-x86-64

siddarthkay commented 12 months ago

I tried to add some extensive file permissions in a branch I made from the react-native-0.72 upgrade branch here to see if that fixes the issue : https://github.com/status-im/status-mobile/compare/develop...login-crash-x86-64

It turns out the crash was not related to file permissions at the OS level, but to the operations that happen on the status-go side after the keystore is generated.

I narrowed down the crash to this function call g.store(childAccount, password) which was here : account/generator/generator.go

I also added extensive logs inside that function ref : https://github.com/status-im/status-go/commit/343e216384d3e372b148df4891ac24cce36298bc to see what went wrong and to my surprise it all looked okay.

I then suspected the root cause to be somewhere in the go-ethereum code, but I was not too sure.

siddarthkay commented 12 months ago

I had a conversation with @bitgamma and he gave me some awesome insights! The crash stack trace says : syscall 232

syscall 232 according to docs is epoll_wait : https://chromium.googlesource.com/chromiumos/docs/+/master/constants/syscalls.md#x86_64-64_bit:~:text=232-,epoll_wait,-man/%20cs

So most likely the Go code which calls epoll_wait misbehaves and crashes on x86_64 on Android 11 onwards.

References to that are : https://github.com/status-im/status-go/blob/develop/rtt/rtt.go#L76 and https://github.com/status-im/tcp-shaker/blob/master/socket_linux.go#L85 cc @jakubgs Now this seems like a good spot to further continue investigation.
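The register dump in the earlier crash log agrees with this reading: on x86_64 Linux the syscall number is passed in rax, and the dump shows rax 00000000000000e8. Here is a minimal Python sketch of the lookup. The table is only the epoll subset of the x86_64 syscall table, typed in by hand for illustration.

```python
# epoll-related entries of the Linux x86_64 syscall table (subset, for illustration).
X86_64_EPOLL_SYSCALLS = {
    213: "epoll_create",
    232: "epoll_wait",
    233: "epoll_ctl",
    281: "epoll_pwait",
    291: "epoll_create1",
}


def syscall_from_rax(rax_hex):
    """Decode the hex rax value from a tombstone into (number, name)."""
    number = int(rax_hex, 16)
    return number, X86_64_EPOLL_SYSCALLS.get(number, "unknown")


# rax value copied from the register dump in the crash log.
print(syscall_from_rax("00000000000000e8"))  # (232, 'epoll_wait')
```

Note that epoll_wait (232) and epoll_pwait (281) are separate syscalls on x86_64, which is why the exact number in the "Cause:" line is worth decoding.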

siddarthkay commented 12 months ago

possible spots in status-go that make epoll_wait calls are :

There could be more

siddarthkay commented 12 months ago

This issue at android ndk repo caught my attention : https://github.com/android/ndk/issues/1298

It's a seccomp issue with x86_64 emulators which was fixed in NDK r26 and probably backported to NDK r25 (not sure). The error message is not exactly the same as what we face here, but I figured it's worth a shot to upgrade the NDK to this version to see if it fixes our crash.

We didn't have ndkVersion 26 in our nixpkgs, so I added it to our fork here: https://github.com/status-im/nixpkgs/commit/cf1c475a5141631ff85d48fbb5e015945df50d37

Unfortunately I couldn't get status-go to build on my mac and it would fail with this error :

 > Building status-go for: android/arm,android/arm64,android/386
 > /nix/store/m1iynw68yqby0xi18djqfqvqh6z52528-gomobile-unstable-2022-05-18/bin/gomobile: go build -tags gowaku_skip_migrations,gowaku_no_rln -ldflags -X github.com/status-im/status-go/params.GitCommit=27b770c41bd2896ebc30b01985eeea1fbe642fba -X github.com/status-im/status-go/params.IpfsGatewayURL=https://ipfs.status.im/ -X github.com/status-im/status-go/params.Version=0.171.6 -s -w -buildmode=c-shared -o=/private/tmp/nix-build-status-go-0.171.6-27b770c-android.drv-0/gomobile-work/android/src/main/jniLibs/arm64-v8a/libgojni.so ./gobind failed: exit status 2
 > # github.com/status-im/status-go/vendor/github.com/mutecomm/go-sqlcipher/v4
 > aesce.c:194:18: error: always_inline function 'vaesimcq_u8' requires target feature 'aes', but would be inlined into function 'mbedtls_aesce_inverse_key' that is compiled without support for 'aes'
make: *** [build-android] Error 1

Probably @caybro or @alaibe might have an idea on how to get this to work.

I traced the error down to https://github.com/status-im/go-sqlcipher/blob/master/aesce.c#L194 in our fork of the go-sqlcipher package, but I wasn't sure if it's a build flag issue or something else.

I also tried ndkVersion 25.2.9519653. I do get a successful build, but on x86_64 emulators the app crashes right after the generating keys stage, which means that the issue is still present :)

siddarthkay commented 11 months ago

The ndk upgrade path was a dead end.

possible spots in status-go that make epoll_wait calls are :

Patching these libraries with custom build flags to remove the epoll_wait syscall fixes the crash and is a decent enough solution to move forward.