android / ndk

The Android Native Development Kit
[BUG] seccomp issues with asan on x86_64 API 27 #1298

Closed DanAlbert closed 1 year ago

DanAlbert commented 4 years ago

Description

2020-06-26 13:36:00.621 4433-4433/? A/DEBUG: *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
2020-06-26 13:36:00.621 4433-4433/? A/DEBUG: Build fingerprint: 'Android/sdk_phone_x86_64/generic_x86_64:8.1.0/OSM1.180201.023/4931629:userdebug/test-keys'
2020-06-26 13:36:00.621 4433-4433/? A/DEBUG: Revision: '0'
2020-06-26 13:36:00.621 4433-4433/? A/DEBUG: ABI: 'x86_64'
2020-06-26 13:36:00.621 4433-4433/? A/DEBUG: pid: 4426, tid: 4426, name: app_process64  >>> /system/bin/app_process64 <<<
2020-06-26 13:36:00.621 4433-4433/? A/DEBUG: signal 31 (SIGSYS), code 1 (SYS_SECCOMP), fault addr --------
2020-06-26 13:36:00.621 4433-4433/? A/DEBUG: Cause: seccomp prevented call to disallowed x86_64 system call 0
2020-06-26 13:36:00.621 4433-4433/? A/DEBUG:     rax 0000000000000059  rbx 0000735b2cca0020  rcx ffffffffffffffff  rdx 0000000000001000
2020-06-26 13:36:00.621 4433-4433/? A/DEBUG:     rsi 0000735b2cca0020  rdi 0000735b2cb68fe6
2020-06-26 13:36:00.621 4433-4433/? A/DEBUG:     r8  0000735b2cead9c0  r9  0000000000000000  r10 0000000080000000  r11 0000000000000246
2020-06-26 13:36:00.621 4433-4433/? A/DEBUG:     r12 0000735b2fe2a394  r13 0000735b2ce6de90  r14 0000000000001000  r15 0000735b2fe2d134
2020-06-26 13:36:00.621 4433-4433/? A/DEBUG:     cs  0000000000000033  ss  000000000000002b
2020-06-26 13:36:00.621 4433-4433/? A/DEBUG:     rip 0000735b2cb95f3e  rbp 0000000000000001  rsp 00007ffdff2a2200  eflags 0000000000000246
2020-06-26 13:36:00.622 4433-4433/? A/DEBUG: backtrace:
2020-06-26 13:36:00.622 4433-4433/? A/DEBUG:     #00 pc 000000000004ef3e  /data/app/com.android.developer.asantest-vQuLL4gBdLrwwi-20Wtorw==/lib/x86_64/libclang_rt.asan-x86_64-android.so (offset 0x48000)
2020-06-26 13:36:00.622 4433-4433/? A/DEBUG:     #01 pc 000000000004a19e  /data/app/com.android.developer.asantest-vQuLL4gBdLrwwi-20Wtorw==/lib/x86_64/libclang_rt.asan-x86_64-android.so (offset 0x48000)
2020-06-26 13:36:00.622 4433-4433/? A/DEBUG:     #02 pc 00000000000b89e8  /data/app/com.android.developer.asantest-vQuLL4gBdLrwwi-20Wtorw==/lib/x86_64/libclang_rt.asan-x86_64-android.so (offset 0x48000)
2020-06-26 13:36:00.622 4433-4433/? A/DEBUG:     #03 pc 000000000008b26a  /data/app/com.android.developer.asantest-vQuLL4gBdLrwwi-20Wtorw==/lib/x86_64/libclang_rt.asan-x86_64-android.so (offset 0x48000) (pthread_mutex_lock+42)
2020-06-26 13:36:00.622 4433-4433/? A/DEBUG:     #04 pc 00000000000aaeeb  /system/lib64/libc.so (jemalloc_constructor+91)
2020-06-26 13:36:00.622 4433-4433/? A/DEBUG:     #05 pc 0000000000027a9f  /system/bin/linker64 (__dl__ZL10call_arrayIPFviPPcS1_EEvPKcPT_mbS5_+255)
2020-06-26 13:36:00.622 4433-4433/? A/DEBUG:     #06 pc 0000000000027ce9  /system/bin/linker64 (__dl__ZN6soinfo17call_constructorsEv+441)
2020-06-26 13:36:00.622 4433-4433/? A/DEBUG:     #07 pc 0000000000027bc8  /system/bin/linker64 (__dl__ZN6soinfo17call_constructorsEv+152)
2020-06-26 13:36:00.622 4433-4433/? A/DEBUG:     #08 pc 0000000000027bc8  /system/bin/linker64 (__dl__ZN6soinfo17call_constructorsEv+152)
2020-06-26 13:36:00.622 4433-4433/? A/DEBUG:     #09 pc 00000000000237e0  /system/bin/linker64 (__dl___linker_init+3712)
2020-06-26 13:36:00.622 4433-4433/? A/DEBUG:     #10 pc 000000000002a5e7  /system/bin/linker64 (_start+7)
2020-06-26 13:36:00.622 4433-4433/? A/DEBUG:     #11 pc 0000000000000007  <unknown>

Isn't syscall 0 read (https://cs.android.com/android/platform/superproject/+/master:bionic/libc/kernel/uapi/asm-x86/asm/unistd_64.h;l=21;drc=bb9fcb46361ddb55aac7faf639de5088a09b9b8e)? That can't be right.

https://github.com/DanAlbert/asan-seccomp-repro repros on the API 27 x86_64 emulator.

Environment Details

rpattabi commented 4 years ago

We see this error on arm64-v8a device as well.

DanAlbert commented 4 years ago

@eugenis any idea?

shrukul commented 2 years ago

Hi @DanAlbert -

I'm trying to run Asan on our native library (android), and I'm seeing an error very similar to above. Here's the logcat -

2021-12-22 13:26:43.976 6944-6944/? A/DEBUG: *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
2021-12-22 13:26:43.976 6944-6944/? A/DEBUG: Build fingerprint: 'Android/sdk_phone_x86_64/generic_x86_64:9/PSR1.180720.012/4923214:userdebug/test-keys'
2021-12-22 13:26:43.976 6944-6944/? A/DEBUG: Revision: '0'
2021-12-22 13:26:43.976 6944-6944/? A/DEBUG: ABI: 'x86_64'
2021-12-22 13:26:43.976 6944-6944/? A/DEBUG: pid: 6941, tid: 6941, name: app_process64  >>> /system/bin/app_process64 <<<
2021-12-22 13:26:43.976 6944-6944/? A/DEBUG: signal 31 (SIGSYS), code 1 (SYS_SECCOMP), fault addr --------
2021-12-22 13:26:43.976 6944-6944/? A/DEBUG: Cause: seccomp prevented call to disallowed x86_64 system call 89
2021-12-22 13:26:43.976 6944-6944/? A/DEBUG:     rax 0000000000000059  rbx 00007267fd924b20  rcx 00007267fd818b1e  rdx 0000000000001000
2021-12-22 13:26:43.976 6944-6944/? A/DEBUG:     r8  00007ffe1c1ecfc8  r9  0000000000000001  r10 00007ffe1c1ec000  r11 0000000000000246
2021-12-22 13:26:43.976 6944-6944/? A/DEBUG:     r12 00007ffe1c1ec000  r13 00007ffe1c1ec298  r14 0000000000001000  r15 00007267ff344798
2021-12-22 13:26:43.976 6944-6944/? A/DEBUG:     rdi 00007267fd7ea9e3  rsi 00007267fd924b20
2021-12-22 13:26:43.976 6944-6944/? A/DEBUG:     rbp 00007ffe1c1cb1c0  rsp 00007ffe1c1ca8e0  rip 00007267fd818b1e
2021-12-22 13:26:43.976 6944-6944/? A/DEBUG: backtrace:
2021-12-22 13:26:43.976 6944-6944/? A/DEBUG:     #00 pc 000000000004db1e  /data/app/com.adobe.gude_test-ejX-VdtCm6DJLUsTlBtEcw==/lib/x86_64/libclang_rt.asan-x86_64-android.so (offset 0x46000)
2021-12-22 13:26:43.976 6944-6944/? A/DEBUG:     #01 pc 0000000000048604  /data/app/com.adobe.gude_test-ejX-VdtCm6DJLUsTlBtEcw==/lib/x86_64/libclang_rt.asan-x86_64-android.so (offset 0x46000)
2021-12-22 13:26:43.998 7371-7487/system_process W/NativeCrashListener: Couldn't find ProcessRecord for pid 6941
2021-12-22 13:26:44.000 1765-1765/? E//system/bin/tombstoned: Tombstone written to: /data/tombstones/tombstone_49
  1. seccomp prevented call to disallowed x86_64 system call 89 -> Does this indicate that readlink is failing?
  2. Any idea on what I'm doing wrong / how to get past this error?

NDK Version: r22b, 23.1.7779620
Build system: CMake
Host OS: macOS
ABI: x86_64 (others untested)
NDK API level: 28
Device API level: 28
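
On the readlink question: the number in the Cause line can be cross-checked against rax, which holds 0x59 (89) in this report. The following is a minimal Python sketch, not part of any NDK tooling. The `X86_64_SYSCALLS` table is a hand-copied subset of `unistd_64.h`, and `decode_seccomp_report` is a hypothetical helper name.

```python
import re

# Hand-copied subset of the x86_64 syscall table (asm/unistd_64.h).
X86_64_SYSCALLS = {
    0: "read",
    2: "open",
    4: "stat",
    89: "readlink",
    257: "openat",
    262: "newfstatat",
    267: "readlinkat",
}

CAUSE_RE = re.compile(r"disallowed x86_64 system call (\d+)")
RAX_RE = re.compile(r"\brax ([0-9a-f]{16})")


def decode_seccomp_report(tombstone: str) -> dict:
    """Extract the reported syscall number and the rax value from a tombstone."""
    cause = CAUSE_RE.search(tombstone)
    rax = RAX_RE.search(tombstone)
    if cause is None or rax is None:
        raise ValueError("no seccomp cause line or rax register found")
    reported = int(cause.group(1))
    rax_value = int(rax.group(1), 16)
    return {
        "reported": reported,
        "rax": rax_value,
        # Name lookup uses rax, assuming it holds the trapped syscall number.
        "name": X86_64_SYSCALLS.get(rax_value, "unknown"),
        "consistent": reported == rax_value,
    }


# Two lines taken from the API 28 report above.
api28_report = """
A/DEBUG: Cause: seccomp prevented call to disallowed x86_64 system call 89
A/DEBUG:     rax 0000000000000059  rbx 00007267fd924b20  rcx 00007267fd818b1e  rdx 0000000000001000
"""
print(decode_seccomp_report(api28_report))
```

Run against the API 27 report at the top of this issue, the same helper sees rax 0x59 but a reported call of 0, so `consistent` is False. That suggests the "system call 0" in that Cause line is not what was actually called, and that both crashes involve readlink.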

DanAlbert commented 2 years ago

@enh-google any idea? I'm not very familiar with seccomp and the results we've seen here seem completely broken.

shrukul commented 2 years ago

Hi @DanAlbert / @enh-google - I'm facing a similar issue with API level 30 as well. This time, the error is -

Cause: seccomp prevented call to disallowed x86_64 system call 4
  1. As per the documentation, it looks like NDK officially supports ASAN. Do you think there's something I'm doing wrong?
  2. I had an extra question, unrelated to this issue - To detect memory leaks in an NDK library, what is the preferred way in Android? Valgrind? (I'm fairly new to NDK, so asking you guys!)

Please note that I'm trying to run ASAN on an x86_64 Android emulator on macOS. Hope this is alright.

DanAlbert commented 2 years ago

As per the documentation, it looks like NDK officially supports ASAN. Do you think there's something I'm doing wrong?

We do support it and there definitely appears to be a bug here, but we don't know what it is yet.

I had an extra question, unrelated to this issue - To detect memory leaks in an NDK library, what is the preferred way in Android? Valgrind? (I'm fairly new to NDK, so asking you guys!)

https://github.com/android/ndk/issues/431. I don't know how to do this currently.

enh-google commented 2 years ago

i haven't had time to try to reproduce this yet, but how do we build asan? it looks like SANITIZER_USES_CANONICAL_LINUX_SYSCALLS is there to ensure that you use the new/non-legacy syscalls:

uptr internal_readlink(const char *path, char *buf, uptr bufsize) {
#if SANITIZER_USES_CANONICAL_LINUX_SYSCALLS
  return internal_syscall(SYSCALL(readlinkat), AT_FDCWD, (uptr)path, (uptr)buf,
                          bufsize);
#elif SANITIZER_OPENBSD
  return internal_syscall(SYSCALL(readlinkat), AT_FDCWD, (uptr)path, (uptr)buf,
                          bufsize);
#else
  return internal_syscall(SYSCALL(readlink), (uptr)path, (uptr)buf, bufsize);
#endif
}

but at the same time, we shouldn't even need that because we explicitly allow the legacy syscalls for exactly this reason:

# Needed by sanitizers (b/34606909, b/136777266).
int open:open(const char*, int, ...)  arm,x86,x86_64
int stat64:stat64(const char*, struct stat64*)  arm,x86
ssize_t readlink:readlink(const char*, char*, size_t)  arm,x86,x86_64

so something definitely doesn't make sense here...
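
To make the allowlist excerpt easier to query, here is a minimal Python sketch that parses only the three quoted lines. It is a simplified reading of the `return_type func:syscall(args) arch_list` layout, not bionic's real seccomp generator, and `parse_allowlist` and `is_allowed` are hypothetical helper names.

```python
# The three allowlist lines quoted above, verbatim.
ALLOWLIST_EXCERPT = """\
int open:open(const char*, int, ...)  arm,x86,x86_64
int stat64:stat64(const char*, struct stat64*)  arm,x86
ssize_t readlink:readlink(const char*, char*, size_t)  arm,x86,x86_64
"""


def parse_allowlist(text: str) -> dict:
    """Map each syscall name to the set of architectures it is listed for."""
    allowed = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # The architecture list is the last whitespace-separated field.
        signature, arch_list = line.rsplit(None, 1)
        # "int open:open(...)" -> "open:open" -> syscall name after the colon.
        func_and_syscall = signature.split("(", 1)[0].split()[-1]
        syscall_name = func_and_syscall.split(":")[-1]
        allowed.setdefault(syscall_name, set()).update(arch_list.split(","))
    return allowed


def is_allowed(allowed: dict, syscall_name: str, arch: str) -> bool:
    """True if the excerpt lists the syscall for the given architecture."""
    return arch in allowed.get(syscall_name, set())


allowed = parse_allowlist(ALLOWLIST_EXCERPT)
print(is_allowed(allowed, "readlink", "x86_64"))
print(is_allowed(allowed, "stat64", "x86_64"))
```

Per this excerpt, readlink is listed for x86_64, which is why a trap on syscall 89 is surprising. The excerpt alone does not say which Android releases carry these lines.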

shrukul commented 2 years ago

Hi @enh-google, sorry was that question intended for me? As I understand asan comes bundled with ndk, so I'm not very sure how it's built.

enh-google commented 2 years ago

hmm... trying to repro locally, i can't run asan at all:

2022-01-28 16:30:41.071 29496-29496/? I/com.example.myapplication: ==29496==Shadow memory range interleaves with an existing memory mapping. ASan cannot proceed correctly. ABORTING.
2022-01-28 16:30:41.071 29496-29496/? I/com.example.myapplication: ==29496==ASan shadow was supposed to be located in the [0x00007fff7000-0x10007fff7fff] range.
2022-01-28 16:30:41.071 29496-29496/? I/com.example.myapplication: ==29496==This might be related to ELF_ET_DYN_BASE change in Linux 4.12.
2022-01-28 16:30:41.071 29496-29496/? I/com.example.myapplication: ==29496==See https://github.com/google/sanitizers/issues/856 for possible workarounds.

one thing i did notice while following the instructions though is that we ask people to copy the compiler_rt .so file into their project --- so it's quite possible that people are (correctly) claiming to use a current NDK but not realizing that they might have an out-of-date compiler_rt .so file copied into their jniLibs directory?
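
One way to rule out a stale copy is to compare the runtime in the project against the one shipped in the NDK being used. This is a minimal Python sketch under that assumption. `runtime_is_current` is a hypothetical helper, and the paths passed to it depend on the project layout and NDK install location.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in 1 MiB chunks so large .so files are not read at once."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def runtime_is_current(project_copy: Path, ndk_copy: Path) -> bool:
    """True if the copied ASan runtime is byte-identical to the NDK's copy."""
    return sha256_of(project_copy) == sha256_of(ndk_copy)
```

Typical use would pass the `libclang_rt.asan-x86_64-android.so` under the project's jniLibs directory as the first argument and the same file name from the NDK's clang `lib/linux` directory as the second. A False result means the project is not running the runtime that matches its NDK.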

enh-google commented 2 years ago

ASan shadow was supposed to be located in the [0x00007fff7000-0x10007fff7fff] range.

from the maps, it looks like ART's "Sentinel fault page" is the issue?

    0x00007579b000-0x000075f9a000   [anon:dalvik-non moving space]
    0x0000ebad6000-0x0000ebad7000   [anon:dalvik-Sentinel fault page]
    0x5ddc46610000-0x5ddc46612000   /system/bin/app_process64

but that code is old enough that i'm mentioned in the git blame!
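
The overlap can be checked mechanically from the numbers quoted above. This is a minimal Python sketch that takes the shadow range from the ASan message and the three mappings from the maps excerpt. It assumes the shadow bounds are inclusive as printed and that map end addresses are exclusive, and `overlapping` is a hypothetical helper name.

```python
# Shadow range as printed by ASan (treated as inclusive on both ends).
SHADOW_LO = 0x00007FFF7000
SHADOW_HI = 0x10007FFF7FFF

# Mappings quoted from the maps output (end address treated as exclusive).
MAPPINGS = [
    (0x00007579B000, 0x000075F9A000, "[anon:dalvik-non moving space]"),
    (0x0000EBAD6000, 0x0000EBAD7000, "[anon:dalvik-Sentinel fault page]"),
    (0x5DDC46610000, 0x5DDC46612000, "/system/bin/app_process64"),
]


def overlapping(mappings, lo, hi):
    """Return the names of mappings that intersect the range [lo, hi]."""
    return [name for start, end, name in mappings if start <= hi and end > lo]


print(overlapping(MAPPINGS, SHADOW_LO, SHADOW_HI))
```

Of the three mappings quoted, only the Sentinel fault page falls inside the shadow range. The non-moving space ends below it and app_process64 starts above it.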

enh-google commented 2 years ago

or with x86-64 API 27, i get a null pointer dereference in the linker...

2022-01-28 16:50:10.631 3335-3335/? A/DEBUG: signal 11 (SIGSEGV), code 1 (SEGV_MAPERR), fault addr 0x0
...
2022-01-28 16:50:10.735 3335-3335/? A/DEBUG:     #00 pc 000000000000ee14  /system/bin/linker64 (__dl__Z14find_librariesP19android_namespace_tP6soinfoPKPKcmPS2_PNSt3__16vectorIS2_NS8_9allocatorIS2_EEEEmiPK17android_dlextinfobbRNS8_13unordered_mapIPKS1_9ElfReaderNS8_4hashISJ_EENS8_8equal_toISJ_EENSA_INS8_4pairIKSJ_SK_EEEEEEPNS9_IS0_NSA_IS0_EEEE+2260)
2022-01-28 16:50:10.735 3335-3335/? A/DEBUG:     #01 pc 0000000000010a73  /system/bin/linker64 (__dl__Z9do_dlopenPKciPK17android_dlextinfoPKv+1859)
2022-01-28 16:50:10.735 3335-3335/? A/DEBUG:     #02 pc 000000000000d349  /system/bin/linker64 (__dl__Z20__android_dlopen_extPKciPK17android_dlextinfoPKv+57)
2022-01-28 16:50:10.735 3335-3335/? A/DEBUG:     #03 pc 0000000000003015  /system/lib64/libnativeloader.so (android::OpenNativeLibrary(_JNIEnv*, int, char const*, _jobject*, _jstring*, bool*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>*)+389)
2022-01-28 16:50:10.735 3335-3335/? A/DEBUG:     #04 pc 000000000039bb2e  /system/lib64/libart.so (art::JavaVMExt::LoadNativeLibrary(_JNIEnv*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&, _jobject*, _jstring*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>*)+2350)
2022-01-28 16:50:10.735 3335-3335/? A/DEBUG:     #05 pc 00000000000040ce  /system/lib64/libopenjdkjvm.so (JVM_NativeLoad+302)

and the same for x86 API 27 too. for the record, this is with the Library/Android/sdk/ndk/21.2.6472646/toolchains/llvm/prebuilt/darwin-x86_64/lib64/clang/9.0.8/lib/linux/libclang_rt.asan-*-android.so compiler_rt files. why is that the newest NDK on my machine?!

enh-google commented 2 years ago

i get the same results with Library/Android/sdk/ndk/23.1.7779620/toolchains/llvm/prebuilt/darwin-x86_64/lib64/clang/12.0.8/lib/linux/libclang_rt.asan-x86_64-android.so though, so it's not just that...

enh-google commented 2 years ago

and on a real arm64 device i get

2022-01-28 17:35:22.080 24201-24201/com.example.myapplication I/com.example.myapplication: ==24201==ERROR: AddressSanitizer: SEGV on unknown address 0x1680001ead14f042 (pc 0x00742d686f38 bp 0x007feee7e810 sp 0x007feee7e7e0 T0)
2022-01-28 17:35:22.080 24201-24201/com.example.myapplication I/com.example.myapplication: ==24201==The signal is caused by a READ memory access.
2022-01-28 17:35:22.148 24201-24201/com.example.myapplication I/com.example.myapplication:     #0 0x742d686f38  (/data/app/~~fg3fRqrdKIb5a-nx87vleA==/com.example.myapplication-W_xlVbahk-5nIa3TYdCOwg==/lib/arm64/libmyapplication.so+0x1f38)
2022-01-28 17:35:22.148 24201-24201/com.example.myapplication I/com.example.myapplication:     #1 0x742d686d58  (/data/app/~~fg3fRqrdKIb5a-nx87vleA==/com.example.myapplication-W_xlVbahk-5nIa3TYdCOwg==/lib/arm64/libmyapplication.so+0x1d58)

so i haven't actually managed to get asan working anywhere yet...

shrukul commented 2 years ago

Thanks @enh-google for the investigation and spending your time on the issue!

enh-google commented 2 years ago

yeah, sorry for the obviously poor state this is currently in! we'll keep looking, but in the meantime i did want to check that you know about gwp-asan (https://developer.android.com/ndk/guides/gwp-asan) which -- for the cost of one line in your AndroidManifest.xml -- you can use in the field to detect memory issues (with crashes showing up in the play developer console [or whatever custom solution you're using] just like normal), or hwasan (https://source.android.com/devices/tech/debug/hwasan) which -- while it involves running a custom OS build and is arm64-only -- is really fast hardware-assisted asan; fast enough that we have people using these builds for dogfooding.

enh-google commented 2 years ago

oops, sorry for not updating publicly on this... i've written a lot of words internally but didn't realize until it was pointed out to me by @DanAlbert that i'd said nothing externally.

we started seeing this internally, and talked to the folks who developed the sanitizers. it looks like we've been building them in a way that's incompatible with our seccomp filter. for Android T we've basically just added the missing syscall to the seccomp allowlist. but the folks who owned the sanitizers think it's time to retire this code anyway and just always use the new syscalls (such as openat(2) vs open(2)). so although we can't normally fix the past, once we've pulled that upstream change and rebuilt everything, NDKs after that point should fix this because they won't try to use the obsolete system calls. (no ETA on that yet. http://b/229989971 is the internal bug for any googler who wants to check. https://reviews.llvm.org/D124212 is the upstream LLVM change we're waiting for. that link should work for everyone.)
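
As an illustration of what "always use the new syscalls" means, here is a minimal Python sketch that rewrites a legacy path-based call into its `*at` equivalent. It models the idea only and is not taken from the LLVM change. The `CANONICAL` table and `to_canonical` helper are hypothetical, and the syscall numbers are the x86_64 ones.

```python
AT_FDCWD = -100  # Linux value meaning "relative to the current working directory"

# legacy name -> (canonical name, x86_64 syscall number of the canonical call)
CANONICAL = {
    "open": ("openat", 257),
    "readlink": ("readlinkat", 267),
    "stat": ("newfstatat", 262),
}


def to_canonical(name: str, args: tuple) -> tuple:
    """Rewrite a legacy call as its *at form; other calls pass through unchanged."""
    if name not in CANONICAL:
        return name, args
    new_name, _number = CANONICAL[name]
    new_args = (AT_FDCWD, *args)
    if name == "stat":
        new_args += (0,)  # newfstatat takes a trailing flags argument
    return new_name, new_args


print(to_canonical("readlink", ("/proc/self/exe", "buf", 4096)))
```

The readlink case mirrors the `SANITIZER_USES_CANONICAL_LINUX_SYSCALLS` branch of `internal_readlink` quoted earlier in this thread, where `AT_FDCWD` is passed as the first argument to `readlinkat`.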

DanAlbert commented 2 years ago

NDKs after that point

Will be r26 at the earliest. Not a regression so will not be backported to r25 (I expect it's not a clean cherry-pick, so it would be a risky backport).

siddarthkay commented 1 year ago

it would be a risky backport).

Hi @DanAlbert
Can you please let me know if the fix for this issue has been backported to NDK "25.2.9519653" or some other version of r25?

Thank you

DanAlbert commented 1 year ago

Nothing will be backported to r25. It is not supported.