Closed DanAlbert closed 1 year ago
We see this error on an arm64-v8a device as well.
@eugenis any idea?
Hi @DanAlbert -
I'm trying to run Asan on our native library (android), and I'm seeing an error very similar to above. Here's the logcat -
2021-12-22 13:26:43.976 6944-6944/? A/DEBUG: *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
2021-12-22 13:26:43.976 6944-6944/? A/DEBUG: Build fingerprint: 'Android/sdk_phone_x86_64/generic_x86_64:9/PSR1.180720.012/4923214:userdebug/test-keys'
2021-12-22 13:26:43.976 6944-6944/? A/DEBUG: Revision: '0'
2021-12-22 13:26:43.976 6944-6944/? A/DEBUG: ABI: 'x86_64'
2021-12-22 13:26:43.976 6944-6944/? A/DEBUG: pid: 6941, tid: 6941, name: app_process64 >>> /system/bin/app_process64 <<<
2021-12-22 13:26:43.976 6944-6944/? A/DEBUG: signal 31 (SIGSYS), code 1 (SYS_SECCOMP), fault addr --------
2021-12-22 13:26:43.976 6944-6944/? A/DEBUG: Cause: seccomp prevented call to disallowed x86_64 system call 89
2021-12-22 13:26:43.976 6944-6944/? A/DEBUG: rax 0000000000000059 rbx 00007267fd924b20 rcx 00007267fd818b1e rdx 0000000000001000
2021-12-22 13:26:43.976 6944-6944/? A/DEBUG: r8 00007ffe1c1ecfc8 r9 0000000000000001 r10 00007ffe1c1ec000 r11 0000000000000246
2021-12-22 13:26:43.976 6944-6944/? A/DEBUG: r12 00007ffe1c1ec000 r13 00007ffe1c1ec298 r14 0000000000001000 r15 00007267ff344798
2021-12-22 13:26:43.976 6944-6944/? A/DEBUG: rdi 00007267fd7ea9e3 rsi 00007267fd924b20
2021-12-22 13:26:43.976 6944-6944/? A/DEBUG: rbp 00007ffe1c1cb1c0 rsp 00007ffe1c1ca8e0 rip 00007267fd818b1e
2021-12-22 13:26:43.976 6944-6944/? A/DEBUG: backtrace:
2021-12-22 13:26:43.976 6944-6944/? A/DEBUG: #00 pc 000000000004db1e /data/app/com.adobe.gude_test-ejX-VdtCm6DJLUsTlBtEcw==/lib/x86_64/libclang_rt.asan-x86_64-android.so (offset 0x46000)
2021-12-22 13:26:43.976 6944-6944/? A/DEBUG: #01 pc 0000000000048604 /data/app/com.adobe.gude_test-ejX-VdtCm6DJLUsTlBtEcw==/lib/x86_64/libclang_rt.asan-x86_64-android.so (offset 0x46000)
2021-12-22 13:26:43.998 7371-7487/system_process W/NativeCrashListener: Couldn't find ProcessRecord for pid 6941
2021-12-22 13:26:44.000 1765-1765/? E//system/bin/tombstoned: Tombstone written to: /data/tombstones/tombstone_49
seccomp prevented call to disallowed x86_64 system call 89
-> Does this indicate that readlink is failing?
NDK Version: r22b, 23.1.7779620
Build system: CMake
Host OS: macOS
ABI: x86_64 (others untested)
NDK API level: 28
Device API level: 28
@enh-google any idea? I'm not very familiar with seccomp and the results we've seen here seem completely broken.
Hi @DanAlbert / @enh-google - I'm facing a similar issue with API level 30 as well. This time, the error is -
Cause: seccomp prevented call to disallowed x86_64 system call 4
Please note that I'm trying to run ASan on an x86_64 Android emulator on macOS. Hope this is alright.
As per the documentation, it looks like NDK officially supports ASAN. Do you think there's something I'm doing wrong?
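For reference, the two numbers reported so far can be decoded with a small lookup. This is only a sketch: the table is a hand-copied subset of the x86_64 Linux syscall numbers, and the helper name `x86_64_syscall_name` is made up for illustration (it is not part of the NDK or of the sanitizers).

```cpp
#include <string>

// Hypothetical helper: maps a few x86_64 Linux syscall numbers to names.
// Hand-copied subset of the x86_64 table; other architectures use
// different numbers, so this does not apply to arm64 or 32-bit x86.
std::string x86_64_syscall_name(long nr) {
    switch (nr) {
        case 0:   return "read";
        case 2:   return "open";
        case 4:   return "stat";
        case 89:  return "readlink";
        case 257: return "openat";
        case 262: return "newfstatat";
        case 267: return "readlinkat";
        default:  return "unknown (not in this subset)";
    }
}
```

`x86_64_syscall_name(89)` gives `"readlink"` and `x86_64_syscall_name(4)` gives `"stat"`, so both crashes are on legacy path-based syscalls rather than their `*at` replacements.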
We do support it and there definitely appears to be a bug here, but we don't know what it is yet.
I had an extra question, unrelated to this issue - To detect memory leaks in an NDK library, what is the preferred way in Android? Valgrind? (I'm fairly new to NDK, so asking you guys!)
https://github.com/android/ndk/issues/431. I don't know how to do this currently.
i haven't had time to try to reproduce this yet, but how do we build asan? it looks like SANITIZER_USES_CANONICAL_LINUX_SYSCALLS is there to ensure that you use the new/non-legacy syscalls:
uptr internal_readlink(const char *path, char *buf, uptr bufsize) {
#if SANITIZER_USES_CANONICAL_LINUX_SYSCALLS
  return internal_syscall(SYSCALL(readlinkat), AT_FDCWD, (uptr)path, (uptr)buf,
                          bufsize);
#elif SANITIZER_OPENBSD
  return internal_syscall(SYSCALL(readlinkat), AT_FDCWD, (uptr)path, (uptr)buf,
                          bufsize);
#else
  return internal_syscall(SYSCALL(readlink), (uptr)path, (uptr)buf, bufsize);
#endif
}
but at the same time, we shouldn't even need that because we explicitly allow the legacy syscalls for exactly this reason:
# Needed by sanitizers (b/34606909, b/136777266).
int open:open(const char*, int, ...) arm,x86,x86_64
int stat64:stat64(const char*, struct stat64*) arm,x86
ssize_t readlink:readlink(const char*, char*, size_t) arm,x86,x86_64
so something definitely doesn't make sense here...
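One way to read the three allowlist entries quoted above is to check which architectures each line names. The sketch below assumes, based only on those three lines, that the last whitespace-separated field is a comma-separated architecture list. The function name `allowlist_line_covers` is hypothetical and this is not how bionic itself processes the file.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: returns true if the final whitespace-separated field
// of an allowlist line (assumed to be the comma-separated architecture list)
// contains `arch`.
bool allowlist_line_covers(const std::string& line, const std::string& arch) {
    std::istringstream fields(line);
    std::string field, last;
    while (fields >> field) {
        last = field;  // remember the final field on the line
    }
    std::istringstream archs(last);
    std::string item;
    while (std::getline(archs, item, ',')) {
        if (item == arch) return true;
    }
    return false;
}
```

Applied to the quoted lines, the `open` and `readlink` entries cover `x86_64`, while the `stat64` entry is limited to `arm` and `x86`.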
Hi @enh-google, sorry was that question intended for me? As I understand asan comes bundled with ndk, so I'm not very sure how it's built.
hmm... trying to repro locally, i can't run asan at all:
2022-01-28 16:30:41.071 29496-29496/? I/com.example.myapplication: ==29496==Shadow memory range interleaves with an existing memory mapping. ASan cannot proceed correctly. ABORTING.
2022-01-28 16:30:41.071 29496-29496/? I/com.example.myapplication: ==29496==ASan shadow was supposed to be located in the [0x00007fff7000-0x10007fff7fff] range.
2022-01-28 16:30:41.071 29496-29496/? I/com.example.myapplication: ==29496==This might be related to ELF_ET_DYN_BASE change in Linux 4.12.
2022-01-28 16:30:41.071 29496-29496/? I/com.example.myapplication: ==29496==See https://github.com/google/sanitizers/issues/856 for possible workarounds.
one thing i did notice while following the instructions though is that we ask people to copy the compiler_rt .so file into their project --- so it's quite possible that people are (correctly) claiming to use a current NDK but not realizing that they might have an out-of-date compiler_rt .so file copied into their jniLibs directory?
ASan shadow was supposed to be located in the [0x00007fff7000-0x10007fff7fff] range.
from the maps, it looks like ART's "Sentinel fault page" is the issue?
0x00007579b000-0x000075f9a000 [anon:dalvik-non moving space]
0x0000ebad6000-0x0000ebad7000 [anon:dalvik-Sentinel fault page]
0x5ddc46610000-0x5ddc46612000 /system/bin/app_process64
but that code is old enough that i'm mentioned in the git blame!
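The overlap can be checked arithmetically from the numbers in the log. A minimal sketch, assuming the shadow range printed by ASan is inclusive and the map entries are half-open `[start, end)` ranges. `Mapping` and `overlaps_shadow` are illustrative names, not sanitizer APIs.

```cpp
#include <cstdint>

// A memory mapping as printed in the maps excerpt: [start, end).
struct Mapping {
    std::uint64_t start;
    std::uint64_t end;
};

// Shadow range from the ASan message, treated here as inclusive.
constexpr std::uint64_t kShadowStart = 0x00007fff7000ULL;
constexpr std::uint64_t kShadowLast  = 0x10007fff7fffULL;

// True if any byte of the mapping falls inside the shadow range.
inline bool overlaps_shadow(Mapping m) {
    return m.start <= kShadowLast && m.end > kShadowStart;
}
```

Of the three quoted entries, only the Sentinel fault page (`0x0000ebad6000-0x0000ebad7000`) falls inside the shadow range. The non-moving space ends below it and `app_process64` starts above it.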
or with x86-64 API 27, i get a null pointer dereference in the linker...
2022-01-28 16:50:10.631 3335-3335/? A/DEBUG: signal 11 (SIGSEGV), code 1 (SEGV_MAPERR), fault addr 0x0
...
2022-01-28 16:50:10.735 3335-3335/? A/DEBUG: #00 pc 000000000000ee14 /system/bin/linker64 (__dl__Z14find_librariesP19android_namespace_tP6soinfoPKPKcmPS2_PNSt3__16vectorIS2_NS8_9allocatorIS2_EEEEmiPK17android_dlextinfobbRNS8_13unordered_mapIPKS1_9ElfReaderNS8_4hashISJ_EENS8_8equal_toISJ_EENSA_INS8_4pairIKSJ_SK_EEEEEEPNS9_IS0_NSA_IS0_EEEE+2260)
2022-01-28 16:50:10.735 3335-3335/? A/DEBUG: #01 pc 0000000000010a73 /system/bin/linker64 (__dl__Z9do_dlopenPKciPK17android_dlextinfoPKv+1859)
2022-01-28 16:50:10.735 3335-3335/? A/DEBUG: #02 pc 000000000000d349 /system/bin/linker64 (__dl__Z20__android_dlopen_extPKciPK17android_dlextinfoPKv+57)
2022-01-28 16:50:10.735 3335-3335/? A/DEBUG: #03 pc 0000000000003015 /system/lib64/libnativeloader.so (android::OpenNativeLibrary(_JNIEnv*, int, char const*, _jobject*, _jstring*, bool*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>*)+389)
2022-01-28 16:50:10.735 3335-3335/? A/DEBUG: #04 pc 000000000039bb2e /system/lib64/libart.so (art::JavaVMExt::LoadNativeLibrary(_JNIEnv*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&, _jobject*, _jstring*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>*)+2350)
2022-01-28 16:50:10.735 3335-3335/? A/DEBUG: #05 pc 00000000000040ce /system/lib64/libopenjdkjvm.so (JVM_NativeLoad+302)
and the same for x86 API 27 too. for the record, this is with the Library/Android/sdk/ndk/21.2.6472646/toolchains/llvm/prebuilt/darwin-x86_64/lib64/clang/9.0.8/lib/linux/libclang_rt.asan-*-android.so compiler_rt files. why is that the newest NDK on my machine?!
i get the same results with Library/Android/sdk/ndk/23.1.7779620/toolchains/llvm/prebuilt/darwin-x86_64/lib64/clang/12.0.8/lib/linux/libclang_rt.asan-x86_64-android.so though, so it's not just that...
and on a real arm64 device i get
2022-01-28 17:35:22.080 24201-24201/com.example.myapplication I/com.example.myapplication: ==24201==ERROR: AddressSanitizer: SEGV on unknown address 0x1680001ead14f042 (pc 0x00742d686f38 bp 0x007feee7e810 sp 0x007feee7e7e0 T0)
2022-01-28 17:35:22.080 24201-24201/com.example.myapplication I/com.example.myapplication: ==24201==The signal is caused by a READ memory access.
2022-01-28 17:35:22.148 24201-24201/com.example.myapplication I/com.example.myapplication: #0 0x742d686f38 (/data/app/~~fg3fRqrdKIb5a-nx87vleA==/com.example.myapplication-W_xlVbahk-5nIa3TYdCOwg==/lib/arm64/libmyapplication.so+0x1f38)
2022-01-28 17:35:22.148 24201-24201/com.example.myapplication I/com.example.myapplication: #1 0x742d686d58 (/data/app/~~fg3fRqrdKIb5a-nx87vleA==/com.example.myapplication-W_xlVbahk-5nIa3TYdCOwg==/lib/arm64/libmyapplication.so+0x1d58)
so i haven't actually managed to get asan working anywhere yet...
Thanks @enh-google for the investigation and spending your time on the issue!
yeah, sorry for the obviously poor state this is currently in! we'll keep looking, but in the meantime i did want to check that you know about gwp-asan (https://developer.android.com/ndk/guides/gwp-asan) which -- for the cost of one line in your AndroidManifest.xml -- you can use in the field to detect memory issues (with crashes showing up in the play developer console [or whatever custom solution you're using] just like normal), or hwasan (https://source.android.com/devices/tech/debug/hwasan) which -- while it involves running a custom OS build and is arm64-only -- is really fast hardware-assisted asan; fast enough that we have people using these builds for dogfooding.
oops, sorry for not updating publicly on this... i've written a lot of words internally but didn't realize until it was pointed out to me by @DanAlbert that i'd said nothing externally.
we started seeing this internally, and talked to the folks who developed the sanitizers. it looks like we've been building them in a way that's incompatible with our seccomp filter. for Android T we've basically just added the missing syscall to the seccomp allowlist. but the folks who owned the sanitizers think it's time to retire this code anyway and just always use the new syscalls (such as openat(2) vs open(2)). so although we can't normally fix the past, once we've pulled that upstream change and rebuilt everything, NDKs after that point should fix this because they won't try to use the obsolete system calls. (no ETA on that yet. http://b/229989971 is the internal bug for any googler who wants to check. https://reviews.llvm.org/D124212 is the upstream LLVM change we're waiting for. that link should work for everyone.)
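To illustrate what "always use the new syscalls" means in practice, here is a Linux-only sketch that expresses open and readlink through their `*at` counterparts with `AT_FDCWD`. The wrapper names `canonical_open` and `canonical_readlink` are hypothetical. This is not the code from the upstream LLVM change, just the same idea in miniature.

```cpp
#include <fcntl.h>        // AT_FDCWD, O_RDONLY
#include <sys/syscall.h>  // SYS_openat, SYS_readlinkat
#include <unistd.h>       // syscall
#include <cstddef>

// Equivalent of open(path, flags), issued as openat(2) relative to the
// current working directory. Returns the fd, or -1 with errno set.
long canonical_open(const char* path, int flags) {
    return syscall(SYS_openat, AT_FDCWD, path, flags, 0);
}

// Equivalent of readlink(path, buf, size), issued as readlinkat(2).
// Returns the number of bytes written to buf, or -1 with errno set.
long canonical_readlink(const char* path, char* buf, std::size_t size) {
    return syscall(SYS_readlinkat, AT_FDCWD, path, buf, size);
}
```

Because only `openat` and `readlinkat` reach the kernel, a seccomp filter that blocks the legacy `open` and `readlink` numbers is never triggered by these wrappers.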
NDKs after that point
Will be r26 at the earliest. Not a regression so will not be backported to r25 (I expect it's not a clean cherry-pick, so it would be a risky backport).
it would be a risky backport).
Hi @DanAlbert
Can you please let me know if this issue has been backported to NDK "25.2.9519653" or some other version for r25?
Thank you
Nothing will be backported to r25. It is not supported.
Description
Isn't syscall 0 read (https://cs.android.com/android/platform/superproject/+/master:bionic/libc/kernel/uapi/asm-x86/asm/unistd_64.h;l=21;drc=bb9fcb46361ddb55aac7faf639de5088a09b9b8e)? That can't be right.
https://github.com/DanAlbert/asan-seccomp-repro repros on the API 27 x86_64 emulator.
Environment Details