gnustep / libobjc2

Objective-C runtime library intended for use with Clang.
http://www.gnustep.org/
MIT License

UnexpectedException test failing on ARM ABIs #247

Open triplef opened 9 months ago

triplef commented 9 months ago

This test was recently added in #220 and subsequently disabled on ARM (https://github.com/gnustep/libobjc2/commit/226455bd10d8b4a0a090449b5b7ebf1e42daf0a5) due to failing on the cross-build ARM CI targets.

I’m able to reproduce it on an arm64 Android emulator with this result in gdb:

Program received signal SIGABRT, Aborted.
0x0000007fbdee5f74 in abort () from target:/apex/com.android.runtime/lib64/bionic/libc.so
(gdb) bt
#0  0x0000007fbdee5f74 in abort () from target:/apex/com.android.runtime/lib64/bionic/libc.so
#1  0x0000007fbe2e7178 in objc_exception_throw (object=0x5555558ee0 <.objc_str_Exception>)
    at eh_personality.c:261
#2  0x0000005555556a4c in main ()
    at UnexpectedException.m:38

Any idea what might be going on here? Interestingly the exception hook is working fine in our app on Android ARM devices and has been for years.

Steps to reproduce with an Android emulator (on macOS ARM host, NDK paths may vary):

adb push libobjc.so /data/local/tmp
adb push Test/UnexpectedException /data/local/tmp
adb push $ANDROID_NDK_ROOT/sources/cxx-stl/llvm-libc++/libs/arm64-v8a/libc++_shared.so /data/local/tmp

# run test
adb shell LD_LIBRARY_PATH=/data/local/tmp/ /data/local/tmp/UnexpectedException

# debug with gdb
adb push $ANDROID_NDK_ROOT/prebuilt/android-arm64/gdbserver/gdbserver /data/local/tmp
adb forward tcp:5039 tcp:5039
adb shell
> cd /data/local/tmp
> LD_LIBRARY_PATH=$PWD ./gdbserver :5039 ./UnexpectedException

# on host machine
$ANDROID_NDK_ROOT/prebuilt/darwin-x86_64/bin/gdb
> target remote :5039
> c

davidchisnall commented 9 months ago

This abort is at the end of the throw function. The unwind library is returning from the unwind function, which happens only if unwinding fails. For some reason, it looks as if it is not reporting the end of the stack as the reason. Can you see what the value of err is? If you define DEBUG_EXCEPTIONS at the top of the file, it will log it (and a load of other info) for you.
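
A rough sketch of the control flow described here, not the actual libobjc2 source: `_Unwind_RaiseException` only returns when unwinding fails, and the caller then has to decide what to do based on the reason code. The names `throw_failure_action` and `classify_unwind_failure` are made up for illustration; only the `_URC_*` constants come from `<unwind.h>`.

```c
#include <assert.h>
#include <unwind.h>

/* Hypothetical names, for illustration only. */
typedef enum {
    FAILURE_UNHANDLED_EXCEPTION, /* walked the whole stack, found no handler */
    FAILURE_UNWINDER_ERROR       /* anything else: treated as fatal */
} throw_failure_action;

/* Classify the value that _Unwind_RaiseException returned.
 * Takes a plain int so that out-of-range values can be passed in too. */
static throw_failure_action classify_unwind_failure(int err)
{
    if (err == _URC_END_OF_STACK)
        return FAILURE_UNHANDLED_EXCEPTION;
    return FAILURE_UNWINDER_ERROR;
}
```

Under this sketch, any value other than `_URC_END_OF_STACK`, including one that is not a valid reason code at all, ends up in the fatal branch, which would match an abort at the end of the throw function.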

triplef commented 9 months ago

These are the logs I get:

Exception caught by C++: 0
Throwing 0x5555558ee0
Throw returned -1115110992

(I commented out the log "Throwing %p, in flight exception: %p" because it doesn’t build as td->lastThrownObject doesn’t exist.)

triplef commented 9 months ago

Sometimes it returns -1113013840 instead, but it always seems to be either that or, more often, -1115110992.

davidchisnall commented 9 months ago

Hmm, that's deeply strange. That looks like _Unwind_RaiseException is returning something that isn't a valid enum value, and possibly not even a valid integer. I've run this on FreeBSD/AArch64 (which uses the LLVM unwind library), and it passes.

Does Android use the GNU unwinder? Can you look in _Unwind_RaiseException and see if you can see what it thinks it's returning?

triplef commented 9 months ago

Interestingly when running without the debugger the err value seems to be random.

I’m not sure about which unwinder Android uses, and I haven’t been able to locate the sources for _Unwind_RaiseException so far. I’d also be interested to know whether the failure is the same using cross-builds like on the CI targets. If so maybe it’s easier to debug this there?

Unfortunately I don’t have time to dig into this deeper right now, but I wanted to at least document this issue here.

hmelder commented 2 months ago

> Can you look in _Unwind_RaiseException and see if you can see what it thinks it's returning?

The error originates in the libgcc implementation of _Unwind_RaiseException, specifically in the generated asm code. Somehow all registers saved at the start of _Unwind_RaiseException are reloaded before returning _URC_END_OF_STACK in unwind.inc:108. As a result, the first parameter passed in gets returned instead.

Note that uw_frame_state_for (&cur_context, &fs); returns _URC_END_OF_STACK.
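
To illustrate the effect, here is a small sketch (helper names are made up) of what a caller would see if x0 still holds a 64-bit pointer when a function declared to return a 32-bit reason code returns. It assumes the usual two's-complement wrap-around when converting to `int32_t`, which GCC and Clang provide.

```c
#include <assert.h>
#include <stdint.h>

/* Only the low 32 bits of x0 are read back as the return value.
 * Assumes wrap-around conversion to int32_t (true for GCC and Clang). */
static int32_t low_half_as_int(uint64_t x0)
{
    return (int32_t)(uint32_t)x0;
}

/* Valid _Unwind_Reason_Code values in the Itanium C++ ABI are 0..8. */
static int is_valid_reason_code(int32_t code)
{
    return code >= 0 && code <= 8;
}
```

Feeding in the x0 value from the register dump in this comment, `low_half_as_int(0x0000aaaaaaafa790)` gives -1431328880, which is not a valid reason code. The -1115110992 logged earlier on Android is 0xBD88C1B0 as an unsigned 32-bit value, which would fit the low half of a pointer in the same way.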

I just requested an account for GCC Bugzilla.

Dump of _Unwind_RaiseException

```asm
libgcc_s.so.1`_Unwind_RaiseException:
    0xfffff7f27700 <+0>:   sub    sp, sp, #0xc10
    0xfffff7f27704 <+4>:   stp    x29, x30, [sp]
    0xfffff7f27708 <+8>:   mov    x29, sp
    0xfffff7f2770c <+12>:  xpaclri
    0xfffff7f27710 <+16>:  stp    x21, x22, [sp, #0x40]
    0xfffff7f27714 <+20>:  add    x22, sp, #0xc0
    0xfffff7f27718 <+24>:  add    x21, sp, #0x840
    0xfffff7f2771c <+28>:  stp    x0, x1, [sp, #0x10]
    0xfffff7f27720 <+32>:  add    x1, sp, #0xc10
    0xfffff7f27724 <+36>:  stp    x2, x3, [sp, #0x20]
    0xfffff7f27728 <+40>:  mov    x2, x30
    0xfffff7f2772c <+44>:  stp    x19, x20, [sp, #0x30]
    0xfffff7f27730 <+48>:  mov    x20, x0
    0xfffff7f27734 <+52>:  add    x19, sp, #0x480
    0xfffff7f27738 <+56>:  mov    x0, x22
    0xfffff7f2773c <+60>:  stp    x23, x24, [sp, #0x50]
    0xfffff7f27740 <+64>:  stp    x25, x26, [sp, #0x60]
    0xfffff7f27744 <+68>:  stp    x27, x28, [sp, #0x70]
    0xfffff7f27748 <+72>:  stp    d8, d9, [sp, #0x80]
    0xfffff7f2774c <+76>:  stp    d10, d11, [sp, #0x90]
    0xfffff7f27750 <+80>:  stp    d12, d13, [sp, #0xa0]
    0xfffff7f27754 <+84>:  stp    d14, d15, [sp, #0xb0]
    0xfffff7f27758 <+88>:  bl     0xfffff7f27020 ; uw_init_context_1 at unwind-dw2.c:1324:1
    0xfffff7f2775c <+92>:  mov    x1, x22
    0xfffff7f27760 <+96>:  mov    x0, x19
    0xfffff7f27764 <+100>: mov    x2, #0x3c0 ; =960
    0xfffff7f27768 <+104>: bl     0xfffff7f13740
    0xfffff7f2776c <+108>: b      0xfffff7f277a0 ; <+160> at unwind.inc:104:14
    0xfffff7f27770 <+112>: cbnz   w2, 0xfffff7f27814 ; <+276> at unwind.inc:113:9
    0xfffff7f27774 <+116>: ldr    x5, [sp, #0xbe0]
    0xfffff7f27778 <+120>: cbz    x5, 0xfffff7f27794 ; <+148> at unwind.inc:127:7
    0xfffff7f2777c <+124>: ldr    x2, [x20]
    0xfffff7f27780 <+128>: blr    x5
    0xfffff7f27784 <+132>: cmp    w0, #0x6
    0xfffff7f27788 <+136>: b.eq   0xfffff7f2781c ; <+284> at unwind.inc:132:18
    0xfffff7f2778c <+140>: cmp    w0, #0x8
    0xfffff7f27790 <+144>: b.ne   0xfffff7f27814 ; <+276> at unwind.inc:113:9
    0xfffff7f27794 <+148>: mov    x1, x21
    0xfffff7f27798 <+152>: mov    x0, x19
    0xfffff7f2779c <+156>: bl     0xfffff7f27264 ; uw_update_context at unwind-dw2.c:1266:1
    0xfffff7f277a0 <+160>: mov    x1, x21
    0xfffff7f277a4 <+164>: mov    x0, x19
    0xfffff7f277a8 <+168>: bl     0xfffff7f25f00 ; uw_frame_state_for at unwind-dw2.c:997:3
    0xfffff7f277ac <+172>: mov    w2, w0
    0xfffff7f277b0 <+176>: mov    w1, #0x1 ; =1
    0xfffff7f277b4 <+180>: mov    x4, x19
    0xfffff7f277b8 <+184>: mov    x3, x20
    0xfffff7f277bc <+188>: mov    w0, w1
    0xfffff7f277c0 <+192>: cmp    w2, #0x5
    0xfffff7f277c4 <+196>: b.ne   0xfffff7f27770 ; <+112> at unwind.inc:110:10
->  0xfffff7f277c8 <+200>: mov    x4, #0x0 ; =0
    0xfffff7f277cc <+204>: mov    w0, w2
    0xfffff7f277d0 <+208>: ldp    x29, x30, [sp]
    0xfffff7f277d4 <+212>: ldp    x0, x1, [sp, #0x10]
    0xfffff7f277d8 <+216>: ldp    x2, x3, [sp, #0x20]
    0xfffff7f277dc <+220>: ldp    x19, x20, [sp, #0x30]
    0xfffff7f277e0 <+224>: ldp    x21, x22, [sp, #0x40]
    0xfffff7f277e4 <+228>: ldp    x23, x24, [sp, #0x50]
    0xfffff7f277e8 <+232>: ldp    x25, x26, [sp, #0x60]
    0xfffff7f277ec <+236>: ldp    x27, x28, [sp, #0x70]
    0xfffff7f277f0 <+240>: ldp    d8, d9, [sp, #0x80]
    0xfffff7f277f4 <+244>: ldp    d10, d11, [sp, #0x90]
    0xfffff7f277f8 <+248>: ldp    d12, d13, [sp, #0xa0]
    0xfffff7f277fc <+252>: ldp    d14, d15, [sp, #0xb0]
    0xfffff7f27800 <+256>: add    sp, sp, #0xc10
    0xfffff7f27804 <+260>: cbz    x4, 0xfffff7f27810 ; <+272> at unwind.inc:141:1
    0xfffff7f27808 <+264>: add    sp, sp, x5
    0xfffff7f2780c <+268>: br     x6
    0xfffff7f27810 <+272>: ret
    0xfffff7f27814 <+276>: mov    w2, #0x3 ; =3
    0xfffff7f27818 <+280>: b      0xfffff7f277c8 ; <+200> at unwind.inc:141:1
    0xfffff7f2781c <+284>: str    xzr, [x20, #0x10]
    0xfffff7f27820 <+288>: mov    x0, x19
    0xfffff7f27824 <+292>: bl     0xfffff7f13860 ; symbol stub for: pthread_key_create
    0xfffff7f27828 <+296>: mov    x4, x0
    0xfffff7f2782c <+300>: ldr    x3, [sp, #0x7c0]
    0xfffff7f27830 <+304>: mov    x1, x22
    0xfffff7f27834 <+308>: mov    x2, #0x3c0 ; =960
    0xfffff7f27838 <+312>: mov    x0, x19
    0xfffff7f2783c <+316>: sub    x3, x4, x3, lsr #63
    0xfffff7f27840 <+320>: str    x3, [x20, #0x18]
    0xfffff7f27844 <+324>: bl     0xfffff7f13740
    0xfffff7f27848 <+328>: mov    x2, x21
    0xfffff7f2784c <+332>: mov    x1, x19
    0xfffff7f27850 <+336>: mov    x0, x20
    0xfffff7f27854 <+340>: bl     0xfffff7f273ac ; _Unwind_RaiseException_Phase2 at unwind.inc:41:1
    0xfffff7f27858 <+344>: mov    w2, #0x2 ; =2
    0xfffff7f2785c <+348>: cmp    w0, #0x7
    0xfffff7f27860 <+352>: b.ne   0xfffff7f277c8 ; <+200> at unwind.inc:141:1
    0xfffff7f27864 <+356>: mov    x1, x19
    0xfffff7f27868 <+360>: mov    x0, x22
    0xfffff7f2786c <+364>: bl     0xfffff7f24a60 ; uw_install_context_1 at unwind-dw2.c:1405:1
    0xfffff7f27870 <+368>: mov    x19, x0
    0xfffff7f27874 <+372>: ldr    x20, [sp, #0x798]
    0xfffff7f27878 <+376>: ldr    x0, [sp, #0x790]
    0xfffff7f2787c <+380>: mov    x1, x20
    0xfffff7f27880 <+384>: bl     0xfffff7f276f0 ; _Unwind_DebugHook at unwind-dw2.c:1382:1
    0xfffff7f27884 <+388>: bl     0xfffff7f2af40 ; __arm_za_disable
    0xfffff7f27888 <+392>: mov    x5, x19
    0xfffff7f2788c <+396>: mov    x6, x20
    0xfffff7f27890 <+400>: mov    x4, #0x1 ; =1
    0xfffff7f27894 <+404>: b      0xfffff7f277cc ; <+204> at unwind.inc:141:1
```

Current position right after if (code == _URC_END_OF_STACK). The return value of uw_frame_state_for is 5 (_URC_END_OF_STACK).

    if (code == _URC_END_OF_STACK)
        /* Hit end of stack with no handler found.  */
        return _URC_END_OF_STACK;

After mov w0, w2 (-> 0xfffff7f277d0 <+208>: ldp x29, x30, [sp])

0xfffff7f27894 <+404>: b      0xfffff7f277cc ; <+204> at unwind.inc:141:1
(lldb) register read
General Purpose Registers:
        x0 = 0x0000000000000005
        x1 = 0x0000000000000001
        x2 = 0x0000000000000005
        x3 = 0x0000aaaaaaafa790
        x4 = 0x0000000000000000
        x5 = 0x0000ffffffffe160
        x6 = 0xfffffffffffffff8
        x7 = 0x0000000000000004
        x8 = 0x0000000000000001
        x9 = 0x0000fffff7f402a8
       x10 = 0x0000000000000000
       x11 = 0x0000fffff7f40308
       x12 = 0x0000fffff7ff77c0
       x13 = 0x0000000000000010
       x14 = 0x0000000000000000
       x15 = 0x0000fffff7bc63c0  
       x16 = 0x0000fffff7f40000
       x17 = 0x0000000000000000
       x18 = 0x0000000000000007
       x19 = 0x0000ffffffffe610
       x20 = 0x0000aaaaaaafa790
       x21 = 0x0000ffffffffe9d0
       x22 = 0x0000ffffffffe250
       x23 = 0x0000ffffffffefc8
       x24 = 0x0000fffff7ffdb90  ld-linux-aarch64.so.1`_rtld_global_ro
       x25 = 0x0000000000000000
       x26 = 0x0000fffff7ffe008  _rtld_global
       x27 = 0x0000aaaaaaabfda0  UnexpectedExceptionDebug`__do_global_dtors_aux_fini_array_entry
       x28 = 0x0000000000000000
        fp = 0x0000ffffffffe190
        lr = 0x0000fffff7f277ac  libgcc_s.so.1`_Unwind_RaiseException + 172 at unwind.inc:104:14
        sp = 0x0000ffffffffe190
        pc = 0x0000fffff7f277d0  libgcc_s.so.1`_Unwind_RaiseException + 208 at unwind.inc:141:1
      cpsr = 0x60201000

After ldp x0, x1

(lldb) register read
General Purpose Registers:
        x0 = 0x0000aaaaaaafa790
        x1 = 0x0000000000000000
        x2 = 0x0000000000000058
        x3 = 0x0000000000000000
        x4 = 0x0000000000000000
        x5 = 0x0000ffffffffe160
        x6 = 0xfffffffffffffff8
        x7 = 0x0000000000000004
        x8 = 0x0000000000000001
        x9 = 0x0000fffff7f402a8
       x10 = 0x0000000000000000
       x11 = 0x0000fffff7f40308
       x12 = 0x0000fffff7ff77c0
       x13 = 0x0000000000000010
       x14 = 0x0000000000000000
       x15 = 0x0000fffff7bc63c0  
       x16 = 0x0000fffff7f40000
       x17 = 0x0000000000000000
       x18 = 0x0000000000000007
       x19 = 0x0000ffffffffefb8
       x20 = 0x0000000000000001
       x21 = 0x0000aaaaaaabfda0  UnexpectedExceptionDebug`__do_global_dtors_aux_fini_array_entry
       x22 = 0x0000aaaaaaaa0ae8  UnexpectedExceptionDebug`main at UnexpectedException.m:33
       x23 = 0x0000ffffffffefc8
       x24 = 0x0000fffff7ffdb90  ld-linux-aarch64.so.1`_rtld_global_ro
       x25 = 0x0000000000000000
       x26 = 0x0000fffff7ffe008  _rtld_global
       x27 = 0x0000aaaaaaabfda0  UnexpectedExceptionDebug`__do_global_dtors_aux_fini_array_entry
       x28 = 0x0000000000000000
        fp = 0x0000ffffffffee00
        lr = 0x0000fffff7f76d38  libobjc.so.4.6`objc_exception_throw + 520 at eh_personality.c:256:22
        sp = 0x0000ffffffffeda0
        pc = 0x0000fffff7f27810  libgcc_s.so.1`_Unwind_RaiseException + 272 at unwind.inc:141:1
      cpsr = 0x60201000
(lldb) 

hmelder commented 2 months ago

See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114843

davidchisnall commented 2 months ago

Thanks for root causing this! Let's leave this open until there's a fix, but until then we can recommend that Arm users use LLVM's unwinder instead of GCC's.

triplef commented 2 months ago

How does one choose between the LLVM vs. GCC unwinder, and is the GCC one also ever used when building with Clang?

I’m a bit confused about this issue because we don’t see issues with exception handling on Android in our app (built with Clang and the NDK toolchain) and the exception hook is working fine too, but I was originally able to reproduce this with the test using the NDK toolchain.

hmelder commented 2 months ago

> How does one choose between the LLVM vs. GCC unwinder, and is the GCC one also ever used when building with Clang?

Here is an example:

clang-18 test.c -o test -rtlib=compiler-rt --unwindlib=libunwind -fuse-ld=lld-18

hmelder commented 2 months ago

> but I was originally able to reproduce this with the test using the NDK toolchain

Yep, that's weird. Do you know the NDK version you ran the test on?

davidchisnall commented 2 months ago

I think Android uses the LLVM unwinder. It's typically integrated in the C support files, but precisely where depends on the platform. FreeBSD uses the LLVM one as well, which is why I didn't see these issues.

It's also not always clear which one you're using. On FreeBSD, we ship the LLVM one but with the same name as the GCC one to avoid configure scripts failing.

I'd report this to distros as a bug and tell them that the simple fix is to use the LLVM one instead of the GCC one.

hmelder commented 2 months ago

Just tested it with libunwind-18 on Ubuntu 23.10 and it works as expected:

hugo@ubuntu:/tmp$ clang -L/usr/lib/llvm-18/lib test.c -o test -rtlib=compiler-rt --unwindlib=libunwind -fuse-ld=lld -lunwind
hugo@ubuntu:/tmp$ ./test
RaiseException returned 0x5

I'll test it on Android this evening.
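
The `test.c` used above is not shown in this thread. A minimal sketch of what such a check could look like, under the assumption that it simply raises an exception no frame handles and prints what `_Unwind_RaiseException` returns (the helper names and the exception class tag are invented for this sketch):

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <unwind.h>

/* Cleanup callback slot defined by the ABI; nothing to free here. */
static void cleanup(_Unwind_Reason_Code reason, struct _Unwind_Exception *ex)
{
    (void)reason;
    (void)ex;
}

/* Raise an exception that no frame handles and hand back whatever the
 * unwinder returns. A correct unwinder should report _URC_END_OF_STACK. */
static int raise_without_handler(void)
{
    static struct _Unwind_Exception ex;

    memset(&ex, 0, sizeof ex);
    /* Arbitrary 8-byte class tag, made up for this sketch. */
    memcpy(&ex.exception_class, "SKETCH\0", sizeof ex.exception_class);
    ex.exception_cleanup = cleanup;
    return (int)_Unwind_RaiseException(&ex);
}

/* Print the result in the same style as the output quoted above. */
static void report(void)
{
    printf("RaiseException returned 0x%x\n",
           (unsigned)raise_without_handler());
}
```

Calling `report()` from `main` and building once against each unwinder (for example with the `--unwindlib=libunwind` flags shown above) should make the difference visible: a value other than 0x5 would point at the bug discussed in this issue.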

triplef commented 2 months ago

> Do you know the NDK version you ran the test on?

Unfortunately no. I think the NDK migrated to the LLVM toolchain over the last couple of releases, so it might have been a release that was still using the GCC unwinder.

Can we detect in CMake which unwinder is used and already enable the test when LLVM is used?

triplef commented 2 months ago

Just found this re. unwinder on Android:

> The unwinder APIs are exposed from the platform's libc.so starting with API 30 (Android R).

hmelder commented 2 months ago

> Can we detect in CMake which unwinder is used and already enable the test when LLVM is used?

I would enable it unconditionally. The libgcc patch will probably be backported to gcc 13 and maybe gcc 12.

hmelder commented 2 months ago

> Just found this re. unwinder on Android:
>
> The unwinder APIs are exposed from the platform's libc.so starting with API 30 (Android R).

Seems like libgcc was removed in NDK r23. Here is the change in clang: https://reviews.llvm.org/D96403

pinskia commented 2 months ago

> I would enable it unconditionally. The libgcc patch will probably be backported to gcc 13 and maybe gcc 12.

s/libgcc/gcc/. The bug is not in libgcc itself; rather, the code that GCC generates for it has the bug.

I hope to get it backported in time for the GCC 11.5 release but we will see. I will be posting the patch over the weekend and I doubt it will be reviewed until Monday or later.

pinskia commented 2 months ago

I should note that a few other targets have a similar bug:

powerpc: PR 114846
arm: PR 114847

loongarch had a similar bug, but it was fixed in GCC 14:

loongarch: PR 114848

Those are the "major" targets I tried that had the bug.

pinskia commented 2 months ago

Just an FYI: I have posted the GCC patch: https://gcc.gnu.org/pipermail/gcc-patches/2024-April/650080.html

hmelder commented 2 months ago

Thank you @pinskia!