gnustep / libobjc2

Objective-C runtime library intended for use with Clang.
http://www.gnustep.org/
MIT License
426 stars 116 forks source link

AssociatedObject_legacy and AssociatedObject_legacy_optimised fail under Debian 11 aarch64 #228

Closed hmelder closed 1 year ago

hmelder commented 2 years ago

Hi,

I built libobjc2 (master 7dee32a) on Debian 11 aarch64. The AssociatedObject_legacy and AssociatedObject_legacy_optimised tests fail.

LSB Release Information

Distributor ID: Debian
Description:    Debian GNU/Linux 11 (bullseye)
Release:    11
Architecture:   aarch64
Codename:   bullseye

Clang Version:

Debian clang version 11.0.1-2
Target: aarch64-unknown-linux-gnu
Thread model: posix
InstalledDir: /usr/bin
Found candidate GCC installation: /usr/bin/../lib/gcc/aarch64-linux-gnu/10
Found candidate GCC installation: /usr/bin/../lib/gcc/aarch64-linux-gnu/8
Found candidate GCC installation: /usr/lib/gcc/aarch64-linux-gnu/10
Found candidate GCC installation: /usr/lib/gcc/aarch64-linux-gnu/8
Selected GCC installation: /usr/bin/../lib/gcc/aarch64-linux-gnu/10
Candidate multilib: .;@m64
Selected multilib: .;@m64

Build Configuration:

$ mkdir build && cd build
$ cmake -DCMAKE_C_COMPILER=clang -DCMAKE_CXX_COMPILER=clang++ -GNinja ..
$ ninja
$ ninja test

Test Logs: libobjc2_ninja-verbose_03-07-2022.txt libobjc2_ninja-test-verbose_03-07-2022.txt libobjc2_AssociatedObject_legacy_lldb_03-07-2022.txt (With libc debug symbols)

davidchisnall commented 2 years ago

The crash that you're seeing is from your malloc detecting some corruption of its internal state, which typically means a use-after-free somewhere. Can you try with valgrind or ASan and see if they report something useful?

hmelder commented 2 years ago

LLDB and Valgrind

Configured CMake to build as debug and ran lldb and valgrind on the AssociatedObject_legacy test.

CMake Configuration:

$ cmake -DCMAKE_C_COMPILER=clang -DCMAKE_CXX_COMPILER=clang++ -DCMAKE_BUILD_TYPE=Debug -GNinja ..

Logs: clang_libobjc2_lldb_aarch64_25-07-2022.txt clang_libobjc2_valgrind_aarch64_25-07-2022.txt

AddressSanitizer

The AddressSanitizer failed at __objc_load. Which seems to be too early.

CMake Configuration:

$ cmake -DCMAKE_C_COMPILER=clang -DCMAKE_CXX_COMPILER=clang++ -DCMAKE_BUILD_TYPE=Debug -DCMAKE_C_FLAGS_DEBUG="-fsanitize=address" -GNinja ..

clang_libobjc2_asan-error_aarch64_25-07-2022.txt

davidchisnall commented 2 years ago

Thanks. Does the non-legacy version of the test also have valgrind failures? I'd like to drop support for the old (GCC-compatible) ABI at some point soon but I'm a bit surprised at the difference because I'd expect both to use the same code paths at this point.

hmelder commented 2 years ago

The non-legacy version does not abort, but I have not run valgrind on it. I'm currently packaging all major GNUstep components (next-generation clang-only) so this failed test is irrelevant, right? It is a burden to maintain the legacy API.

Thank you for all the nice and friendly support :)

davidchisnall commented 2 years ago

If you're compiling everything with -fobjc-runtime=gnustep-2.0 (please do!) then it should be fine, but I'm slightly concerned that the bug is present with both ABIs but just happens to trigger with the legacy test. Programs using the new ABI will have the data structures in the binaries used directly. Programs using the old ABI will have them replaced with new-ABI versions on the heap. This means that there will be more (unrelated) heap allocations and so it may be that we have a memory safety bug in both versions but it triggers the heap check only in the version that uses more.

Very few programs use the associated object logic and so it's entirely possible that this has been subtly broken for a long time and no one has noticed.

hmelder commented 2 years ago

Tested all AssociatedObject* non-legacy tests and everything was fine (Valgrind reported no issues aswell). I am always using gnustep-2.0.

One question regarding the aarch64 msgSend assembly: The code is currently ELF-only since relocations are not supported with COFF (Windows). What is a substitute for these ELF relocations (https://github.com/gnustep/libobjc2/blob/7dee32ad1e977ddf085aff70f0f0b1ffc3524ee6/objc_msgSend.aarch64.S#L83)?

davidchisnall commented 2 years ago

I'm not sure. The simplest thing to do is include the header file that defines SmallObjectClasses and compile a trivial C function that reads it and pass -S to clang. It will spit out the right assembly and you can just copy and paste it into the file. Please raise a PR once this is done, it would be great to have Windows/AArch64 support!