gnustep / libobjc2

Objective-C runtime library intended for use with Clang.
http://www.gnustep.org/
MIT License
426 stars 116 forks source link

Add support for fast-path alloc / init methods and direct methods. #268

Closed davidchisnall closed 5 months ago

davidchisnall commented 6 months ago

The fast paths follow the pattern that we established for fast ARC: Framework base classes can opt in by implementing a +_TrivialAllocInit method.

This opt-in behaviour is inherited and is removed implicitly in any subclass that implements alloc or init methods (alloc and init are treated independently).

Compilers can emit calls to objc_alloc(cls) instead of [cls alloc], objc_allocWithZone(cls) instead of [cls allocWithZone: NULL], and objc_alloc_init instead of [[cls alloc] init].

Direct methods don't require very much support in the runtime. Apple reuses their fast path for -self (which is supported only in the Apple fork of clang, not the upstream version) for a fast init. Given that the first few fields of the runtime's class structure have been stable for around 30 years, I'm happy moving the flags word (and the initialised bit, in particular) into the public ABI. This lets us do a fast-path check for whether a class is initialised in class methods and call objc_send_initialize if it isn't. This function is now exposed as part of the public ABI, it was there already and does the relevant checks without invoking any of the message-sending machinery.

Fixes #165 #169

davidchisnall commented 6 months ago

Requires this clang branch:

https://github.com/davidchisnall/llvm-project/tree/new-libobjc2-features

I'll raise that as an LLVM PR when this is merged.

hmelder commented 6 months ago

The new tests seem to fail

The non-legacy test-cases of this PR are passing.

A lot of general cases do not.

hmelder commented 6 months ago
(lldb) run
Process 30526 launched: '/Users/hugo/Source/github.com/gnustep/libobjc2/build/Test/FastARC' (aarch64)
Process 30526 stopped
* thread #1, name = 'FastARC', stop reason = signal SIGSEGV: invalid address (fault address: 0xffffff7ffff8)
    frame #0: 0x0000fffff7f8054c libobjc.so.4.6`isSmallObject(obj=<unavailable>) at class.h:417
   414  extern Class SmallObjectClasses[7];
   415
   416  static BOOL isSmallObject(id obj)
-> 417  {
   418      uintptr_t addr = ((uintptr_t)obj);
   419      return (addr & OBJC_SMALL_OBJECT_MASK) != 0;
   420  }

There seems to be an infinite loop:

A selected list of lldb frames ```gdb (lldb) frame select 1 frame #1: 0x0000fffff7f7f160 libobjc.so.4.6`retain(objc_object*, unsigned char) [inlined] isPersistentObject(obj=0x0000aaaaaaaf6b38) at arc.mm:291:6 288 return YES; 289 } 290 // Small objects are never accessibly by reference -> 291 if (isSmallObject(obj)) 292 { 293 return YES; 294 } (lldb) frame select 2 frame #2: 0x0000fffff7f7f140 libobjc.so.4.6`retain(obj=0x0000aaaaaaaf6b38, isWeak=NO) at arc.mm:302:6 299 300 static inline id retain(id obj, BOOL isWeak) 301 { -> 302 if (isPersistentObject(obj)) { return obj; } 303 Class cls = obj->isa; 304 if ((Class)&_NSConcreteMallocBlock == cls || 305 (Class)&_NSConcreteStackBlock == cls) (lldb) frame select 3 frame #3: 0x0000fffff7f7f10c libobjc.so.4.6`objc_retain(obj=0x0000aaaaaaaf6b38) at arc.mm:592:9 589 extern "C" OBJC_PUBLIC id objc_retain(id obj) 590 { 591 if (nil == obj) { return nil; } -> 592 return retain(obj, NO); 593 } 594 595 extern "C" OBJC_PUBLIC id objc_retainAutorelease(id obj) (lldb) frame select 4 frame #4: 0x0000fffff7f7f228 libobjc.so.4.6`retain(obj=0x0000aaaaaaaf6b38, isWeak=NO) at arc.mm:313:9 310 { 311 return retain_fast(obj, isWeak); 312 } -> 313 return [obj retain]; 314 } 315 316 extern "C" OBJC_PUBLIC BOOL objc_release_fast_no_destroy_np(id obj) (lldb) frame select 5 frame #5: 0x0000fffff7f7f10c libobjc.so.4.6`objc_retain(obj=0x0000aaaaaaaf6b38) at arc.mm:592:9 589 extern "C" OBJC_PUBLIC id objc_retain(id obj) 590 { 591 if (nil == obj) { return nil; } -> 592 return retain(obj, NO); 593 } 594 595 extern "C" OBJC_PUBLIC id objc_retainAutorelease(id obj) (lldb) frame select 6 frame #6: 0x0000fffff7f7f228 libobjc.so.4.6`retain(obj=0x0000aaaaaaaf6b38, isWeak=NO) at arc.mm:313:9 310 { 311 return retain_fast(obj, isWeak); 312 } -> 313 return [obj retain]; 314 } 315 316 extern "C" OBJC_PUBLIC BOOL objc_release_fast_no_destroy_np(id obj) [...] frame #18317: 0x0000fffff7f7f10c libobjc.so.4.6`objc_retain(obj=0x0000aaaaaaaf6b38) at arc.mm:592:9 frame #18318: 0x0000fffff7f7f228 libobjc.so.4.6`retain(obj=0x0000aaaaaaaf6b38, isWeak=NO) at arc.mm:313:9 frame #18319: 0x0000fffff7f7f10c libobjc.so.4.6`objc_retain(obj=0x0000aaaaaaaf6b38) at arc.mm:592:9 frame #18320: 0x0000fffff7f7f228 libobjc.so.4.6`retain(obj=0x0000aaaaaaaf6b38, isWeak=NO) at arc.mm:313:9 frame #18321: 0x0000fffff7f7f10c libobjc.so.4.6`objc_retain(obj=0x0000aaaaaaaf6b38) at arc.mm:592:9 frame #18322: 0x0000fffff7f7f228 libobjc.so.4.6`retain(obj=0x0000aaaaaaaf6b38, isWeak=NO) at arc.mm:313:9 frame #18323: 0x0000fffff7f7f10c libobjc.so.4.6`objc_retain(obj=0x0000aaaaaaaf6b38) at arc.mm:592:9 frame #18324: 0x0000fffff7f7f228 libobjc.so.4.6`retain(obj=0x0000aaaaaaaf6b38, isWeak=NO) at arc.mm:313:9 frame #18325: 0x0000fffff7f7f10c libobjc.so.4.6`objc_retain(obj=0x0000aaaaaaaf6b38) at arc.mm:592:9 ```

Same issue in RuntimeTest:

   Process 30554 launched: '/Users/hugo/Source/github.com/gnustep/libobjc2/build/Test/RuntimeTest' (aarch64)
testInvalidArguments() ran
testAMethod() ran
testAMethod() ran
testGetMethod() ran
testProtocols() ran
testMultiTypedSelector() ran
testClassHierarchy() ran
testAllocateClass() ran
Instance of NSObject: 0xaaaaaaafc6a8
Enter synchronized code
Process 30554 stopped
* thread #1, name = 'RuntimeTest', stop reason = signal SIGSEGV: invalid address (fault address: 0xffffff7ffff8)
    frame #0: 0x0000fffff7f8054c libobjc.so.4.6`isSmallObject(obj=<unavailable>) at class.h:417
   414  extern Class SmallObjectClasses[7];
   415
   416  static BOOL isSmallObject(id obj)
-> 417  {
   418      uintptr_t addr = ((uintptr_t)obj);
   419      return (addr & OBJC_SMALL_OBJECT_MASK) != 0;
   420  }
hmelder commented 6 months ago

Just tested it with clang main and the failures seem to be related to these changes (probably in clang):

hugo@libobjc2:/Users/hugo/Source/github.com/gnustep/libobjc2$ clang --version
clang version 18.0.0git (ssh://git@github.com/hmelder/llvm-project.git 360996ac5ad26714a6ddbee45730fbcfb7dc3eea)
98% tests passed, 4 tests failed out of 194

Total Test time (real) =  10.99 sec

The following tests did not run:
    159 - UnexpectedException (Skipped)
    160 - UnexpectedException_optimised (Skipped)
    161 - UnexpectedException_legacy (Skipped)
    162 - UnexpectedException_legacy_optimised (Skipped)

The following tests FAILED:
     51 - FastPathAlloc (Subprocess aborted)
     52 - FastPathAlloc_optimised (Subprocess aborted)
     53 - FastPathAlloc_legacy (Subprocess aborted)
     54 - FastPathAlloc_legacy_optimised (Subprocess aborted)

The failure of FastPathAlloc was expected.

davidchisnall commented 6 months ago

Can you show the output from the failing tests? Older clang should simply not call these functions, so I must have broken something internally.

triplef commented 6 months ago

This is great, thank you for implementing this! Optimizations like the fast path are very much relevant for our app, and I’m looking forward to being able to use direct methods to optimize hot paths in our codebase.

This opt-in behaviour is inherited and is removed implicitly in any subclass that implements alloc or init methods (alloc and init are treated independently).

Do I understand this correctly that we should be able to add + _TrivialAllocInit to NSObject and still have subclasses implementing their own alloc or init methods get them called as expected? If so I don’t believe this behavior is covered by the test cases yet.

davidchisnall commented 6 months ago

Do I understand this correctly that we should be able to add + _TrivialAllocInit to NSObject and still have subclasses implementing their own alloc or init methods get them called as expected?

Yes, exactly. Though very few classes will override +alloc. One that override -init will get the fast path for alloc in [[Cls alloc] init] and a code-size reduction at the call site for both.

If so I don’t believe this behavior is covered by the test cases yet

I didn't think that needed new tests because it doesn't change existing behaviour. Calls to [super alloc] or [super init] are not fast paths.

hmelder commented 6 months ago

Can you show the output from the failing tests?

Sure! For what ever reason, only FastARC and the FastPathAllloc_legacy tests seem to be failing...

The following tests FAILED:
     43 - FastARC (Subprocess aborted)
     44 - FastARC_optimised (Subprocess aborted)
     45 - FastARC_legacy (Subprocess aborted)
     46 - FastARC_legacy_optimised (Subprocess aborted)
     53 - FastPathAlloc_legacy (Subprocess aborted)
     54 - FastPathAlloc_legacy_optimised (Subprocess aborted)

This is with the same configuration and libobjc2 40fa8e3f10526edeba9a33caeff5af1af91edfcd. I am not sure why I had the test failures yesterday (I am now using a clang build with -DCMAKE_BUILD_TYPE=Release. I will check if I can reproduce it with the debug build).

hugo@libobjc2:/Users/hugo/Source/github.com/gnustep/libobjc2$ ./build/Test/FastARC
FastARC: /Users/hugo/Source/github.com/gnustep/libobjc2/Test/FastARC.m:80: int main(): Assertion `called == YES' failed.
Aborted

It seems like - retain was never called in the first FastARC test. The test uses the Test root class with + (void)_TrivialAllocInit{}.

davidchisnall commented 6 months ago

That’s confusing. I haven’t changed the situations when the fast arc flag is set or cleared, so I’m not sure why that test is failing.

hmelder commented 6 months ago

I can generate the LLVM IR for this test if this helps

davidchisnall commented 6 months ago

Well, the confusing thing is that this shouldn’t rely on anything in clang. It calls objc_retain explicitly, and somehow that’s detecting fast ARC is safe for the class, in spite of it not opting in.

hmelder commented 6 months ago

Here FastARC.ll if it helps :) FastARC.zip

It calls objc_retain explicitly

Yep. Weird.

hmelder commented 6 months ago

I am not sure why I had the test failures yesterday (I am now using a clang build with -DCMAKE_BUILD_TYPE=Release. I will check if I can reproduce it with the debug build).

Well turns out that configuring libobjc2 with -DCMAKE_BUILD_TYPE=Debug produces the list of test failures from yesterday. Building in release mode produces only the test failures mentioned above. I guess that cmake uses the debug build type by default.

Here a test with llvm-project @ 360996ac5ad26714a6ddbee45730fbcfb7dc3eea:

The following tests FAILED:
     51 - FastPathAlloc (Subprocess aborted)
     52 - FastPathAlloc_optimised (Subprocess aborted)
     53 - FastPathAlloc_legacy (Subprocess aborted)
     54 - FastPathAlloc_legacy_optimised (Subprocess aborted)

So this should be related to the clang changes :/

hmelder commented 5 months ago

I disabled shouldUseARCFunctionsForRetainRelease(), and now all tests (minus the legacy FastPathAlloc test) are passing. Seems like there is a problem with objc_retain, etc. codegen.

--- a/clang/include/clang/Basic/ObjCRuntime.h
+++ b/clang/include/clang/Basic/ObjCRuntime.h
@@ -211,7 +211,7 @@ public:
     case GCC:
       return false;
     case GNUstep:
-      return true;
+      return false;
     case ObjFW:
       return false;
     }
hmelder commented 5 months ago

https://github.com/gnustep/libobjc2/blob/65280908eb3227660be4e65430b08d15d9a414c6/arc.mm#L361-L380

[obj release] is replaced by objc_release resulting in an infinite recursion. So it seems like objc_class_flag_fast_arc is not set.

Patch

Why not call retain, release, and autorelease using objc_msgSend, to avoid replacement?

commit 3adbaaaa16aeafeaa5f09adf2f916f985e0412fe (HEAD -> direct-methods-changes)
Author: hmelder <service@hugomelder.com>
Date:   Sat Jan 13 02:20:56 2024 +0100

    ARC: Call retain, release, autorelease using objc_msgSend directly

diff --git a/arc.mm b/arc.mm
index 5a4cc89..3c53e44 100644
--- a/arc.mm
+++ b/arc.mm
@@ -310,7 +310,7 @@ static inline id retain(id obj, BOOL isWeak)
    {
        return retain_fast(obj, isWeak);
    }
-   return [obj retain];
+   return objc_msgSend(obj, @selector(retain));
 }

 extern "C" OBJC_PUBLIC BOOL objc_release_fast_no_destroy_np(id obj)
@@ -376,7 +376,7 @@ static inline void release(id obj)
        objc_release_fast_np(obj);
        return;
    }
-   [obj release];
+   objc_msgSend(obj, @selector(release));
 }

 static inline void initAutorelease(void)
@@ -436,7 +436,7 @@ static inline id autorelease(id obj)
        }
        return obj;
    }
-   return [obj autorelease];
+   return objc_msgSend(obj, @selector(autorelease));
 }

 extern "C" OBJC_PUBLIC unsigned long objc_arc_autorelease_count_np(void)
hmelder commented 5 months ago

And for the failing legacy tests:

commit 75b9dd39b2b4e4ff5750f511c6c60f504ce6a933 (HEAD -> direct-methods-changes)
Author: hmelder <service@hugomelder.com>
Date:   Sat Jan 13 02:25:22 2024 +0100

    Move non-legacy unit tests into NEW_TESTS

diff --git a/Test/CMakeLists.txt b/Test/CMakeLists.txt
index fc190df..7743cdc 100644
--- a/Test/CMakeLists.txt
+++ b/Test/CMakeLists.txt
@@ -20,11 +20,9 @@ set(TESTS
    BlockTest_arc.m
    ConstantString.m
    Category.m
-   DirectMethods.m
    ExceptionTest.m
    FastARC.m
    FastARCPool.m
-   FastPathAlloc.m
    FastRefCount.m
    Forward.m
    ManyManySelectors.m
@@ -86,6 +84,8 @@ endif()
 # shouldn't be run in legacy mode.
 set(NEW_TESTS
    category_properties.m
+   DirectMethods.m
+   FastPathAlloc.m
 )

 remove_definitions(-D__OBJC_RUNTIME_INTERNAL__=1)
davidchisnall commented 5 months ago

Thanks. Well also need the fallback path for platforms with no assembly, but that’s easy for me to add.

davidchisnall commented 5 months ago

I believe this is now all fixed.

davidchisnall commented 5 months ago

LLVM PR is here:

https://github.com/llvm/llvm-project/pull/78030

Please test!

hmelder commented 5 months ago

I believe this is now all fixed.

Great! Thank you for implementing this :)

davidchisnall commented 5 months ago

Can someone run the -base test suite with this and the clang changes?

hmelder commented 5 months ago

Can someone run the -base test suite with this and the clang changes?

   9643 Passed tests
     34 Dashed hopes
      1 Skipped set
      0 Failed set
davidchisnall commented 5 months ago

Do we see any noticeable code size change between the two (for -base, with the old and new compilers)?

hmelder commented 5 months ago

I didn't know that libgnustep-base.so is this big.

We have shaved about ~2MB off the shared library.

Stat

Clang with changes from PR:

hugo@libobjc2:~/libs-base$ stat /usr/local/lib/libgnustep-base.so.1.29.0
  File: /usr/local/lib/libgnustep-base.so.1.29.0
  Size: 12693576    Blocks: 24776      IO Block: 4096   regular file
Device: 24h/36d Inode: 1635115     Links: 1
Access: (0755/-rwxr-xr-x)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2024-01-13 21:45:03.466030044 +0100
Modify: 2024-01-13 21:45:03.484030115 +0100
Change: 2024-01-14 11:11:39.554747674 +0100
 Birth: 2024-01-14 11:11:39.542747847 +0100

Clang 14:

hugo@libobjc2:~/libs-base$ stat /usr/local/lib/libgnustep-base.so.1.29.0
  File: /usr/local/lib/libgnustep-base.so.1.29.0
  Size: 14734176    Blocks: 28768      IO Block: 4096   regular file
Device: 24h/36d Inode: 1640092     Links: 1
Access: (0755/-rwxr-xr-x)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2024-01-14 11:19:37.063676199 +0100
Modify: 2024-01-14 11:19:37.091675786 +0100
Change: 2024-01-14 11:20:10.209187899 +0100
 Birth: 2024-01-14 11:20:10.204187972 +0100

Configuration

GNUstep Make

export DEB_HOST_MULTIARCH=aarch64-linux-gnu
./configure  --with-layout=debian \
        --enable-native-objc-exceptions \
        --enable-objc-arc \
        --prefix=/usr \
        --with-runtime-abi=gnustep-2.2 \
        --with-library-combo=ng-gnu-gnu \
    CC="clang" CXX="clang++" CPP="clang -E" LDFLAGS="-fuse-ld=lld -L/usr/lib/$(DEB_HOST_MULTIARCH)" SHELLPROG=/bin/bash GNUMAKE=make

GNUstep Base

./configure
davidchisnall commented 5 months ago

Looks like roughly a 14% reduction. Very nice to have. I'll add this to the release notes.