gnustep / libobjc2

Objective-C runtime library intended for use with Clang.
http://www.gnustep.org/
MIT License
426 stars 116 forks source link

Crash when accessing weak reference set in dealloc when using ARC #214

Closed triplef closed 2 years ago

triplef commented 2 years ago

Using ARC, accessing a weak reference that has been set to a reference to a deallocating object in its dealloc method crashes with the following backtrace (tested on Windows x64_64, not reproducible on Android x64_64):

Exception thrown at 0x00007FFBF0B7AC70 (objc.dll) in app.exe: 0xC0000005: Access violation reading location 0xFFFFFFFFFFFFFFFF.

 1      objc.dll!objc_test_class_flag(objc_class * aClass=0xdddddddddddddddd, objc_class_flags flag=objc_class_flag_permanent_instances) class.h:384
 2      objc.dll!objc_loadWeakRetained(objc_object * * addr=0x00007ff6c5476008) arc.mm:912

Reproduction:

static __weak id weakRef;

@interface TestObjectCrash : NSObject
@end

@implementation TestObjectCrash
- (void)dealloc
{
  weakRef = self;
}
@end

void main() {
  {
    [[TestObjectCrash alloc] init]; // will be immediately released
  }
  id obj = weakRef; // CRASH
  NSLog(@"obj: %@", obj);
}

@Graham--M could your changes in #200 and #204 possibly related to this?

davidchisnall commented 2 years ago

This looks interesting, can you raise a PR adding it as a test case? If I understand correctly the sequence is:

  1. Deallocation starts, the object is marked as deallocating.
  2. The -dealloc method is invoked.
  3. The object is stored into a weak reference.
  4. The deallocation finishes and the object is destroyed.
  5. The weak reference is loaded.

I believe the bug is in step 3: storing a deallocating object into a weak reference should probably store nil. The fix should be a check in the store-weak implementation to check that the object is not deallocating.

Graham--M commented 2 years ago

@Graham--M could your changes in #200 and #204 possibly related to this?

That bug exists prior to my commits but I am kicking myself for not noticing that given how similar it was to the block capture bug.

It is fixed by changing the if statements guarding weak reference incrementing in objc_initWeak() and objc_storeWeak() to:

if (nil != obj && object_getRetainCount_np(obj))

Graham--M commented 2 years ago

It is fixed by changing the if statements guarding weak reference incrementing in objc_initWeak() and objc_storeWeak() to: if (nil != obj && object_getRetainCount_np(obj))

Thinking about it, that also would mean that you cannot load from the weak reference when you are still in the dealloc method. Maybe deleting the weak references has to happen during object_dispose?

Edit: Never mind. I wasn't sure what the specification in clang even meant by 'begun deallocation' so I tried the original test case on OS X and the Apple runtime won't let you take a weak reference at all:

objc[399]: Cannot form weak reference to instance (0x7f8cc1408720) of class TestObjectCrash. It is possible that this object was over-released, or is in the process of deallocation. Illegal instruction: 4

triplef commented 2 years ago

Thanks for your thoughts @Graham--M! I tried your original suggestion, but it makes this test fail:

 82/190 Test  #85: ResurrectInDealloc_arc ............................***Exception: SegFault  0.20 sec
Test running?
Killing strong
strong self: 0x196afa8
Killing weak
weak self: 0x196c110
Graham--M commented 2 years ago

I was just about to post that I put the condition in the wrong place because those two functions need to return null according to the clang specification. However the fix I was about to suggest triggers a failure in ARCTest_arc so I need to see why that fails.

Graham--M commented 2 years ago

The problem I was hitting was because I forgot about constant strings not having a ref count and I was trying to avoid setting the weak ref flag if it wasn't already set, but that doesn't matter much because the weak reference deletion has already happened before we call dealloc.

Directly before the if for the increment statements you can do:

if (!object_getRetainCount_np(obj))
{
    *addr = nil;
    return nil;
}

and that should fix the issues with the test failure you had.

triplef commented 2 years ago

Thanks @Graham--M, that seems to have done the trick! See #215.