Deadlock nightmare! - Githubissues

hartbit commented 10 years ago

Hi there! I'm using HEAD, and I'm experiencing a deadlock. Care to have a look?

Stack trace for Thread 1

#0  0x030737ca in __psynch_cvwait ()
#1  0x03038d1d in _pthread_cond_wait ()
#2  0x0303abd9 in pthread_cond_wait$UNIX2003 ()
#3  0x0094821b in -[__NSOperationInternal _waitUntilFinished:] ()
#4  0x0086f1c8 in -[NSOperation waitUntilFinished] ()
#5  0x008775ea in -[NSOperationQueue addOperations:waitUntilFinished:] ()
#6  0x000ef7f5 in -[FCModelDatabaseQueue execOnSelfSync:] at /Users/david/Development/geranium.iosapp/Pods/FCModel/FCModel/FCModelDatabaseQueue.m:56
#7  0x000efb43 in -[FCModelDatabaseQueue inDatabase:] at /Users/david/Development/geranium.iosapp/Pods/FCModel/FCModel/FCModelDatabaseQueue.m:75
#8  0x000dee9b in +[FCModel instanceFromDatabaseWithPrimaryKey:] at /Users/david/Development/geranium.iosapp/Pods/FCModel/FCModel/FCModel.m:189
#9  0x000de639 in +[FCModel instanceWithPrimaryKey:databaseRowValues:createIfNonexistent:] at /Users/david/Development/geranium.iosapp/Pods/FCModel/FCModel/FCModel.m:140
#10 0x000de226 in +[FCModel instanceWithPrimaryKey:] at /Users/david/Development/geranium.iosapp/Pods/FCModel/FCModel/FCModel.m:121

Stack trace for Thread 3

#0  0x03073802 in __psynch_mutexwait ()
#1  0x03039945 in _pthread_mutex_lock ()
#2  0x030397ac in pthread_mutex_lock ()
#3  0x0087db0d in -[NSRecursiveLock lock] ()
#4  0x008b9b28 in -[NSObject(NSKeyValueObserverRegistration) addObserver:forKeyPath:options:context:] ()
#5  0x000e66ab in __55-[FCModel initWithFieldValues:existsInDatabaseAlready:]_block_invoke at /Users/david/Development/geranium.iosapp/Pods/FCModel/FCModel/FCModel.m:646
#6  0x02a7da7a in __65-[__NSDictionaryM enumerateKeysAndObjectsWithOptions:usingBlock:]_block_invoke ()
#7  0x02a7d97e in -[__NSDictionaryM enumerateKeysAndObjectsWithOptions:usingBlock:] ()
#8  0x029e2125 in -[NSDictionary enumerateKeysAndObjectsUsingBlock:] ()
#9  0x000e5d80 in -[FCModel initWithFieldValues:existsInDatabaseAlready:] at /Users/david/Development/geranium.iosapp/Pods/FCModel/FCModel/FCModel.m:609
#10 0x000df19b in __46+[FCModel instanceFromDatabaseWithPrimaryKey:]_block_invoke at /Users/david/Development/geranium.iosapp/Pods/FCModel/FCModel/FCModel.m:192
#11 0x000efc3c in __35-[FCModelDatabaseQueue inDatabase:]_block_invoke at /Users/david/Development/geranium.iosapp/Pods/FCModel/FCModel/FCModelDatabaseQueue.m:78
#12 0x000ef062 in -[FCModelDatabaseQueueOperation main] at /Users/david/Development/geranium.iosapp/Pods/FCModel/FCModel/FCModelDatabaseQueue.m:17
#13 0x00948c79 in -[__NSOperationInternal _start:] ()
#14 0x008c59c8 in -[NSOperation start] ()
#15 0x0094af44 in __NSOQSchedule_f ()
#16 0x02d084d0 in _dispatch_client_callout ()
#17 0x02cf6047 in _dispatch_queue_drain ()
#18 0x02cf5e42 in _dispatch_queue_invoke ()
#19 0x02cf6de2 in _dispatch_root_queue_drain ()
#20 0x02cf7127 in _dispatch_worker_thread2 ()
#21 0x03037dab in _pthread_wqthread ()

marcoarment commented 10 years ago

I'm curious about that [NSRecursiveLock lock] call from the KVO implementation on thread 3. It looks like KVO takes a lock there, but that's a different lock than the blocked Thread 1 queue.

On Thread 1, what comes next in the stack after level 10? Is instanceWithPrimaryKey: being invoked from another KVO call, maybe, and the internal KVO lock is deadlocking with the recursive calls?

marcoarment commented 10 years ago

In the absence of more information, I'm going to close this for now — please comment or reopen if necessary. Thanks.

hartbit commented 10 years ago

I had not thought about a KVO-centric lock. And indeed, the stack trace on Thread 1 contains a `addObserver' method call, which ends up calling the property, which ends up doing the DB call.

What do you think about this problem? Is it an issue that should just be well documented, or do you think that FCModel can do something about it so it does not happen?

marcoarment commented 10 years ago

Well, glad we figured it out. I'm not really sure what to do about it, though. (I'm open to suggestions.)

The weird thing is that, looking at that KVO source code (assuming that's still how it's implemented), it looks like every NSObject gets its own lock (iLock there). So the only way this should be a problem is if addObserver: is recursively called on the same object, but how is that possible with FCModel?

To narrow it down a bit, could you clarify what object the enclosing addObserver call was observing, which FCModel was invoked for instanceWithPrimaryKey:, and anything else that may be relevant about its context?

hartbit commented 10 years ago

But looking at the stack trace, the DB thread shows the use of [NSRecursiveLock lock] which is different from the NSLock* iLock. I'll get more info for you once I get to work tomorow.

If you wanted to fix this, you'd need your KVO call to happen after the instanceWithPrimaryKey call has returned. Perhaps if you setup the call with an async block on the same queue?

marcoarment commented 10 years ago

iLock is set as an NSRecursiveLock in initWithInstance.

If I dispatch my KVO calls to happen after that, I believe immediately trying to update a just-loaded model would fail:

Person *p = [Person instanceWithPrimaryKey:@(1)]; // already exists in DB
p.name = @"New Name";
[p save];

Attaching the internal KVO listeners would be delayed until the next run loop, so p would have no idea that .name had changed, save would do nothing and return FCModelSaveNoChanges, and the database row would not be updated.

Another option would be to rewrite FCModel's change-tracking not to use KVO at all, probably by keeping a dictionary of the originally loaded values on each model and checking them for equality with the current values on save. This isn't necessarily worse than the KVO method, just different: it would use a bit more memory for each property, saves would incur a slight performance penalty, and didChangeValueForFieldName would need to be removed, but changing the model properties would be faster since the KVO logic wouldn't be there.

marcoarment commented 10 years ago

The more I think about rewriting the change-tracking not to use KVO, the better I think that option is. What do you think?

marcoarment commented 10 years ago

Check out commit 3ea84d7 on the new remove-internal-kvo branch. (And feel free to try it at work where this deadlock was happening. Shouldn't happen anymore.)

hartbit commented 10 years ago

Works like a charm!

I also think that the solution you went for is superior to KVO!

marcoarment commented 10 years ago

I agree. I think I'll finish it up today and merge it into master.

marcoarment / FCModel

Deadlock nightmare! #42

Stack trace for Thread 1

Stack trace for Thread 3