drewmccormack / ensembles

A synchronization framework for Core Data.
MIT License

A crash can occur during saving if there happen to be two baselines #126

drewmccormack closed this issue 10 years ago

drewmccormack commented 10 years ago

The crash happens for this stack:

```
11 com.apple.CoreData          0x00007fff8790dbc9 -[NSManagedObjectContext performBlockAndWait:] + 121
12 com.mentalfaculty.ensembles 0x000000010081e8bd -[CDEEventBuilder makeNewEventOfType:] + 253
13 com.mentalfaculty.ensembles 0x0000000100888666 -[CDESaveMonitor storeChangesForContext:changedObjectsDictionary:] + 518
14 com.mentalfaculty.ensembles 0x0000000100888200 -[CDESaveMonitor contextDidSave:] + 624
```

The method where the crash happens is fetchBaselineStoreModificationEventInManagedObjectContext. It throws an exception due to a failed NSAssert.

Need to find a way to handle this; there are a few options to consider.

ddeville commented 10 years ago

This seems to be the same crash I mentioned at NSConf (failed assertion regarding the expected number of baseline files in +[CDEStoreModificationEvent fetchStoreModificationEventWithAllowedTypes:persistentStoreIdentifier:revisionNumber:inManagedObjectContext:]).

I’ve been able to reproduce it fairly consistently but haven’t identified the cause yet.

I’ve actually been debugging it a bit this morning, and it seems to happen when I switch back and forth between a store that uses Ensembles and one that doesn’t. From an initial look, it appears that my CDEPersistentStoreEnsemble instance is not being deallocated (it doesn’t seem to be leaked on my end, but it might well be). I suspect I end up with two ensemble instances for the same ensemble, which would explain how things get messed up.

I will do more debugging and see if I can help you with this.

drewmccormack commented 10 years ago

You should deleech if you no longer want to sync. That removes local caches and related state. If you end up with two ensembles for the same store, they will definitely interfere with one another.

If you want to be sure that the pipeline of an ensemble is empty, you can try the processPendingChanges... method on the ensemble.
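For illustration, a hedged sketch of that flow. The full selector names are assumptions based on the Ensembles completion-block style (the method name is elided above); check the actual headers before relying on them:

```objc
// Sketch only: processPendingChangesWithCompletion: and
// deleechPersistentStoreWithCompletion: are assumed selector names.
[ensemble processPendingChangesWithCompletion:^(NSError *error) {
    if (error) {
        NSLog(@"Failed to flush pending changes: %@", error);
        return;
    }
    [ensemble deleechPersistentStoreWithCompletion:^(NSError *deleechError) {
        if (deleechError) NSLog(@"Deleech failed: %@", deleechError);
        // Local caches are removed; the store can now be switched out safely.
    }];
}];
```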

ddeville commented 10 years ago

Oh I see.

In my case, I am only switching to a completely different persistent store on disk (one that does not use Ensembles) within the same launch of the application, using the same persistent store coordinator.

I didn’t initially want to deleech the Ensembles store since I don’t technically want to stop syncing, I simply want to switch away from that store for a period of time.

I guess it makes sense to process the changes and deleech it before switching stores, though. I will try that out.

drewmccormack commented 10 years ago

Ah, I see what you mean now. No, you shouldn't have to deleech. I thought you were trying to set up a new ensemble.

So this is an interesting case I hadn't really considered closely. Ensembles will see the save to the 'other' store, notice that the saving context has access to the syncing store, and then generate a save event, which would presumably end up empty since you are not actually saving into the synced store.
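As a rough illustration of that behavior (this is not the actual CDESaveMonitor code; the monitoredStore property and the filtering step are hypothetical), a save observer can distinguish a save that merely shares a coordinator with the synced store from one that actually touched it by checking each object's persistent store:

```objc
// Hypothetical sketch: a contextDidSave: observer that ignores saves whose
// changed objects all live in a different store on the same coordinator.
- (void)contextDidSave:(NSNotification *)notification
{
    NSManagedObjectContext *context = notification.object;
    if (![context.persistentStoreCoordinator.persistentStores containsObject:self.monitoredStore]) return;

    NSMutableSet *changed = [NSMutableSet set];
    [changed unionSet:notification.userInfo[NSInsertedObjectsKey] ?: [NSSet set]];
    [changed unionSet:notification.userInfo[NSUpdatedObjectsKey] ?: [NSSet set]];
    [changed unionSet:notification.userInfo[NSDeletedObjectsKey] ?: [NSSet set]];

    NSSet *relevant = [changed objectsPassingTest:^BOOL(NSManagedObject *object, BOOL *stop) {
        return object.objectID.persistentStore == self.monitoredStore;
    }];
    if (relevant.count == 0) return; // Save went entirely to the 'other' store

    // ... build a save event from `relevant` ...
}
```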

I suspect the reason for the crash is that when this save happens, a merge is underway, and there are momentarily two baselines in the event store. The merge would typically reduce these to one, but if a save occurred at just the wrong time, you could get the crash described.

I think the solution is simply to make the fetch method retrieve all baselines and return the most recent, rather than raising an exception. I don't think there is really an issue here. We should just accept that there may be multiple baselines at various times, and handle that.
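A minimal sketch of what such a lenient fetch might look like; the entity name, type constant, and sort key are assumptions for illustration, not the actual Ensembles internals:

```objc
// Hypothetical sketch of a baseline fetch that tolerates multiple baselines
// by returning the most recent one instead of asserting.
- (NSManagedObject *)fetchMostRecentBaselineInContext:(NSManagedObjectContext *)context
{
    NSFetchRequest *fetch = [NSFetchRequest fetchRequestWithEntityName:@"CDEStoreModificationEvent"]; // assumed entity name
    fetch.predicate = [NSPredicate predicateWithFormat:@"type = %d", CDEStoreModificationEventTypeBaseline]; // assumed constant
    fetch.sortDescriptors = @[[NSSortDescriptor sortDescriptorWithKey:@"globalCount" ascending:NO]]; // assumed sort key
    fetch.fetchLimit = 1;

    NSError *error = nil;
    NSArray *results = [context executeFetchRequest:fetch error:&error];
    if (!results) NSLog(@"Baseline fetch failed: %@", error);
    return results.firstObject; // nil if no baseline exists yet
}
```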

drewmccormack commented 10 years ago

I've pushed a change to the baseline fetch routine which I think should fix this issue. Let me know if it doesn't.

ddeville commented 10 years ago

This definitely fixes the multiple baselines failed assertion!

I’ve also finally managed to figure out the cause of the issue I was having earlier. The CDEPersistentStoreEnsemble instance is only deallocated after all the work enqueued on its operation queue has completed, since most of the tasks capture self and therefore keep the ensemble alive for as long as the queue is running. If the ensemble is deleeched, all operations are cancelled, so deallocation should happen by the time the completion block is invoked (the queue is serial, so assuming no race where a merge could start while the ensemble is being deleeched, the queue should be empty at that point). If the ensemble instance is simply released by its owner, it will hang around until all of that work has completed.
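A tiny illustration of the lifetime behavior described above (the queue and method names here are hypothetical):

```objc
// Each enqueued block strongly captures self, so the ensemble cannot be
// deallocated until every queued operation has run or been cancelled.
[self.operationQueue addOperationWithBlock:^{
    [self mergeWithCompletion:nil]; // retains self for the duration of the block
}];
```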

When switching my synced persistent store off and on, I would create a new CDEPersistentStoreEnsemble instance for the persistent store location. However, I would frequently run into the situation where the previous instance had not yet been deallocated when the new one was created. That’s how I ended up with multiple instances, all creating events for the same persistent store, which consequently led to a bunch of failed assertions in the various fetchStoreModification… methods.

Since I didn’t want to leech/deleech the store every time it was mounted, I’ve changed my code to reuse the ensemble instance for a given store rather than creating a new one. It seems to be working very nicely so far, so all good :)
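For illustration, a caching scheme along these lines; the initializer shown follows the Ensembles 1.x style, but the surrounding property and helper names are hypothetical:

```objc
// Reuse a single ensemble per store URL instead of recreating one on every
// mount, so a lingering old instance can never coexist with a new one.
- (CDEPersistentStoreEnsemble *)ensembleForStoreAtURL:(NSURL *)storeURL
{
    CDEPersistentStoreEnsemble *ensemble = self.ensemblesByStoreURL[storeURL];
    if (!ensemble) {
        ensemble = [[CDEPersistentStoreEnsemble alloc] initWithEnsembleIdentifier:@"MainStore"
                                                               persistentStoreURL:storeURL
                                                            managedObjectModelURL:self.modelURL
                                                                  cloudFileSystem:self.cloudFileSystem];
        self.ensemblesByStoreURL[storeURL] = ensemble; // reuse on subsequent mounts
    }
    return ensemble;
}
```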

Thanks for your help!

drewmccormack commented 10 years ago

Sounds good. Note also that there is a processPendingChanges... method on the ensemble that you could use to 'wait' for it to flush its pipeline.

Drew


drewmccormack commented 10 years ago

I'm not quite sure I understand the sync off/on setup you are trying to achieve, but I want to give you a warning: if any changes will be made to the persistent store data 'offline', while the ensemble is not monitoring it, you should deleech. You should not modify a store when there is no ensemble; if you want to do that, deleech first, then leech again later. Leeching reimports all the local data, but at least you can be sure you won't lose any changes made while the ensemble was not watching.

Hopefully that is clear. I'm not sure it relates to your setup, but I thought I should point it out. In summary: only ever make changes to a store's data when the leeched ensemble object exists. Otherwise, deleech first, or you may lose data because the ensemble will miss some changes.
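In code, that rule might look like the following sketch; the selector names are assumptions based on the Ensembles leech/deleech API, not verified signatures:

```objc
// Assumed selectors: deleechPersistentStoreWithCompletion: and
// leechPersistentStoreWithCompletion:. Deleech before editing the store
// without an ensemble, then leech again afterwards.
[ensemble deleechPersistentStoreWithCompletion:^(NSError *error) {
    if (error) { NSLog(@"Deleech failed: %@", error); return; }

    // ... make 'offline' changes to the store here, unwatched ...

    [ensemble leechPersistentStoreWithCompletion:^(NSError *leechError) {
        if (leechError) NSLog(@"Leech failed: %@", leechError);
        // Leeching reimports all local data, so the offline edits are captured.
    }];
}];
```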

Drew


ddeville commented 10 years ago

Thanks for the heads up.

Basically, I’m using two very distinct stores on disk for Local and iCloud. So yeah, Ensembles is always on (and leeched) when the iCloud persistent store is mounted.

So the user can choose to view her local library or her library in iCloud instead of having one unified library that can have sync turned on or off.