Open colinmorelli opened 7 years ago
Perhaps a less-risky approach could be to optionally expose a separate main queue context for NSFRC
s to be created from, keeping the existing structure as-is.
Hi, thanks for the proposal :)
The setup you propose is effectively similar to the one CoreStore uses (see implementation below) because CoreStore enforces a read-only main context, which means its parent-child relationship with the background saving context is one-way: it's only used for fetching. Theoretically, there should be an advantage to the current setup as objects already in the parent context's memory should not be fetched from disk again. (Although, this behavior is most likely the cause of the FRC bug you mentioned).
The change you propose actually just needs a few lines to implement:
// NSManagedObjectContext+Setup.swift
@nonobjc
internal static func mainContextForRootContext(_ rootContext: NSManagedObjectContext) -> NSManagedObjectContext {
let context = NSManagedObjectContext(concurrencyType: .mainQueueConcurrencyType)
// context.parent = rootContext // current
context.persistentStoreCoordinator = rootContext.persistentStoreCoordinator! // proposed
// DataStack.swift
public required init(model: NSManagedObjectModel, migrationChain: MigrationChain = nil) {
// ...
self.rootSavingContext.parentStack = self
self.mainContext.parentStack = self // proposed
self.mainContext.isDataStackContext = true
}
You can go ahead and test benchmarks if you wish. In any case, I'll need to run this change first in a few test runs because I worry this may affect deadlock frequency (or maybe reduce it, I'm not sure)
@colinmorelli Any performance data / feedback would be great as we're also running into this issue.
@colinmorelli @MattNewberry I prepared prototype/mainContextToPSC
branch we can use to test this change. The unit test errors can be ignored, they are caused by ambiguous configurations which are not allowed in the first place.
@JohnEstropia Thanks, we'll check it out. In our case, we're not using CoreStore as we've long rolled our own stack. That said, we handle very large initial data set downloads, and have tried to shift these downloads to the background so our users aren't stuck at a loading screen. However, as @colinmorelli spoke about as well, we've had issue with merging background changes into the main thread, and largely believe this is due to NSFetchedResultsController
delegate methods constantly re-firing and reloading its entire datasets per save. We import in batches of 1,000, and have yet to come up with an architecture that facilitates speedy background -> main thread merges.
@colinmorelli Hi, did you get the chance to test this change? So far there doesn't seem to be any effects (good or bad) on the apps we work on. Our use case may be limited so I hope others can try this out as well.
I started testing this change and can confirm that PSC -> Main Context Core Data stack setup prevents the loading all objects directly in memory at the time the FRC is instantiated.
More details over time.
@ruslanskorb Thanks! I really appreciate all inputs about this change. I haven't seen any problem with this so far as well. If it proves stable I'll include this in the CoreStore 4.0 milestone.
I found two problems at the moment. The more the number of fetched objects, the more noticeable each of them. Each of these problems belongs to ListMonitor
.
1) When the number of fetched objects more than 4-6k there are noticeable lags when accessing a particular fetched object. This PR fixes it.
2) After saving rootSavingContext
there are noticeable lags when accessing a particular fetched object. It looks like NSFetchedResultsController
works incorrectly with the changed mainContext
. To fix it we can recreate ListMonitor
and everything will work fine again.
@ruslanskorb Thanks for trying this out!
When the number of fetched objects more than 4-6k there are noticeable lags when accessing a particular fetched object. This PR fixes it.
Thanks for the PR! I merged #157. It looks like we can use the same optimization in other places.
After saving rootSavingContext there are noticeable lags when accessing a particular fetched object. It looks like NSFetchedResultsController works incorrectly with the changed mainContext. To fix it we can recreate ListMonitor and everything will work fine again.
Does this happen with the develop branch as well?
Does this happen with the develop branch as well?
No, this only happens when Main Context is connected directly to PSC.
@ruslanskorb I have a hunch that this is caused by the NSFetchedResultsController workaround done during merging. Are you saving transactions asynchronously or synchronously?
@JohnEstropia It works the same even if we replace observerForDidSaveNotification
by automaticallyMergesChangesFromParent
(available on iOS 10).
And, I tested saving transactions asynchronously.
@ruslanskorb Thanks for the info! I'm pretty sure that those lags are caused by this.
In theory, NSManagedObjectContext.object(with:)
is designed to fetch updated objects if the main context don't have them yet (so it can let NSFetchedResultsControllers process diffs), but "fetch" here means traversing the parent tree for those objects' info. In our case, the parent is the NSPersistentStoreCoordinator, and any fetch from there will trigger locking.
iOS 10 was supposed to improve this by moving those locks to SQLite-level. But it turns out that concurrency on those locks have a limited pool size so your use case of thousands of objects are causing hiccups.
In any case, I think this stack rearchitecture is not feasible yet. With regards to NSFRCs, lags are harder to manage than memory usage, so I think it's better to err on the previous behavior.
@colinmorelli @ruslanskorb @MattNewberry Thank you for your input on this. If there are no other observations/solutions I'm closing this issue in a couple of days. Feel free to comment anytime.
@JohnEstropia In my previous comment I meant that NSManagedObjectContext.object(with:)
call in NSManagedObjectContext+Setup
does not cause those lags. I tried to comment out this code but nothing has changed. After that I tried to comment out observerForDidSaveNotification
and set automaticallyMergesChangesFromParent
to true
. And in this case also nothing has changed. Do you mean that NSManagedObjectContext.object(with:)
is called by NSFRC internally?
As I can see in Instruments after merging when we are accessing an object in NSFRC it performs a heavy request to the database. And so for each object. Even for the one to which we just accessed. It looks like NSFRC ignores cached objects after merging.
If we recreate NSFRC it will again perform only lightweight requests to the database.
Also I noticed a bunch of random memory crashes on iOS 9 with PSC -> Main Context Core Data stack setup. Recreating NSFRC in listMonitorDidChange(_:)
also fixes them.
After that I tried to comment out observerForDidSaveNotification and set automaticallyMergesChangesFromParent to true. And in this case also nothing has changed. Do you mean that NSManagedObjectContext.object(with:) is called by NSFRC internally?
I'm inclined to think so. Actually I don't see how NSFRC will find updates to its objects without any sort of fetching. Let's suppose that 100 objects satisfy an NSFRC's predicate but it initially fetches only fetchBatchSize = 20
in memory. If an external transaction updates any object from those 80, the NSFRC needs to know if the 20 it knows should be the same 20 it needs to keep (sort keys could have been updated for example). The only way to get those info is to fetch on the main queue (because of NSMainQueueConcurrencyType) from the NSPSC (which locks the store) .
On the original CoreStore stack, the context can get those info from the cached objects from it's parent root context. If a fetch is required, the fetch runs from the root context, which is on a background queue (NSPrivateQueueConcurrencyType), so the only danger of lags are from NSPSC locks.
(Please correct me if anything about how I understood this works is wrong)
As I can see in Instruments after merging when we are accessing an object in NSFRC it performs a heavy request to the database. And so for each object. Even for the one to which we just accessed. It looks like NSFRC ignores cached objects after merging.
I was wondering why after a month of using this branch we didn't notice any effects in our debug builds. Then I realized that we fetch incremental data from our server only 20~50 records at a time, probably not enough to push limits of the SQLite locking pools. Anyway, for me I/O will generally be a bigger performance cost compared to memory usage (that's what caches are for after all) so I think we'd rather keep the original stack setup.
@ruslanskorb Thanks again for the detailed investigation :)
I'm here again :-]
I'm inclined to think so. Actually I don't see how NSFRC will find updates to its objects without any sort of fetching.
NSFRC gets those updates from NSManagedObjectContextDidSave
notification.
In most of cases this information is enough to update the fetched objects and their order.
If you add -com.apple.CoreData.SQLDebug 1
to Arguments Passed on Launch you will not see any sort of fetching from the database when updating objects.
I was wondering why after a month of using this branch we didn't notice any effects in our debug builds.
I do not see the same problems in CoreStoreDemo project. Probably, there is a dependence on the complexity of the database schema.
In any case, it would be nice to be able to choose one of the above discussed configurations in the next version of CoreStore when initializing the stack.
@ruslanskorb Thank you for taking time to test this :) Sorry I'm not sure I understood which side you were explaining. All your observations are from the prototype/mainContextToPSC
branch, is that correct?
In your previous post you have mentioned:
As I can see in Instruments after merging when we are accessing an object in NSFRC it performs a heavy request to the database. And so for each object. Even for the one to which we just accessed. It looks like NSFRC ignores cached objects after merging.
But you mentioned that
If you add -com.apple.CoreData.SQLDebug 1 to Arguments Passed on Launch you will not see any sort of fetching from the database when updating objects.
so I'm a little confused where the performance drops.
I do not see the same problems in CoreStoreDemo project. Probably, there is a dependence on the complexity of the database schema.
Yeah, the CoreStoreDemo app doesn't have a demo for multiple, reasonably sized imports (sorry).
In any case, it would be nice to be able to choose one of the above discussed configurations in the next version of CoreStore when initializing the stack.
Heavy lags on each update defeats the purpose of changing the stack structure for performance, and I feel the degradation more than outweighs the benefit (fixing the fetchBatchSize
bug). Are you experiencing a point where NSFRC's memory usage reaches intolerable levels?
All your observations are from the
prototype/mainContextToPSC
branch, is that correct?
Yes, it is.
As I can see in Instruments after merging when we are accessing an object in NSFRC it performs a heavy request to the database. And so for each object. Even for the one to which we just accessed. It looks like NSFRC ignores cached objects after merging.
I see this behavior in a real application with a complex data model.
If you add -com.apple.CoreData.SQLDebug 1 to Arguments Passed on Launch you will not see any sort of fetching from the database when updating objects.
At the same time, I see a different behavior in the test application with a simple data model.
Heavy lags on each update defeats the purpose of changing the stack structure for performance
These lags are easily eliminated by re-creating the NSFRC.
Are you experiencing a point where NSFRC's memory usage reaches intolerable levels?
At the moment, no :-] But, the level of memory usage affects the overall performance of the application. And when you have 100 000 objects in memory it is noticeable.
Anyway, there are other cases when NSManagedObjectContext
will load all objects in memory at the same time, not only when NSFRC is instantiated.
CoreStore is currently, to my knowledge, using a fairly common Core Data stack setup of:
PSC -> Background Saving Context -> Main Context
I'm proposing a change (or investigation) into:
PSC -> Background Saving Context PSC -> Main Context
I've got two primary reasons for this:
Performance If this post is to be trusted, it looks like you can score considerably improved performance using this layout. At the very least, I can't imagine it will cost any performance, but investigation is warranted here.
Fixing Core Data Bugs There's an fairly significant and incredibly annoying that has been in Core Data for who knows how long.
NSFetchedResultsControllers
which are managed by a context that is not directly connected to a PSC will completely ignorefetchBatchSize
and load all objects directly in memory at the time the FRC is instantiated.This can be avoided if the main context is attached directly to the PSC, giving back some considerably performance benefits to
NSFetchedResultsControllers
that are fronting large data sets.