magicalpanda / MagicalRecord

Super Awesome Easy Fetching for Core Data!

Finding a large number of entities is very, very slow. #876

Open xujin opened 10 years ago

xujin commented 10 years ago

Fetching is very slow when I traverse a large number of entities.

tonyarnold commented 10 years ago

Hi @xujin — could you provide some (any) details of how you're traversing the entities? Sample code or a small sample project would help me get to the bottom of this much faster.

yorkwang commented 10 years ago

I ran into the same problem. My app has several persistent stores, and since only one store can be set as the default, MR_contextWithStoreCoordinator has to be used when searching the other stores. My tests show that searching via MR_contextWithStoreCoordinator is very slow; sometimes it takes more than 5s for a single store containing about 10K entities. The Android version of this app uses greenDAO with the same database, and it takes only about 0.5s to search all the databases, even on a low-end phone, and 1~3s to import about 10K entities. The iOS version needs about 5~15s for the same import. It varies by device, but the Android version is much faster overall. Maybe I'm not doing this the right way. Here is my search code and the test results:

2014-11-02 22:24:57.471 mangabird[6220:390f] dmzj->3.181459
2014-11-02 22:24:58.236 mangabird[6220:390f] manhuadao->0.757238
2014-11-02 22:24:59.509 mangabird[6220:390f] 99comic->1.266808
2014-11-02 22:25:00.141 mangabird[6220:390f] jide->0.625673 (use MR_defaultContext, if using MR_contextWithStoreCoordinator, it will be more than 5s)
2014-11-02 22:25:00.723 mangabird[6220:390f] bengou->0.576320
2014-11-02 22:25:01.752 mangabird[6220:390f] manhua8->1.021073
NSMutableArray *temp = [NSMutableArray array];
NSPredicate *predicate = [NSPredicate predicateWithFormat:@"name contains[cd] %@ OR author contains[cd] %@", searchText, searchText];
NSArray *dbNames = [[AppDelegate sharedAppEntranceJson] websiteNamesByLan:[StaticParameters mangaLanguase]];
NSString *dbCurrent = [StaticParameters websiteInUse];
for (NSString *dbName in dbNames) {
    NSTimeInterval begin = [[NSDate date] timeIntervalSince1970];
    NSString *storeName = (NSString *)[StaticParameters getCoredataNameByWebsite:dbName];
    NSManagedObjectContext *localContext;
    if ([dbCurrent isEqualToString:dbName]) {
        localContext = [NSManagedObjectContext MR_defaultContext];
    } else {
        localContext = [NSManagedObjectContext MR_contextWithStoreCoordinator:[NSPersistentStoreCoordinator MR_coordinatorWithAutoMigratingSqliteStoreNamed:storeName]];
    }
    NSArray *array = [Comic MR_findAllWithPredicate:predicate inContext:localContext];
    NSLog(@"%@->%f", dbName, [[NSDate date] timeIntervalSince1970] - begin);

    // remove duplicates
    array = [array valueForKeyPath:@"@distinctUnionOfObjects.self"];
    NSMutableArray *resTemp = [self searchItemsByComics:array website:dbName];
    [temp addObjectsFromArray:resTemp];
}
yorkwang commented 10 years ago

After optimizing, it now takes less than 2s overall on my iPad (iOS 8.1), but that still doesn't meet my requirements.

NSMutableArray *temp = [NSMutableArray array];
NSPredicate *predicate = [NSPredicate predicateWithFormat:@"name contains[c] %@ OR author contains[c] %@", searchText, searchText];
NSFetchRequest *fetchRequest = [[NSFetchRequest alloc] init];
[fetchRequest setFetchBatchSize:10];
[fetchRequest setPredicate:predicate];
[fetchRequest setEntity:[Comic MR_entityDescription]];
NSArray *dbNames = [[AppDelegate sharedAppEntranceJson] websiteNamesByLan:[StaticParameters mangaLanguase]];
NSTimeInterval begin1 = [[NSDate date] timeIntervalSince1970];

for (NSString *dbName in dbNames) {
    NSTimeInterval begin = [[NSDate date] timeIntervalSince1970];
    NSString *storeName = (NSString *)[StaticParameters getCoredataNameByWebsite:dbName];
    NSManagedObjectContext *localContext = [NSManagedObjectContext MR_contextWithStoreCoordinator:[NSPersistentStoreCoordinator MR_coordinatorWithSqliteStoreNamed:storeName]];

    NSArray *array = [localContext executeFetchRequest:fetchRequest error:nil];

    NSLog(@"%@->%f", dbName, [[NSDate date] timeIntervalSince1970] - begin);

    // remove duplicates
    array = [array valueForKeyPath:@"@distinctUnionOfObjects.self"];
    NSMutableArray *resTemp = [self searchItemsByComics:array website:dbName];
    [temp addObjectsFromArray:resTemp];
}
NSLog(@"all->%f", [[NSDate date] timeIntervalSince1970] - begin1);
NSLog(@"all->%f",[[NSDate date] timeIntervalSince1970] - begin1);

Then I read this post: http://www.objc.io/issue-4/SQLite-instead-of-core-data.html. It seems Core Data's performance isn't as good as SQLite's when dealing with 10K+ entities. In my case the data model has only a few plain-text properties, so SQLite should be a better choice. Still, MagicalRecord is one of the best libraries I've ever used. The problem really has nothing to do with MagicalRecord; it's about Core Data vs. SQLite, and we should choose one of them (or both) in the right way.

tonyarnold commented 10 years ago

I'm not entirely sure what it is that you're doing above, so I can't provide too much specific advice, but you'd be much better served constructing a fetch request that fetches everything you need in one go rather than executing within a for loop, and adding some indexes to commonly fetched/sorted attributes of your entities.
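For what it's worth, one pattern worth testing is attaching every SQLite file to a single NSPersistentStoreCoordinator, so that one fetch request spans all of the stores at once. This is only a sketch, and it assumes all of your store files were created from the same managed object model; `storeURLs` and the entity name are placeholders for your own values:

```objc
#import <CoreData/CoreData.h>

// Sketch: attach several SQLite store files to one coordinator so a
// single fetch request searches all of them in one pass.
NSManagedObjectModel *model = [NSManagedObjectModel mergedModelFromBundles:nil];
NSPersistentStoreCoordinator *coordinator =
    [[NSPersistentStoreCoordinator alloc] initWithManagedObjectModel:model];

for (NSURL *storeURL in storeURLs) { // storeURLs: your .sqlite file URLs
    NSError *error = nil;
    if (![coordinator addPersistentStoreWithType:NSSQLiteStoreType
                                   configuration:nil
                                             URL:storeURL
                                         options:nil
                                           error:&error]) {
        NSLog(@"Failed to add store %@: %@", storeURL, error);
    }
}

NSManagedObjectContext *context =
    [[NSManagedObjectContext alloc] initWithConcurrencyType:NSMainQueueConcurrencyType];
context.persistentStoreCoordinator = coordinator;

// One fetch now covers every attached store.
NSFetchRequest *request = [NSFetchRequest fetchRequestWithEntityName:@"Comic"];
request.predicate = [NSPredicate predicateWithFormat:
    @"name CONTAINS[cd] %@ OR author CONTAINS[cd] %@", searchText, searchText];
NSError *fetchError = nil;
NSArray *results = [context executeFetchRequest:request error:&fetchError];
```

If you need to know which store a result came from, you can inspect `[[object objectID] persistentStore]` on each returned object.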

Two things I'll point out about the objc.io article:

  1. Core Data is never going to be as fast as raw SQLite, but that's not why you use it. If bare metal numbers are your concern, you shouldn't be using high-level abstractions like Core Data.
  2. The numbers that Brent presents don't really apply to Core Data's standard use case. Brent was trying to mark 10k entities as read/unread in a single go; that was definitely a bottleneck in previous versions, but Apple added a new batch-update API for exactly this in OS X 10.10 and iOS 8.0. Bulk data has never been Core Data's forte.
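For reference, the iOS 8 / OS X 10.10 API I'm referring to is NSBatchUpdateRequest, which pushes a bulk update down to the store without faulting objects into memory. A rough sketch against Brent's read/unread example (the "Article" entity and "read" attribute are made-up names, and `context` is assumed to be an existing NSManagedObjectContext):

```objc
#import <CoreData/CoreData.h>

// Sketch: mark every Article as read in one pass at the store level.
NSBatchUpdateRequest *batch =
    [[NSBatchUpdateRequest alloc] initWithEntityName:@"Article"];
batch.propertiesToUpdate = @{ @"read" : @YES };
batch.resultType = NSUpdatedObjectsCountResultType;

NSError *error = nil;
NSBatchUpdateResult *result =
    (NSBatchUpdateResult *)[context executeRequest:batch error:&error];
NSLog(@"Updated %@ rows", result.result);
// Note: batch updates bypass the context, so refresh any affected
// in-memory objects before relying on their values.
```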

Happy to provide further advice if you want — but I really need to do some performance tests around these things as well. I've never actually dealt with an iOS app that has tens of thousands of rows of data in normal use.

yorkwang commented 10 years ago

Yes, Core Data has many advantages. My app has several persistent stores, with one fetch per store; I don't know if there is a better way.

tonyarnold commented 10 years ago

You could investigate alternatives like Realm and YapDatabase. I know Realm provides support for opening multiple realm database files without performance concerns.

ryanjm commented 10 years ago

@yorkwang - you might try posting your question / code to Stack Overflow (SO), since you'll get more advice on Core Data-specific improvements there. I'm not a Core Data pro, but depending on how big your array is:

array = [array valueForKeyPath:@"@distinctUnionOfObjects.self"];
NSMutableArray *resTemp = [self searchItemsByComics:array website:dbName];
[temp addObjectsFromArray:resTemp];

These steps might take a while by themselves. I'd recommend testing how fast the code runs without them, to make sure the fetch is actually the slow part.
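A quick way to separate the fetch time from the post-processing time (just a sketch, using the same variables as the snippet above):

```objc
#import <CoreFoundation/CoreFoundation.h>

// Time the fetch and the post-processing independently.
CFAbsoluteTime t0 = CFAbsoluteTimeGetCurrent();
NSArray *array = [localContext executeFetchRequest:fetchRequest error:nil];
NSLog(@"fetch: %f s", CFAbsoluteTimeGetCurrent() - t0);

t0 = CFAbsoluteTimeGetCurrent();
array = [array valueForKeyPath:@"@distinctUnionOfObjects.self"];
NSMutableArray *resTemp = [self searchItemsByComics:array website:dbName];
NSLog(@"post-processing: %f s", CFAbsoluteTimeGetCurrent() - t0);
```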

I'm not sure how @tonyarnold wants to handle it, but I'd recommend posting to SO, then posting your link to that question here, and then closing the issue. As you said, this probably isn't a problem with MagicalRecord. This will just make it easier for the core team to manage open tickets.

yorkwang commented 10 years ago

@tonyarnold @ryanjm Thanks for your suggestions! Usually the search result array's length is less than 20. Personally, I don't think this is a bug or a performance issue in MagicalRecord; my app's way of handling databases is just unusual: 6 or more persistent stores, each with 10K+ items, with fuzzy search across two properties. I ended up replacing Core Data with SQLite, and now the overall search time is about 0.4~0.5s on my iPad.

ghost commented 8 years ago

Sounds like this bug can get closed?