drewmccormack / ensembles

A synchronization framework for Core Data.
MIT License

Too slow for large data set #129

Closed: ylin closed this issue 10 years ago

ylin commented 10 years ago

First, some background. Mine is an accounting app, and since users can keep accounts with transactions and categories that span multiple years, a typical database could have thousands of records.

For the database I am testing right now, there is a transaction table (ztransaction) with 7,189 rows, and two companion tables (holding category information) that each contain roughly 7k rows. So the database has roughly 21k rows in total, plus rows from a few other smaller tables.

The SQLite database on disk is 2 MB in size.

I leeched the database in the Mac version of my app, running on a MacBook Pro, and leechPersistentStoreWithCompletion took 12 minutes to return with a success state. During that time, CPU usage was at around 95%:

[Screenshot, 2014-04-05 5:05:09 pm]

One thing to note is that I am running this on a relatively slow broadband connection (3 Mbps down, 125 kbps up), but it is the kind of connection a user may have.

My worry is that if it took 12 minutes on my MacBook Pro, it will take far too long on something like an iPhone 4S.

But this is actually the smaller problem. After leechPersistentStoreWithCompletion returned, iCloud transfer activity began. The Xcode monitor looks like this:

[Screenshot, 2014-04-05 5:56:32 pm]

This transfer activity has persisted for over an hour and is still going on. I'm not sure when, or whether, it will finish.

The current state of the Ensembles ubiquity container seems to indicate that much of the data hasn't been transferred yet:

[Screenshot, 2014-04-05 6:13:58 pm]

Moreover, I am a bit concerned that the baseline file is 48 MB in size, much bigger than the sync data generated by iCloud Core Data.

I am going to let this run for a while longer.

Is there something I'm doing wrong?

ylin commented 10 years ago

Ok, it seems the transfer has stopped:

[Screenshot, 2014-04-05 9:12:50 pm]

I'm not sure exactly when it stopped, but it could have been after around 2 hours.

This is the final content of the ensembles ubiquitous folder:

[Screenshot, 2014-04-05 9:12:59 pm]

A surprisingly small number of files; I'm assuming it's all in the baseline. A 48 MB baseline file is definitely too big, though. That could be why the transfer took so long on my 100 kbps upstream ADSL connection.
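As a rough sanity check on that guess, here is a back-of-the-envelope estimate of how long a 48 MB upload takes at the upstream speeds mentioned in this thread. It is only a sketch: the function name `upload_seconds` is made up for illustration, megabytes are assumed to be decimal (10^6 bytes), and protocol overhead, retries, and any other files iCloud uploads are ignored, so real transfers would take longer.

```python
def upload_seconds(size_mb: float, upstream_kbps: float) -> float:
    """Ideal transfer time in seconds, ignoring overhead and retries."""
    bits = size_mb * 1_000_000 * 8          # decimal megabytes -> bits
    return bits / (upstream_kbps * 1_000)   # kilobits/s -> bits/s


# Both upstream figures quoted in the thread (100 kbps and 125 kbps).
for kbps in (100, 125):
    minutes = upload_seconds(48, kbps) / 60
    print(f"{kbps} kbps: {minutes:.0f} min")
# prints:
# 100 kbps: 64 min
# 125 kbps: 51 min
```

So even in the ideal case the baseline alone needs roughly an hour of upload time on this connection, which is at least the same order of magnitude as the roughly 2 hours observed.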

drewmccormack commented 10 years ago

It sounds like your problem is too big for ensembles at the moment, especially if you want to use iOS.

I suspect you may have problems with nearly any sync solution though. You may have to just accept that the first sync will take a while.

I am about to start profiling and optimizing the framework. I have been concentrating on just making sure that it works solidly.

Drew

ylin commented 10 years ago

Ok, thanks for the clarification. If my problem is too big due to the number of records, what size of database (in terms of number of records or some other metric) is Ensembles designed to handle?

iCloud Core Data actually handles the first sync of my database pretty well, even on iOS, so I may just have to fall back to it despite its existing problems (slow updates across multiple devices, lack of support for large to-many relationships, and occasional inconsistency in sync). Hopefully this year's WWDC will introduce a version that is suitable for general release.

(In reply to Drew McCormack's comment of Apr 6, 2014, 2:32 AM.)

drewmccormack commented 10 years ago

Core Data + iCloud doesn't actually import your data on first sync, so I would think that would be fast.

I don't have absolute numbers on how big is manageable. I will be profiling and optimizing that in the coming weeks.

Ensembles is very young. I've aimed to make it robust initially, rather than efficient. It will improve. Core Data + iCloud is around 3 years old now.

ylin commented 10 years ago

Ok, it’d be nice if you published the profiling results somewhere, so users would know the range of problems that Ensembles aims to solve. I will keep an eye on Ensembles for future development.

(In reply to Drew McCormack's comment of Apr 6, 2014, 2:45 AM.)