Question: Comparison between GRDB and Core Data concurrency

lastcc commented 6 years ago

I have used Core Data for some time and find it quite unreliable.

The new features they introduced, which were meant to address issues, actually cause new issues that can't be fixed by app developers. On iOS 11, NSQueryGeneration can cause data loss if used together with NSPersistentHistoryTracking.

And there are other bugs that cause data loss, such as the binaryData type with external storage enabled, which may be nil under iOS 12 beta.

My question is this: I want to try new things like GRDB, but I just read the docs, and they say something contrary to what NSQueryGeneration was aimed to solve.
"The database file is the single source of truth." ... " each application thread has its own version of a Realm database"

I thought NSQueryGeneration was introduced last year to defer the update of a context, because in a multi-threaded environment, reads can be inconsistent if another thread is writing new content. So the update (to the reader context) must be deferred until at least the next runloop.

How does GRDB solve the Query Generation Problem?

groue commented 6 years ago

Hello @lastcc,

It is difficult to answer your question, because the subject is pretty complex. SQLite concurrency is not trivial. Wrapped inside Core Data concurrency abstractions, it gets even less trivial.

On top of that, GRDB documentation is a work-in-progress. I hope I'll be able to clarify some imprecisions here.

So let's start from the beginning: GRDB has no managed object context (MOC). Instead, it has database access methods (the write and read methods of database queues and pools, which wrap your database statements).

MOCs and database access methods are the fundamental isolation units of Core Data and GRDB, respectively.

MOCs are "isolated", and this means that a MOC can't alter objects in another MOC. MOCs access a database that doesn't change unless they modify it themselves, or they listen to NSManagedObjectContextDidSave notifications (directly or indirectly via other high-level Core Data APIs, as far as I know).

GRDB database access methods are even more strictly isolated. Reads are guaranteed with a totally immutable view of the database. Writes are guaranteed with exclusive write access.

This is easy to say, but it is less easy to grasp all the consequences.

> Because under multi-threaded environment, reading can be inconsistent if other thread is writing new content.

If Core Data allows inconsistent reads, I'd call this a huge first-class bug. I'm not sure it does.

GRDB totally prevents inconsistent reads. This is one of the core GRDB design decisions. When you run several database statements one after the other, no other thread can interfere. It's impossible. Super tested. No bug found yet. I'm super confident, because I trust libDispatch and SQLite snapshot isolation.

This guarantee only holds inside a database access method.

On top of that, the "single source of truth" applies inside a database write method:

// The same methods exist on both database queues and database pools.
try dbQueue.read { db in
    // Guaranteed immutable and consistent view of the database here.
}

try dbQueue.write { db in
    // Exclusive access to the database.
    // No other thread can write.
}
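
As a minimal, self-contained sketch of the read guarantee (an illustration, not taken from the GRDB documentation): the snippet assumes a recent GRDB release that provides `db.execute(sql:)` and `Int.fetchOne(db, sql:)`, an in-memory database queue, and a hypothetical `player` table. Two statements run inside a single read, so they are expected to agree with each other.

```swift
import GRDB

// Assumption: an in-memory database with a hypothetical "player" table.
let dbQueue = try DatabaseQueue()
try dbQueue.write { db in
    try db.execute(sql: "CREATE TABLE player (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
    try db.execute(sql: "INSERT INTO player (name) VALUES ('Arthur'), ('Barbara')")
}

// Both statements run inside one read, so they see the same database state:
// the count always matches the number of fetched names.
let (count, names) = try dbQueue.read { db -> (Int, [String]) in
    let count = try Int.fetchOne(db, sql: "SELECT COUNT(*) FROM player") ?? 0
    let names = try String.fetchAll(db, sql: "SELECT name FROM player ORDER BY name")
    return (count, names)
}
print(count, names)
```

Splitting the two statements across two separate read calls would lose this guarantee, since a write could happen in between.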

However, all data you read from the database should be considered obsolete as soon as you extract it from a database access method. This is because other threads can write in the database between the end of the database access method and the moment you use the extracted data:

// Queue or pool, read or write: the same applies.
let value = try dbQueue.read { db in
    return ...
}
print("Value used to be \(value). It may now be anything else.")
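
To make the staleness concrete, here is a small sketch under the same assumptions as before (recent GRDB, in-memory queue, hypothetical `player` table). A second write on the same thread stands in for "another thread"; the value extracted earlier is not updated by it.

```swift
import GRDB

// Assumption: an in-memory database with a hypothetical "player" table.
let dbQueue = try DatabaseQueue()
try dbQueue.write { db in
    try db.execute(sql: "CREATE TABLE player (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
    try db.execute(sql: "INSERT INTO player (name) VALUES ('Arthur')")
}

// Extract a value from a database access method.
let staleCount = try dbQueue.read { db in
    try Int.fetchOne(db, sql: "SELECT COUNT(*) FROM player") ?? 0
}

// Stand-in for another thread writing after the read has returned.
try dbQueue.write { db in
    try db.execute(sql: "DELETE FROM player")
}

// The extracted value still says 1, while the database now says 0.
let freshCount = try dbQueue.read { db in
    try Int.fetchOne(db, sql: "SELECT COUNT(*) FROM player") ?? 0
}
print("Count used to be \(staleCount). It is now \(freshCount).")
```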

The second core GRDB design decision is that reading obsolete data is not a problem. It is not a problem, because you can't call something a problem when it can't be avoided.

You don't want to restrict database access to the main thread only, because that would be terribly inefficient. But this means that while you are drawing database values on screen, other threads are potentially modifying the database.

Dealing with obsolete data is part of the application's job.

But is it difficult? I'm not sure. This is the third core GRDB design decision.

It is difficult when you use Core Data, because you have to sweat your conflict strategies. And it's difficult to test.

With GRDB, you also have to deal with conflicts. But it's generally easier, because you can deal with conflicts locally.

var player: Player

@IBAction func submit() {
    player.name = nameField.text ?? ""

    // Let's save player by updating the full player row in the database.
    // However our player may have been deleted by some other thread.
    // In this case, insert the player again.
    do {
        try dbQueue.write { db in
            // exclusive write access
            do {
                try player.update(db)
            } catch PersistenceError.recordNotFound {
                // Player was not found
                try player.insert(db)
            }
        }
    } catch {
        // Any other database error: handling is up to the application.
    }
}

Sometimes it just doesn't need to be more complex.
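
The update-then-insert fallback can also be tried end to end. The following is a sketch under assumptions, not the code from this thread: `Player` is a hypothetical Codable record, the database is in-memory, a second write stands in for "some other thread", and the fallback is delegated to GRDB's `save` method, which updates the row when it exists and inserts it otherwise.

```swift
import GRDB

// Hypothetical record type; its table name defaults to "player".
struct Player: Codable, PersistableRecord {
    var id: Int64
    var name: String
}

let dbQueue = try DatabaseQueue()
try dbQueue.write { db in
    try db.execute(sql: "CREATE TABLE player (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
    try Player(id: 1, name: "Arthur").insert(db)
}

// Stand-in for another thread deleting the player while it is being edited.
try dbQueue.write { db in
    try db.execute(sql: "DELETE FROM player WHERE id = 1")
}

// The conflict is resolved locally, inside one write with exclusive access:
// save(db) updates the row if it exists, and inserts it otherwise.
let edited = Player(id: 1, name: "Barbara")
try dbQueue.write { db in
    try edited.save(db)
}

let names = try dbQueue.read { db in
    try String.fetchAll(db, sql: "SELECT name FROM player")
}
print(names)
```

Whether re-inserting a deleted row is the right resolution depends on the application; the point is only that the decision is made at the place where the write happens.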

> How GRDB solve the Query Generation Problem?

With three core design decisions:

1. No inconsistent reads, ever.
2. Obsolete data is normal, and applications have to deal with it.
3. Conflicts are easier to deal with locally.

For more information, see the Concurrency guide.

lastcc commented 6 years ago

Thank you very much for the reply!

Dealing with conflicts locally is very smart. Awesome work.

According to the objc.io CoreData book:

“If a context performs multiple subsequent read operations (fulfilling faults or performing fetch requests), another context might save changes to the same store at the same time. The results of the first context’s read operations could then potentially be inconsistent. Query generations, introduced with iOS 10/macOS 10.12, address this problem. They allow a context to use a store as if it was unchanged, regardless of other contexts changing the store.”

groue commented 6 years ago

> Thank you very much for the reply!
>
> Deal with conflicts locally is very smart. Awesome work.

Thanks @lastcc!

> According to the objc.io CoreData book:
>
> “If a context performs multiple subsequent read operations (fulfilling faults or performing fetch requests), another context might save changes to the same store at the same time. The results of the first context’s read operations could then potentially be inconsistent. Query generations, introduced with iOS 10/macOS 10.12, address this problem. They allow a context to use a store as if it was unchanged, regardless of other contexts changing the store.”

Wow :dizzy_face: I really didn't expect this: it's been years since I had my last bite of conflict errors and crashing faults. I feel lucky I didn't meet (or notice) inconsistent reads...

You'll have a better reception here :bowtie:. Happy GRDB!

groue / GRDB.swift

Question: Comparison between GRDB and Core Data concurrency #405