simolus3 / drift

Drift is an easy to use, reactive, typesafe persistence library for Dart & Flutter.
https://drift.simonbinder.eu/
MIT License

Sync between devices or cloud. Approaches? #136

Closed joeblew99 closed 4 years ago

joeblew99 commented 5 years ago

A client-side database raises the prospect of a user having many devices (web, desktop, mobile), and with it the question of how best to approach data synchronisation.

Event Log: One way is to maintain an event log of local mutations on each client. Each entry contains the mutation as JSON. The entries are sent to a cloud server, stored against the device ID, and then deleted on the client. On the cloud server the events are NEVER deleted, because a newly added device needs to fetch the events and replay them. Your other devices then get an SSE push of the events and update their local DB.
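The event log idea above could be sketched roughly as follows. This is a minimal in-memory illustration, not an existing drift or Mercure API: `LogEntry`, `ClientEventLog`, `ServerEventStore` and the per-device `sequence` counter are all hypothetical names, and the network transport (SSE or otherwise) is left out entirely.

```dart
import 'dart:convert';

/// One local mutation, recorded as JSON. All names here are hypothetical.
class LogEntry {
  final String deviceId;
  final int sequence; // per-device counter, so entries can be ordered
  final Map<String, Object?> mutation;

  LogEntry(this.deviceId, this.sequence, this.mutation);

  String toJson() => jsonEncode({
        'deviceId': deviceId,
        'sequence': sequence,
        'mutation': mutation,
      });
}

/// Client-side log: append on every local write, drain when online.
class ClientEventLog {
  final String deviceId;
  final List<LogEntry> _pending = [];
  int _nextSequence = 0;

  ClientEventLog(this.deviceId);

  int get pendingCount => _pending.length;

  void record(Map<String, Object?> mutation) {
    _pending.add(LogEntry(deviceId, _nextSequence++, mutation));
  }

  /// Entries to upload, oldest first. Nothing is deleted yet.
  List<String> pendingJson() => [for (final e in _pending) e.toJson()];

  /// Called once the server confirmed receipt of the first [count] entries.
  void acknowledge(int count) => _pending.removeRange(0, count);
}

/// Server-side store that never deletes, so a new device can replay everything.
class ServerEventStore {
  final Map<String, List<String>> _byDevice = {};

  void append(List<String> entries) {
    for (final raw in entries) {
      final decoded = jsonDecode(raw) as Map<String, dynamic>;
      final deviceId = decoded['deviceId'] as String;
      _byDevice.putIfAbsent(deviceId, () => []).add(raw);
    }
  }

  /// Everything a freshly installed device has to replay.
  List<String> replayAll() => _byDevice.values.expand((e) => e).toList();
}
```

Deleting local entries only after an explicit acknowledgement means a failed upload leaves the log intact, so it can simply be retried.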

Etcd could be a very nice and simple option for the cloud server. There is an SSE system for server and Flutter that uses this. https://github.com/wallforfry/dart_mercure https://github.com/dunglas/mercure

This is just one way to skin this cat, of course. Very curious to see what others think.


It's important to note that this is NOT trying to do transactions between different users. That's a whole different area.

joeblew99 commented 5 years ago

What if a user is on an old version and makes mutations offline?

When they go online, the events (stamped with a version number) go back to the sync server. But their other devices are a version ahead, so when one of them gets the event and tries to update its local DB, it fails because the schema is different.

Maybe Protobufs will help? Protocol buffers might help because the fields are numbered, so if a client gets protobuf data containing a field it has no protobuf type for, it just ignores that field and can still update the database cleanly. However, when it later upgrades to the version that does have that protobuf type and table, it will hold that record without the field from before.
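To make the trade-off concrete, here is a small sketch that imitates the "ignore what you don't know" behaviour with plain maps. It does not use the protobuf package; `knownPart`, `unknownPart` and the example column names are hypothetical.

```dart
/// Fields of [incoming] that this client version has columns for.
Map<String, Object?> knownPart(
        Map<String, Object?> incoming, Set<String> knownColumns) =>
    {
      for (final e in incoming.entries)
        if (knownColumns.contains(e.key)) e.key: e.value,
    };

/// Fields of [incoming] that this client version cannot store yet.
/// If these are thrown away, the record stays incomplete after an upgrade.
Map<String, Object?> unknownPart(
        Map<String, Object?> incoming, Set<String> knownColumns) =>
    {
      for (final e in incoming.entries)
        if (!knownColumns.contains(e.key)) e.key: e.value,
    };
```

An old client with the columns `id` and `title` that receives a row containing an extra `dueDate` field can write `knownPart` right away. Whether it should also keep `unknownPart` somewhere to backfill after an upgrade is a design decision this sketch leaves open.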

Time: I am discounting the problem of time differences between devices because a user is assumed to only be using one device at a time, and so the system should never get a clash.

Changing the same record on many devices: In this case we have no choice but to use the last-in-wins pattern.
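A last-in-wins merge for a single record might look like the sketch below. The `Versioned` class and its `updatedAt` timestamp are assumptions made for illustration. The comparison relies on the device clocks being reasonably in agreement, which is exactly the problem discounted above.

```dart
/// A row plus the time of its last modification (hypothetical shape).
class Versioned {
  final Map<String, Object?> data;
  final DateTime updatedAt;

  Versioned(this.data, this.updatedAt);
}

/// Keeps whichever copy was modified most recently.
/// On equal timestamps the local copy is kept, so the result is deterministic.
Versioned lastWriteWins(Versioned local, Versioned remote) =>
    remote.updatedAt.isAfter(local.updatedAt) ? remote : local;
```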

simolus3 commented 5 years ago

Some thoughts I have on this:

On the cloud Server the Events are NEVER deleted. This is because when you add a new device you need to get the Events and replay them to it.

I've seen event logs being used for cross-device synchronization. But storing the entire history on the server and replaying it on new devices can get very expensive for long-time or very active users, who can easily have hundreds of thousands of events. Also, some events of the past might not be relevant anymore: Say a user does something like

  1. create a file
  2. make a bunch (or a lot) of edits in that file
  3. delete that file

If another client missed all three steps (which could contain hundreds of events), that's not a problem at all because all these events, when combined, have no effect on any state. I think the most common way event logs are used is that the server does not store any events, ever (or at least it doesn't expose them). Instead, it only stores the current state. When a client connects, it sends all the local edits to the server, which takes care of updating the state. The client can then grab the fresh state snapshot from the server. This can solve some problems on how clients are supposed to deal with outdated event logs / protocol changes.
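The create/edit/delete example can be written down as a fold over events. The sketch below uses a hypothetical `FileEvent` type and a plain path-to-content map as the "state". It only illustrates why a client that receives the final snapshot does not need the intermediate events.

```dart
/// Hypothetical event kinds for the file example.
enum Kind { create, edit, delete }

class FileEvent {
  final Kind kind;
  final String path;
  final String? content;

  const FileEvent(this.kind, this.path, [this.content]);
}

/// Applies one event to a path -> content map and returns the new state.
Map<String, String> apply(Map<String, String> state, FileEvent event) {
  final next = Map<String, String>.of(state);
  switch (event.kind) {
    case Kind.create:
      next[event.path] = event.content ?? '';
      break;
    case Kind.edit:
      // Edits to a file that does not exist are ignored.
      if (next.containsKey(event.path)) next[event.path] = event.content ?? '';
      break;
    case Kind.delete:
      next.remove(event.path);
      break;
  }
  return next;
}

/// Folds a batch of events into the state the server would keep.
Map<String, String> snapshot(
        Map<String, String> state, List<FileEvent> events) =>
    events.fold(state, apply);
```

Folding a create, any number of edits, and a delete for the same path returns a map with the same entries as before, which is the sense in which those events have no combined effect.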

Of course, different approaches will work to a different degree based on the actual use case. In a chat application, it might be desirable to always have all messages sync across all devices, even very old ones, so it makes more sense to actually store all event logs. In a notes app, being aware of older snapshots might be less important, which can justify not storing the entire event log.

joeblew99 commented 5 years ago

Thanks for the discussion.

During sync, what do you imagine a snapshot to be? You said that the client sends all its events; I presume these are transacted against mysql, and mysql then gives the client a snapshot.

Really appreciate this discussion as I am playing with the FFI stuff and the mysql FFI stuff. It's quite a compelling proposition in terms of reducing complexity in the stack

simolus3 commented 5 years ago

During sync what do you imagine is a snapshot

I would imagine some redux-like model where the client sends all their actions (or events) to the server. The server can then fold them into state (e.g. f(oldState, event) = newState). The server could probably get away with only keeping the current state (or snapshot) in most cases.

For example, let's imagine a scenario where we're managing users, and we'd like to store their name. In that case, the current name would be a snapshot (which I assume the server could make available via some GET /user/name call). So handling name changes could just look like

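// server side logic:
// serverDb and Event are placeholders for whatever the server actually uses.
Future<void> handleClientSync(String clientId, List<Event> eventsFromClient) async {
  final currentName = await serverDb.loadName(clientId);
  // Fold the events into the state: the most recent name change wins,
  // and an empty event list leaves the name unchanged.
  final nextName = eventsFromClient.fold(currentName, (name, event) => event.name);
  // Persist the folded result (not the old value); this is the snapshot.
  await serverDb.changeName(clientId, nextName);
}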

So the server doesn't store the events in its database, only the current name, which we can call a snapshot.

joeblew99 commented 5 years ago

OK, I think I get what you mean now.

This looks like it would work for a sync when the user has their data on the server. If two users change the same data, I see an issue: user 1 has synced, but user 2 still has old data. If they are both changing the same data, is there a clash?


simolus3 commented 5 years ago

Yeah, it would probably be harder to implement a proper conflict resolution algorithm if the server only has the current state. But solving those conflicts is always tough and highly dependent on the actual problem domain. Even Google Docs sometimes asks you to just pick a version when it can't merge changes together, and it's very good at sync in general. So while conflict resolution in offline sync is a very challenging problem to solve, in most cases it's "good enough" to just naively apply those changes that the server receives first. If there's a conflict, it might be acceptable to just reject the data and ask the user to re-do their changes.
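One way to express "apply what arrives first, reject the rest" is a revision check on the server. The sketch below is only an illustration: the `SnapshotServer` class and the revision counter are assumptions, not something drift provides or something prescribed above.

```dart
/// Outcome of a sync attempt.
enum SyncResult { accepted, rejected }

/// Server that keeps only the current value plus a revision counter.
class SnapshotServer {
  String value;
  int revision = 0;

  SnapshotServer(this.value);

  /// [baseRevision] is the revision the client last saw. If the server has
  /// moved on since then, another client got there first and the write is
  /// rejected, so the user can redo the change on top of fresh data.
  SyncResult submit(String newValue, int baseRevision) {
    if (baseRevision != revision) return SyncResult.rejected;
    value = newValue;
    revision++;
    return SyncResult.accepted;
  }
}
```

If two clients both start from revision 0, the first `submit` is accepted and bumps the revision, and the second is rejected until that client has fetched the new snapshot.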

listepo commented 3 years ago

@simolus3 any plans to add sync like https://nozbe.github.io/WatermelonDB/Advanced/Sync.html ?

simolus3 commented 3 years ago

No, it would be much more complex since sqlite3 doesn't really have a synchronization protocol.

davidmartos96 commented 3 years ago

@simolus3

No, it would be much more complex since sqlite3 doesn't really have a synchronization protocol.

What do you mean by that? Doesn't WatermelonDB use sqlite under the hood? Looking at the details of their sync implementation (https://nozbe.github.io/WatermelonDB/Implementation/SyncImpl.html#sync-procedure) I'd say all steps would be doable with either moor or sqflite primitives. The only part I'm not sure about is the write-only locks, but maybe that could be done with WAL mode or "BEGIN IMMEDIATE" transactions.

I don't know how good the general sync solution of WatermelonDB is, as I don't have prior experience with it, but it looks popular. Have you used it before @listepo ?

CodingSoot commented 2 years ago

Looking at the details of their sync implementation (https://nozbe.github.io/WatermelonDB/Implementation/SyncImpl.html#sync-procedure) I'd say all steps would be doable with either moor or sqflite primitives.

I think drift is one of the most suitable Dart packages to include a sync mechanism like the one WatermelonDB provides. It doesn't have to suit all use cases, only be good enough for the most common ones. WatermelonDB did an amazing job at that.