dexie / Dexie.js

A Minimalistic Wrapper for IndexedDB
https://dexie.org
Apache License 2.0

Dexie.Syncable for google drive/dropbox #545

Open cfilipov opened 7 years ago

cfilipov commented 7 years ago

I'm looking into implementing gdrive/dropbox sync for my app which uses Dexie. My initial thought was that Dexie.Syncable could be used to sync to a file, but after diving in a bit it's starting to look like perhaps that's not the right use of this protocol. The main problem is that multiple clients would need to sync to the same file, which it seems the sync protocol is not designed for. Any thoughts on how one might handle this?

nponiros commented 7 years ago

I'm using dexie syncable combined with a server (https://github.com/nponiros/sync_server) which actually saves the data in a file. Then I use git to distribute the file to other clients. I guess you could use dropbox instead of git to distribute the file. The big question is what would dropbox do if two servers write to the same file at the same time. Maybe the server must implement some sort of locking mechanism.

In the end dexie syncable needs a server implementing the ISyncProtocol. If you have concurrent read/writes then the server must deal with those.

If you give us more info about the data you are storing and why you want/need dropbox then maybe someone has a better idea on how to do it.

cfilipov commented 7 years ago

In my case I am avoiding having a server at all. The goal is to have the client sync directly to dropbox. This would be similar to how 1password syncs its database via dropbox. Multiple clients can share a single opvault.

More details: this is a single-user workout tracking app. In most cases there will only be one client making changes to the files on dropbox. However, like 1password, it's possible for the user to have the app open on multiple devices, so concurrent writes are still an issue (working out in my garage, logging sets on my phone but sometimes walking over to the laptop and doing so there).

I do have some affordances that make things easier: while it's possible that multiple clients can be open at the same time and both making changes, it's still expected that a single human is behind the changes, so there won't be a case where a record is simultaneously modified from multiple clients. I could make use of a simple timestamp to resolve conflicts and simply not handle the more complex cases.

It's starting to sound like I might want to implement this on top of Dexie.Observable rather than Dexie.Syncable.

nponiros commented 7 years ago

Yes I don't see a way to do what you want with dexie syncable. I also tried to use a timestamp in the past but that caused other issues and I found dexie syncable to be better for my case. I don't know the dropbox api, maybe that can help with syncing.

cfilipov commented 7 years ago

Indeed timestamps are problematic. Dropbox doesn't really help in any meaningful way, it's just shared file storage. What I ended up doing was using a vector clock to determine when the clients' databases are out of sync, version vectors on each record to detect conflicts, and some custom domain logic to resolve the conflicts (which should be rare based on the specific use cases).

I don't have any experience in distributed systems so I don't know if this is the most correct way of doing what I want, but it seems right. Unfortunately it's not a very general solution, but it seems to work so far.
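For readers unfamiliar with version vectors, here is a minimal sketch of the comparison step described above. It is not the code from this thread: the `VersionVector` type and the `compare`, `increment`, and `merge` helpers are hypothetical names chosen for illustration, and the sketch assumes each client has a stable string id and bumps its own counter on every local write.

```typescript
// Hypothetical type: maps a client id to the number of writes seen from that client.
type VersionVector = Record<string, number>;
type Ordering = "equal" | "before" | "after" | "concurrent";

// Compare two vectors entry by entry; a missing entry counts as 0.
function compare(a: VersionVector, b: VersionVector): Ordering {
  let aAhead = false;
  let bAhead = false;
  const clientIds = Object.keys(a).concat(Object.keys(b));
  for (const id of clientIds) {
    const x = a[id] ?? 0;
    const y = b[id] ?? 0;
    if (x > y) aAhead = true;
    if (x < y) bAhead = true;
  }
  if (aAhead && bAhead) return "concurrent"; // neither side has seen the other's write: a conflict
  if (aAhead) return "after";
  if (bAhead) return "before";
  return "equal";
}

// Record a local write by the given client.
function increment(v: VersionVector, clientId: string): VersionVector {
  return { ...v, [clientId]: (v[clientId] ?? 0) + 1 };
}

// After a conflict is resolved, the merged vector takes the max of each entry.
function merge(a: VersionVector, b: VersionVector): VersionVector {
  const out: VersionVector = { ...a };
  for (const id of Object.keys(b)) {
    out[id] = Math.max(out[id] ?? 0, b[id] ?? 0);
  }
  return out;
}
```

If a phone and a laptop both start from `{ phone: 1 }` and each writes once, the two vectors compare as `"concurrent"`, which is the case that has to go to the domain-specific resolution logic. Any other result means one side can simply take the other's copy.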

johannesjo commented 5 years ago

@cfilipov I have the exact same use case. I had a look at remoteStorage.js but an adapter directly interacting with IndexedDB seems like a much better approach for my use case. Theoretically the available data should be enough for some very basic conflict resolution if you split it up into small enough chunks and different files. Will have a look in the next couple of days. Maybe there is a way. Ideally I would like to achieve a drop-in solution for easy syncing to dropbox, google drive and owncloud.

cfilipov commented 5 years ago

@johannesjo I ended up getting this working by implementing a dexie plug-in that used a version vector per-table and per-record and hooks to update accordingly. I saved the data to files on dropbox using sharded files similar to 1password's opvault design (this way I don't load the whole file each time I sync). I only open sourced the version vector part, the rest I wasn't really pleased with.

It was complete enough that it would sync but there were some bugs. I'm not actively working on that code so best I can do is dump it all in a gist if I manage to rip out any app-specific parts (it's mostly isolated, but not completely), but you're on your own to figure out how it all works if it's useful to you.
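To illustrate the sharding idea mentioned above, here is a rough sketch of how records might be grouped into several small files so that a sync only touches the shards that changed. This is an assumption-laden illustration, not the layout used in the gist or in opvault: the `StoredRecord` shape, the `shardName` and `groupByShard` helpers, and the `shard_<char>.json` naming are all hypothetical, and the sketch assumes record ids are UUID-like strings.

```typescript
// Hypothetical record shape: only the id matters for sharding.
interface StoredRecord {
  id: string;
  [field: string]: unknown;
}

// Assumption: ids are UUID-like, so the first character spreads records
// across a small, fixed set of shard files.
function shardName(id: string): string {
  return `shard_${id.charAt(0).toLowerCase()}.json`;
}

// Group records by the shard file they would be written to.
function groupByShard(records: StoredRecord[]): Record<string, StoredRecord[]> {
  const shards: Record<string, StoredRecord[]> = {};
  for (const record of records) {
    const name = shardName(record.id);
    const bucket = shards[name] ?? [];
    bucket.push(record);
    shards[name] = bucket;
  }
  return shards;
}
```

Each entry in the result would be serialized and uploaded as its own file. Keeping a version vector per shard, in addition to one per record, is one way to skip downloading shards that have not changed.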

johannesjo commented 5 years ago

@cfilipov thanks for getting back to me! That's pretty helpful already.

I also would love to have a look at the code! :)

cfilipov commented 5 years ago

Here it is: https://gist.github.com/cfilipov/48338f23ce92047807585f750af04bc0

Many caveats/warnings:

Honestly, I would just use firebase. The latest firebase js client does offline sync but it's obviously not the same as using dropbox and having no back-end to pay for.

johannesjo commented 5 years ago

Thank you again! This is tremendously helpful and the code is also a lot cleaner than I expected :)

I agree with you that it might not be worth the trouble. On the other hand I very much like the idea that you have your own cloud drive that you can use for all the app data instead of being dependent on the apps server, which might shut down or which they might sell to advertising companies, etc. From a developer point of view it would be nice to have an easy way to offer cloud sync without having to host your own server.

remoteStorage.js is an interesting approach to tackle this problem and so is solid server, but the former looks a little outdated to me when it comes to the general architecture while the latter does not seem stable at all and I get the feeling that both of them will never get there.

Theoretically I would assume that there should be a simpler way to do this if you just want some sync for the data of a single user. You could save the data on your cloud drive in a simple tree structure mirroring the database structure, and use local and remote timestamps of the last update time for the most simple kind of conflict resolution. Add some caching for undone requests and that's it. But as always, it's probably much more complicated the closer you get.
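The simplest form of the timestamp idea above is a last-write-wins merge, sketched below. The `Timestamped` interface and the `mergeLastWriteWins` function are hypothetical names, and the sketch assumes every record carries an `updatedAt` value in milliseconds. It does not handle deletions, and it trusts the device clocks, which can disagree.

```typescript
// Hypothetical shape: every record has an id and a last-update time in ms.
interface Timestamped {
  id: string;
  updatedAt: number;
}

// Keep, for each id, whichever copy was updated most recently.
// On a tie the local copy is kept.
function mergeLastWriteWins<T extends Timestamped>(local: T[], remote: T[]): T[] {
  const byId: Record<string, T> = {};
  for (const record of local) {
    byId[record.id] = record;
  }
  for (const record of remote) {
    const existing = byId[record.id];
    if (existing === undefined || record.updatedAt > existing.updatedAt) {
      byId[record.id] = record;
    }
  }
  return Object.keys(byId).map((id) => byId[id]);
}
```

This is enough when a single user edits different records on different devices, but two edits to the same record on devices with skewed clocks can silently lose the newer one.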

Don't know. I've a little bit of time off. I definitely will play around a little bit more. If I happen to make some progress, I'll post it here.

terox commented 1 year ago

Hello @johannesjo, I'm in the same situation as you and @cfilipov. I need to store some sensitive data in a private cloud and keep it in sync with other devices.

Did you make any progress in that time? What solution did you end up implementing?

Thanks in advance!

johannesjo commented 1 year ago

@terox I ended up using my own solution: https://github.com/johannesjo/super-productivity/tree/master/src/app/core/persistence

terox commented 1 year ago

@johannesjo thank you so much! Very instructive