Be offline first - Githubissues

almet commented 9 years ago

I was watching a video about being offline first, which covers how Hoodie works and the rationale behind its design, and I was wondering how we want to handle syncing for daybed.

Daybed, for the most part, validates data against a schema you define beforehand. One thing we want to add, though, is a way to sync between clients. And one thing I would like is to be offline first. In other words, I would like to be able to define some data locally, validate and store it, and then sync it really easily.

One key point of the talk Alex gave (the video I linked) is that syncing is hard, and that we probably don't want to do it ourselves. To avoid that with Hoodie, they use CouchDB and its replication mechanism.

I'm not sure how that works specifically in their case, or whether it would be possible to use the Hoodie database or PouchDB, for instance, but I believe we should reuse something that already exists rather than start a new syncing engine.

One thing that came up is that we don't want to trust clients. But that seems really hard to accomplish in our case.

How do we do syncing if we cannot trust the clients? Should we re-validate all the fields that the client sent us?

I would like to use this issue to discuss how we could do offline first for daybed. It's all about the API we want for daybed.js (do we want to reuse something that already exists, or do we want to roll our own?) and how we do the syncing part.

Let's discuss!

Natim commented 9 years ago

I agree with you that being offline first should be the way to go. Also, the way Daybed works is that validation takes place on save, on the server. We could add a mechanism to let the client app validate beforehand, but we will have to handle this validation on the server side anyway (because we don't trust the client).

This comes for free with PouchDB and CouchDB but we don't have direct access to CouchDB from Daybed.

Also, we may want to do validation on sync, so we will need to proxy storage calls.

Can we make daybed syncable with CouchDB/PouchDB and add validation models on top of it?

Natim commented 9 years ago

I've read http://guide.couchdb.org/editions/1/fr/consistency.html#study and it seems to me that we "just" need to add a _rev version to each document (model/record) and that this _rev should be checked before making the update. We will then receive a new _rev for the new document.

In that case we will never overwrite anything, because the storage backend will fail if the current _rev is different from the one provided by the update.
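
To make the _rev idea concrete, here is a minimal in-memory sketch of such a check. It is not Daybed's or CouchDB's code: the RevisionedStore class, the RevisionConflict exception, and the way revisions are generated are all assumptions made for illustration, and the "N-token" revision format only loosely mimics the shape of CouchDB's.

```python
import uuid


class RevisionConflict(Exception):
    """Raised when the caller's _rev does not match the stored one."""


class RevisionedStore:
    """Hypothetical in-memory store that refuses stale updates."""

    def __init__(self):
        self._docs = {}  # doc_id -> (rev, data)

    def get(self, doc_id):
        rev, data = self._docs[doc_id]
        return {"_id": doc_id, "_rev": rev, **data}

    def put(self, doc_id, data, rev=None):
        """Create or update a document and return its new _rev.

        `rev` must be None for a new document, or equal to the
        currently stored _rev for an update.
        """
        current = self._docs.get(doc_id)
        current_rev = current[0] if current else None
        if rev != current_rev:
            # The client worked from an outdated copy: refuse the write.
            raise RevisionConflict(
                f"expected _rev {current_rev!r}, got {rev!r}"
            )
        # Bump the generation counter and attach a fresh random token.
        generation = int(current_rev.split("-")[0]) + 1 if current_rev else 1
        new_rev = f"{generation}-{uuid.uuid4().hex}"
        self._docs[doc_id] = (new_rev, dict(data))
        return new_rev
```

With this, a client that read the record before someone else updated it gets a RevisionConflict instead of silently overwriting the newer version, and has to fetch the latest _rev before retrying.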

almet commented 9 years ago

Some more thoughts about that:

1) Doing validation on the daybed side using the CouchDB syncing mechanism doesn't seem to be an option, because the unit of CouchDB sync is the whole database. We cannot sync a whole database this way because databases are not meant to be shared with all users.

2) We could eventually trust clients, but only if they have the right permissions. For instance, it's okay to let someone clutter the database if they have the right to do so.

3) If we implement syncing, we should have a look at syncing mechanisms that already exist in the Python / JS world rather than implementing our own.

Natim commented 9 years ago

I am OK with implementing our own if it stays as simple as the CouchDB one.

almet commented 9 years ago

One thing we discussed with @natim while cycling (yeah, we do that) is the ability to roll our own syncing engine, one that would not be tied to Daybed as it is.

This means the scope of the project would remain small: just deal with validation of data and permissions, and we would have a syncing engine on top of it that would allow us to sync data.

The prerequisite for this to be a valid approach is that one doesn't need to know the data structure in advance to be able to do syncing. I believe that's true because of how CouchDB does it (there is no schema, so it cannot be known!)

Doing this allows us to look at already existing syncing mechanisms, and would mean we run a post-syncing task to filter out the data we may not want (because it doesn't validate, for instance).
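
As a rough sketch of what such a post-syncing filter could look like: the schema format here (a plain dict mapping field names to Python types) and the validate and filter_synced helpers are hypothetical simplifications for illustration, not Daybed's actual model definitions or API.

```python
def validate(record, schema):
    """Return a list of error messages; an empty list means valid."""
    errors = []
    for name, expected_type in schema.items():
        if name not in record:
            errors.append(f"missing field: {name}")
        elif not isinstance(record[name], expected_type):
            errors.append(f"wrong type for field: {name}")
    return errors


def filter_synced(records, schema):
    """Split freshly synced records into accepted and rejected ones.

    The sync engine itself never looks at the schema; validation only
    happens here, after the data has arrived.
    """
    accepted, rejected = [], []
    for record in records:
        errors = validate(record, schema)
        if errors:
            rejected.append((record, errors))
        else:
            accepted.append(record)
    return accepted, rejected


# Usage: two records came in through sync, one of them is invalid.
schema = {"title": str, "done": bool}
synced = [
    {"title": "Buy milk", "done": False},
    {"title": 42},
]
accepted, rejected = filter_synced(synced, schema)
```

Keeping the rejected records together with their errors, rather than dropping them, would let the server report back to the client why part of its data was not accepted.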

Details to come ;)

almet commented 9 years ago

LoopBack implements its own replication.

The documentation isn't really clear as to how this is done in the backend, but we could benefit from some of the investigation they did.

leplatrem commented 9 years ago

If conflict resolution is too heavy a topic to be managed by the server, we could at least:

- compute a hash on records
- require If-Match with ETag on PATCH
- raise 412 if the ETag does not match the one in the database

See http://python-eve.org/features.html#data-integrity-and-concurrency-control
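
The three steps above could be sketched as follows, independently of any web framework. The compute_etag and patch_record helpers, the dict standing in for the database, and the choice of 428 for a missing If-Match header are assumptions made for illustration; this is not how Eve or Daybed implement it.

```python
import hashlib
import json


def compute_etag(record):
    """Hash a canonical JSON dump of the record."""
    payload = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha1(payload.encode("utf-8")).hexdigest()


def patch_record(db, record_id, changes, headers):
    """Apply a PATCH only if the client's ETag is up to date.

    Returns a (status, body, response_headers) tuple.
    """
    if record_id not in db:
        return 404, {"error": "not found"}, {}

    if_match = headers.get("If-Match")
    if if_match is None:
        # Assumption: a PATCH without If-Match is refused outright.
        return 428, {"error": "If-Match header required"}, {}

    current_etag = compute_etag(db[record_id])
    # Tolerate the quoted form that HTTP uses for entity tags.
    if if_match.strip('"') != current_etag:
        return 412, {"error": "precondition failed"}, {"ETag": current_etag}

    db[record_id] = {**db[record_id], **changes}
    return 200, db[record_id], {"ETag": compute_etag(db[record_id])}
```

A client that edited a stale copy while offline gets a 412 along with the current ETag, and can then fetch the latest version and decide how to merge, which leaves conflict resolution on the client side.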

spiral-project / daybed

Be offline first #184