calvinmetcalf / crypto-pouch

plugin for encrypted pouchdb/couchdb databases
MIT License

Replicate to remote CouchDB #44

Closed. chatzipan closed this 3 years ago

chatzipan commented 7 years ago

Hi, I've been struggling for some days to replicate data to a remote CouchDB after encrypting my local PouchDB with crypto-pouch.

First I thought it was not working because I was using a filter, as mentioned in issue 37. I removed the filter and still couldn't make it work.

After testing, I could finally achieve replication after applying the crypto function to my remote database as well, using the same password.

If that's the way it works, could you please make it more clear in your documentation?

Then I noticed that, done this way, the data in my remote CouchDB were being inserted encrypted after the replication.

Is there a way to keep the data encrypted on the client, but send it unencrypted to the remote DB?

Many thanks in advance.

calvinmetcalf commented 7 years ago

um, it's supposed to send it unencrypted. How have you set up the replication?

chatzipan commented 7 years ago

I'm doing

var db = new PouchDB('todos');
var remoteCouch = new PouchDB('http://127.0.0.1:5984/todos');
db.crypto("foo");

var opts = {live: true};
db.replicate.to(remoteCouch, opts);
db.replicate.from(remoteCouch, opts);

This will not send changes to my couch.

However if I do

remoteCouch.crypto("foo"); right after db.crypto("foo"); replication will work, but it will send decrypted docs to my CouchDB.

chatzipan commented 7 years ago

I don't know if that helps, but I noticed that if I don't use crypto-pouch, that is, if I don't do

db.crypto("foo");

then, when changing a pouch doc, pouch does a

GET at http://127.0.0.1:5984/todos/_local/hashXXX

which replies with 200, and which leads to new

GET at http://127.0.0.1:5984/todos/_changes?timeout=25000&style=all_docs&feed=longpoll&heartbeat=10000&since=37&limit=99 and a subsequent PUT at /todos/_local/newHashXXX

which leads to my data being replicated in my CouchDB.

However, when using db.crypto("foo"); and then changing a pouch doc, the initial GET at /todos/_local/hashXXX receives a 304 Not Modified response, and therefore does not lead to a new PUT request.

Any idea why this is happening?

chatzipan commented 7 years ago

Ok, I think I narrowed it down a bit further.

Replication is successful when creating new documents on the client.

However, the problem appears on a fresh start with an empty PouchDB when the remote CouchDB already has data. That data will be replicated to the client, but any further editing on the client will not result in replication, unless the call to db.crypto("password"); takes place after the initial data replication from the CouchDB has finished.

Since PouchDB does not emit a complete event when using live replication, I guess it requires some manual work to figure out when the initial replication has finished, by reading last_seq on the change event.
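One other way to detect that point, as a rough sketch: PouchDB replication objects emit a paused event, with no error argument, when a live replication has caught up and is waiting for more changes. The helper name onFirstCaughtUp is made up for illustration, and whether calling db.crypto() at that moment avoids the problem is not confirmed anywhere in this thread.

```javascript
// Hypothetical helper: calls `callback` once, the first time the replication
// reports that it is paused without an error (caught up and idle).
function onFirstCaughtUp(replication, callback) {
  function handlePaused(err) {
    // A paused event that carries an error means the replication stalled
    // on a failure and is retrying, so keep waiting.
    if (err) return;
    replication.removeListener('paused', handlePaused);
    callback();
  }
  replication.on('paused', handlePaused);
  return replication;
}

// Possible usage with the names from the snippet earlier in the thread:
//   onFirstCaughtUp(db.replicate.from(remoteCouch, { live: true }), function () {
//     db.crypto('foo');
//   });
```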

calvinmetcalf commented 7 years ago

this is likely an issue with transform-pouch which we use under the hood

chatzipan commented 7 years ago

So, after some further investigation, I used the following code to encrypt my Pouch after the initial replication:

  var db = new PouchDB('todos');
  var remoteCouch = new PouchDB('http://127.0.0.1:5984/todos');

  db.replicate.from(remoteCouch).then(function (result) {

      // encrypt
      db.crypto("foo");

      // start live sync
      db.sync(remoteCouch,  {live: true});

  }).catch(function (err) {
      console.log(err);
  });

This works fine: all data from my CouchDB get into my Pouch on page reload, then get encrypted, and when I change them in my browser, they get back to my CouchDB unencrypted.

However, if I keep the app running and at the same time create a new doc directly in my CouchDB, it gets replicated to my Pouch, but any editing of it on the client will not replicate back to CouchDB.

chatzipan commented 7 years ago

It might have to do with this transform-pouch issue

jackkleeman commented 7 years ago

I have a solution for this: have two PouchDB instances, one with crypto-pouch loaded as a plugin, one without. Do all your document manipulation in your crypto instance, but set up replication in your blank instance. It's a little hacky, but it was a 2-3 line change for me that works out of the box. Data is sent to the server encrypted, and you could then set up the couch server to be able to decrypt given the same password.

I am using this for a zero-knowledge application where the server doesn't have access to any user data; it's just a dumb store.

fredguth commented 6 years ago

@jackkleeman your approach seems interesting. Could you elaborate? This is worth a blog post. 😬

jackkleeman commented 6 years ago

@fredguth what a blast from the past! Here is a gist showing the general idea: https://gist.github.com/jackkleeman/8070687fd8527fcef66676e3c0af0083

One of the PouchDB instances has no idea the db is encrypted; it just relays the documents as it sees them. The other instance is the one used for actual db operations. They both use the same db name, which actually works. Interestingly, they don't need to be told to sync to each other. At least, this worked a year ago!

If you want to post about it, please do!
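A rough sketch of the two-instance idea described above, written from the description rather than taken from the gist. The helper name openEncryptedWithRelay is hypothetical, the PouchDB constructor is passed in and assumed to already have crypto-pouch registered, and the retry option is an addition of this sketch.

```javascript
// Hypothetical helper illustrating the two-instance setup.
// `PouchDB` is assumed to already have crypto-pouch registered through
// PouchDB.plugin(...).
function openEncryptedWithRelay(PouchDB, name, password, remoteUrl) {
  // Instance used by the application for every read and write.
  const appDb = new PouchDB(name);
  // Some crypto-pouch versions return a promise from crypto(), others do
  // not; wrapping the result lets callers wait on `ready` in both cases.
  const ready = Promise.resolve(appDb.crypto(password));

  // Second instance on the same database name. crypto() is never called on
  // it, so it relays the stored documents exactly as they are on disk.
  const relayDb = new PouchDB(name);
  const sync = relayDb.sync(remoteUrl, { live: true, retry: true });

  return { appDb, relayDb, sync, ready };
}
```

With this layout the remote database only ever receives the encrypted form of each document, which matches the zero-knowledge use described above.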

pemdora commented 6 years ago

Hi, we had the same issue when encrypting a local PouchDB that was synced with a remote URL:

Original code

this.localdb = new PouchDB('database');
this.localdb.crypto('password');
var remoteURL = 'http://127.0.0.1:5984/database'
let options = { .... }
this.localdb.sync(remoteURL, options)

What we did to solve this issue was to create two DBs: one encrypted local PouchDB that replicates data from CouchDB (pull changes), and one unencrypted remote DB handle, with no data stored locally, that pushes changes.

Adapted code

this.localdb = new PouchDB('database');
this.localdb.crypto('password');
var remoteURL = 'http://127.0.0.1:5984/database';
let options = { .... }
this.localdb.replicate.from(remoteURL, options);
this.remoteDB = new PouchDB(remoteURL, options);

All get/query requests are done with the localdb and all put/post requests are done with remoteDB.

We also had to modify the crypto-pouch index.js file to add '_revisions' to the ignore variable (line 18: var ignore = ['_id', '_rev', '_deleted','_revisions']); otherwise localDB will not handle pulled changes.
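To illustrate why the ignore list matters, here is a simplified sketch of a deny-list split. It is not crypto-pouch's actual code: splitByIgnoreList is a made-up name, and the real plugin would encrypt the second part rather than return it.

```javascript
// Fields named in `ignore` stay on the stored document in plain form;
// everything else is treated as application data to be encrypted.
function splitByIgnoreList(doc, ignore) {
  const plain = {};
  const toEncrypt = {};
  for (const key of Object.keys(doc)) {
    if (ignore.includes(key)) plain[key] = doc[key];
    else toEncrypt[key] = doc[key];
  }
  return { plain, toEncrypt };
}

// A document as it might arrive during a pull, carrying revision history.
const incoming = {
  _id: 'doc1',
  _rev: '2-b',
  _revisions: { start: 2, ids: ['b', 'a'] },
  title: 'x'
};

// Without '_revisions' in the list, the history is swept into the payload.
const before = splitByIgnoreList(incoming, ['_id', '_rev', '_deleted']);
const after = splitByIgnoreList(incoming, ['_id', '_rev', '_deleted', '_revisions']);
console.log(Object.keys(before.toEncrypt)); // [ '_revisions', 'title' ]
console.log(Object.keys(after.toEncrypt));  // [ 'title' ]
```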

mikeymckay commented 5 years ago

This seems like an ancient (and critical!) error without much hope of ever getting fixed, but I just thought I would document my findings:

Replicate a document from a remote couch to a local encrypted pouch, make a change to that document, put it to the local pouch and then replicate it back to the remote couch. The replication result will show that 1 file has been updated:

{ok: true, start_time: "2019-01-09T18:38:43.096Z", docs_read: 1, docs_written: 1, doc_write_failures: 0, …}

But in reality, the remote couch never gets the updated doc.

If you do the same, but instead of using a remote couch as the starting point you just use an unencrypted local pouch, then the updates seem to replicate fine.

Bottom line: If you use crypto-pouch it will fail if you try and edit documents and replicate them to a remote couch.
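Because the replication result cannot be trusted in this case, one way to check is to compare the winning revision on both sides afterwards. A minimal sketch that relies only on db.get; the name sameWinningRev is hypothetical, and source and target stand for any two PouchDB instances.

```javascript
// Hypothetical check: resolves to true only when both databases report the
// same winning _rev for the given document id.
async function sameWinningRev(source, target, id) {
  const local = await source.get(id);
  try {
    const remote = await target.get(id);
    return local._rev === remote._rev;
  } catch (err) {
    // PouchDB reports a missing document with status 404.
    if (err.status === 404) return false;
    throw err;
  }
}

// Possible usage after a one-shot replication:
//   await localDb.replicate.to(remoteDb);
//   console.log(await sameWinningRev(localDb, remoteDb, 'doc1'));
```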

mikeymckay commented 5 years ago

I'm still fighting this one. Here's the simplest test I can come up with now. No need to even use a remote couchdb, can be demonstrated with just local pouchdbs

PouchDB = require "pouchdb"
crypto = require "crypto-pouch"
PouchDB.plugin(crypto)

localDb = null
localEncrypted = null

resetDatabases = =>
  localDb = new PouchDB("localDb")
  await localDb.destroy()
  localDb = new PouchDB("localDb")

  localEncrypted = new PouchDB("localEncrypted")
  await localEncrypted.destroy()
  localEncrypted = new PouchDB("localEncrypted")
  localEncrypted.crypto("password")

  console.log "DBs reset"

testNoCloudDB = =>
  await localDb.put
    _id: "doc1"
    foo: "Bar"

  doc1 = await localDb.get("doc1")
  doc1.a = "a1"
  await localDb.put(doc1)

  await localDb.replicate.to(localEncrypted)
  doc1 = await localEncrypted.get("doc1", revs_info: true)
  doc1.b1 = "b1" # THIS PROPERTY NEVER SHOWS UP
  await localEncrypted.put(doc1)
  console.log "After saving to a local encrypted db, do we have a b1 property?"
  console.log await localEncrypted.get("doc1", revs_info:true)

  console.log "Replicate local encrypted db to localDB"
  await localEncrypted.replicate.to(localDb)

  console.log "After replicating to a local unencrypted db, do we have a b1 property?"
  console.log await localDb.get("doc1")
  console.log "NOPE - THAT'S THE PROBLEM! If you remove the .crypto function call it all works as expected"

resetDatabases()
.then =>
  testNoCloudDB()
.catch (error) =>
  console.error error

rasgo-cc commented 4 years ago

One year later, and one afternoon later in my case: https://github.com/paulsutherland/polyonic-secure-pouch/issues/12

I believe the issue here is the same: https://github.com/calvinmetcalf/crypto-pouch/blob/master/index.js#L158

In summary: don't blacklist the keys you don't want to encrypt, instead whitelist the keys you want to encrypt.
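As a sketch of that suggestion, assuming the application knows which of its own fields are sensitive: only keys named in an allow-list are set aside for encryption, so metadata fields the plugin has never heard of pass through untouched. The name splitByAllowList and the example field names are hypothetical.

```javascript
// Only keys listed in `encryptKeys` are selected for encryption; every other
// field, including any metadata added by PouchDB, is left in plain form.
function splitByAllowList(doc, encryptKeys) {
  const plain = {};
  const toEncrypt = {};
  for (const key of Object.keys(doc)) {
    if (encryptKeys.includes(key)) toEncrypt[key] = doc[key];
    else plain[key] = doc[key];
  }
  return { plain, toEncrypt };
}

const parts = splitByAllowList(
  { _id: 'doc1', _rev: '1-a', _revisions: { start: 1, ids: ['a'] }, secret: 's' },
  ['secret']
);
console.log(Object.keys(parts.plain));     // [ '_id', '_rev', '_revisions' ]
console.log(Object.keys(parts.toEncrypt)); // [ 'secret' ]
```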

garbados commented 3 years ago

Looks like this is still an issue. You can reproduce it like this:

const assert = require('assert').strict
const PouchDB = require('pouchdb')
PouchDB.plugin(require('crypto-pouch'))

const NAME = '.test'
const PASSWORD = 'hello world'
const DOC = { _id: 'a', hello: 'world' }
const DDOC = {
  _id: '_design/test',
  views: {
    test: {
      map: function(doc) {
        emit(doc.hello)
      }.toString()
    }
  }
}

const db = new PouchDB(NAME)
const db2 = new PouchDB(NAME + '2')
db.crypto(PASSWORD).then(async () => {
  await db.put(DOC)
  await db.put(DDOC)
  await db.replicate.to(db2)
  const result = await db2.allDocs({ include_docs: true })
  assert.equal(result.rows.length, 2)
  console.log('ok')
}).catch((err) => {
  console.error(err)
}).then(async () => {
  await db.destroy()
  await db2.destroy()
})

You will see output like this:

[bad_request: Invalid rev format] {
  status: 400,
  error: true,
  result: {
    ok: false,
    start_time: '2021-08-07T00:12:05.257Z',
    docs_read: 2,
    docs_written: 0,
    doc_write_failures: 2,
    errors: [],
    status: 'aborting',
    end_time: '2021-08-07T00:12:05.269Z',
    last_seq: 0
  }
}

garbados commented 3 years ago

It looks like this happens because crypto-pouch encrypts payloads before a _rev is assigned, so that when it is decrypted for replication it does not have a _rev value, which breaks replication.

We will have to generate our own _rev values since we have to intercept the document prior to writing to disk.

jcoglan commented 3 years ago

@garbados Rather than generating our own rev values, I was wondering why we're discarding fields from the outgoing doc in outgoing(). Here we decrypt doc.payload and return the result to the application, and drop any other fields that doc has. This would include "meta" fields like _id, _rev, _revisions, _conflicts and so on. Would it make sense to preserve these in the outgoing() return value?

This also makes me wonder if there are other fields we should add to the IGNORE list -- is it the case that any "meta" fields beginning with _ should escape encryption, and only application-controlled fields should be encrypted?

garbados commented 3 years ago

I've put up a PR that patches the decrypted doc with the ignored fields attached to the encrypted doc, like _rev. This fixes the issue. Thanks for talking it through with me @jcoglan
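The shape of that fix can be sketched roughly as follows. This is an illustration, not the code from the PR: decrypt is a placeholder supplied by the caller, restoreIgnored is a made-up name, and the IGNORE list shown here is an assumed example.

```javascript
// Assumed example list of fields that are stored outside the payload.
const IGNORE = ['_id', '_rev', '_deleted', '_conflicts'];

// Decrypts the payload, then copies the plain metadata fields from the
// stored document back onto the result so replication can see them.
function restoreIgnored(storedDoc, decrypt) {
  const result = decrypt(storedDoc.payload);
  for (const key of IGNORE) {
    if (key in storedDoc) result[key] = storedDoc[key];
  }
  return result;
}

// Demo with JSON.parse standing in for real decryption.
const stored = { _id: 'a', _rev: '1-abc', payload: JSON.stringify({ hello: 'world' }) };
console.log(restoreIgnored(stored, JSON.parse));
// { hello: 'world', _id: 'a', _rev: '1-abc' }
```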