apache / couchdb

Seamless multi-master syncing database with an intuitive HTTP/JSON API, designed for reliability
https://couchdb.apache.org/
Apache License 2.0
6.29k stars 1.04k forks source link

BUG: Recreating deleted document can break replication when VDU function is active #1418

Open ondra-novak opened 6 years ago

ondra-novak commented 6 years ago

It is possible to recreate (i mean: to update a deleted document with new content which hasn't the flag _deleted set to true) a document without need to specify revision field (_rev). However this behaviour has been changed somehow in CouchDB 2+.

In CouchDB 1.X, one can recreate document with the correct revision. It is also possible to recreate document without the revision and in this case, the CouchDB 1.X uses the document's recent revision. The rest of the operation is similar to updating existing document, including how the VDU (validate_doc_update) is called.

Consider following declaration (formatted to achieve better reading)

"validate_doc_update": function(newDoc, oldDoc, userCtx) {

        log(["validation oldDoc, newDoc, userCtx", oldDoc, newDoc, userCtx]);
        function unchanged(field) {         
                     if (oldDoc && (field in oldDoc) && toJSON(oldDoc[field]) != toJSON(newDoc[field]))
                     throw({forbidden : "Field can't be changed: " + field});
        };
        unchanged("owner");

}" 

In normal operation, oldDoc contains old document, newDoc contains new document. If the document is being created, the function is called with oldDoc == null. For all other updates the function is called with oldDoc equal to the body of the document being updated.

In CouchDB 1+ this is also applied for deleted documents, when the document is recreated, regardless on whether it is recreated with _rev or without the oldDoc contains the body of the deleted document.

In CouchDB 2+ this is no longer true. When the document is being recreated, oldDoc is always null. This behaviour can break replication in situation, when the replication is not continuous and when the document is deleted and recreated between each replication round.

Expected Behavior

If the document is recreated, the oldDoc of VDU should contain the body of previously deleted document

Current Behavior

If the document is recreated, the oldDoc of VDU is set to null

Possible Solution

The recreating of the document is mistakenly handled as a creation of the new document. It should be handled as an update.

Steps to Reproduce (for bugs)

  1. Prepare a design document with VDU defined above
  2. Create two databases (db1, db2) and put that design document in the both of them
  3. In the first database (db1), create document with a field "owner". For example
    {
    "_id": "bug",
    "owner": "jack"
    }
  4. ensure, that owner cannot be changed.
  5. also ensure, that document cannot be deleted using DELETE (VDU blocks correctly)
  6. replicate db1 to db2
  7. in the first database (db1), delete the document "bug" by adding "_deleted":true to the document
  8. recreate document "bug" with other owner (this will work, because VDU is called with null)
    {
    "_id": "bug",
    "owner": "marie"
    }
  9. replicate db1 to db2
  10. Result: document is not replicated. There is also error message sitting in the LOG file
    Replicator: couldn't write document `bug`, revision `4-e0cc418137015d862ef83eb6d714b21c`, to target database `http://localhost:5984/db2/`. Error: `forbidden`, reason: `Field can't be changed: owner`.

    This bug is dangerous in a one way. There is no way to detect, that the document is being recreated so the creator can create a document which suddenly cannot be replicated. There is also no way how to prevent such situation, because the VDU cannot distinguish between creation and recreation

creation:

Log :: ["validation oldDoc, newDoc, userCtx",null,{"_id":"loc.test","owner":"jack"},{"db":"db1","name":"admin","roles":["_admin"]}]

recreation:

Log :: ["validation oldDoc, newDoc, userCtx",null,{"_id":"loc.test","owner":"marie"},{"db":"db1","name":"admin","roles":["_admin"]}]

Context

The main goal of the checking the owner in the VDU is to prevent user to steal a document owned by other user.

Your Environment

bravier commented 6 years ago

Hello,

We are facing a similar issue as the one described by @ondra-novak on the _users database.

Steps to reproduce

Let A and B be 2 CouchDB servers.

Expected behavior

The user document might be replicated from A to B and the number of documents on A/_users and B/_users might be equal to 2 (_design/_auth document + our user).

Current behavior

The document is not replicated and the number of documents on A/_users is equal to 2 but the number of documents on B/_users is equal to 1 (_design/_auth only).

Environment

This issue is quite impacting for us as it silently breaks replication and can lead to data loss.

Thanks in advance for helping us on this bug.

Do not hesitate to ask me for more infos if needed.

kocolosk commented 5 years ago

So I know this was filed a long time ago, but - what a fascinating bug report!

I understand what's happening in @ondra-novak 's initial report, but it's not clear to me that @bravier is experiencing the same issue. In Ondra's case the validation failure happens because the replicator skips over the deletion, which means the target sees the old version (before the deletion) and the new version, and their owner fields don't match.

In @bravier 's case there's a replication after every edit, so it's definitely a different situation. I'm wondering if in that situation oldDoc really is present in the replicated edit path, and because the deletion did not keep the name property in the tombstone it fails to validate when the re-created doc shows up as an extension of the edit path. It's almost the converse of @ondra-novak 's report. Worth double-checking.

At a higher level ... I'm concerned a bit about the reliance on the presence of oldDoc in the VDU when a document was created without specifying any revision. Yes, in that case CouchDB will look to extend an existing edit branch, but I always thought of this as a performance optimization that was not intended to have user-facing consequences. Clearly it does, and folks are relying on it for reasonable use cases.

Separately, at some point we blocked the ability to supply the _rev of a deleted leaf edit when every leaf of the document is deleted. It's possible that this happened in 39df1d5e78a3ffd. Oddly enough if some other branch of the document is not deleted we will let the operation against the deleted leaf go through. That seems like an unintentional inconsistency.

I don't have a great answer about how to proceed here, but figured I should write down a few details that I picked up because I happened to be looking at this part of the codebase for #1957 and @wohali pointed me to this ticket as an interesting edge case.

adrienverge commented 5 years ago

This is an interesting edge case indeed!

I also experience the bug on CouchDB 2. My understanding is that it happens in this particular case:

komorebi-san commented 4 years ago

I have got the same issue as well for the database.

Currently, i dont allow to delete but only remarked as "deleted". Haha