Closed episage closed 6 years ago
@episage it looks like you're creating conflicts with the SG bulk_docs API, and then you're surprised to get conflicts back from the SG changes API. If you don't want conflicts back on the SG changes API, then why are you creating them?
That's my interpretation of what I'm seeing based on:
docs.push({
  _id: `[docX.mouse]-${sameId}`,
  _rev: `3-A`,
  sound: 'moom',
  random: `(${counter++})`,
});
docs.push({
  _id: `[docX.mouse]-${sameId}`,
  _rev: `3-B`,
  sound: 'pi',
  random: `(${counter++})`,
});
// create conflicts
sourceAdapter.docsBulkWrite(docs, false, (error, writeResults) => {
  callback(error, writeResults);
});
Here it looks like you are creating two conflicting revisions (3-A, 3-B) for the [docX.mouse]-${sameId} doc. I'm assuming that sourceAdapter.docsBulkWrite() eventually calls down to the _bulk_docs SG API endpoint with new_edits=false to indicate that SG should allow conflicts? (Side note: the docs might need some improvement on that parameter, so I filed a docs ticket.)
If you can point to the code that is called from sourceAdapter.docsBulkWrite(), that might be helpful. Basically I'd like to confirm the value of the new_edits parameter you're passing to the _bulk_docs SG API endpoint.
Also, taking a step back, if you could give a high level description of what you're trying to accomplish (and avoid), it might help in terms of recommending a best approach.
Generally speaking, you want to avoid creating conflicting revisions in Sync Gateway, and so you'd never really want to purposefully insert conflicting revisions (if that's what you're actually doing).
@tleyden I want to support offline-first functionality in my app.
The use case:
Client A was offline for 1 hour.
While offline, Client A created [docX.mouse]-${sameId}.
[docX.mouse]-${sameId} is completely new, with sameId equal to some random UUID.
Client A updated [docX.mouse]-${sameId} multiple times.
[docX.mouse]-${sameId} has multiple revisions in his database.
Client A went online and replicated [docX.mouse]-${sameId} to SGW.
A conflict on [docX.mouse]-${sameId} cannot happen because [docX.mouse]-${sameId} is completely new (with a UUID); nobody knew of its existence before.
Here is a better example:
const docs = [];
var sameId = generateUuid();
// this is the initial document
// client is OFFLINE
docs.push({
  _id: `[docX.mouse]-${sameId}`,
  _rev: `1-A1`,
  sound: 'squeek',
});
// later on client realized that the mouse should do "meow" sound
docs.push({
  _id: `[docX.mouse]-${sameId}`,
  _rev: `2-A2`,
  sound: 'meow',
});
// later on client thought that mouses do "roar" sounds
docs.push({
  _id: `[docX.mouse]-${sameId}`,
  _rev: `3-A3`,
  sound: 'roar',
});
// client went ONLINE, thus the replication executed docsBulkWrite
// I expect no conflicts with the mouse document
sourceAdapter.docsBulkWrite(docs, false, (error, writeResults) => {
  callback(error, writeResults);
});
Here is the code of the adapter docsBulkWrite:
function docsBulkWrite(docsArray, newEdits, callback) {
  const bulkDocsBody = {
    docs: docsArray,
    new_edits: !!newEdits,
    all_or_nothing: false,
  };
  client.database.post_bulk_docs(
    databaseName,
    bulkDocsBody,
    (networkError, isSuccess, data, http) => {
      if (networkError) {
        return callback({ networkError, http });
      }
      if (!isSuccess) {
        return callback({ data, http });
      }
      const responsesByIdAndRev = data.reduce((acc, workingResponse) => {
        const id = workingResponse.id;
        if (typeof acc[id] === 'undefined') {
          acc[id] = {};
        }
        if (workingResponse.rev) {
          acc[id][workingResponse.rev] = workingResponse;
        } else {
          if (typeof acc[id].unknown === 'undefined') {
            acc[id].unknown = [];
          }
          acc[id].unknown.push(workingResponse);
        }
        return acc;
      }, {});
      let writeResults;
      if (bulkDocsBody.new_edits === false) {
        // response doesn't contain 'ok' but contains id & rev
        writeResults = docsArray.map((doc) => {
          var id = doc._id;
          var rev = doc._rev;
          const responsesByRev = responsesByIdAndRev[id];
          const workingResponse = responsesByRev[rev]
            ? responsesByRev[rev]
            : responsesByRev.unknown.length > 0
              ? responsesByRev.unknown.shift()
              : new Error(`Missing response for ${id}/${rev}`);
          if (!workingResponse) {
            // assume error when no response
            return new WriteResult(
              WriteStatus.ERROR,
              doc,
              doc._id,
              doc._rev,
              null,
            );
          } else if (workingResponse.ok) {
            return new WriteResult(
              WriteStatus.OK,
              doc,
              doc._id,
              doc._rev,
              null,
            );
          } else if (
            workingResponse.ok === undefined &&
            workingResponse.id &&
            workingResponse.rev
          ) {
            return new WriteResult(
              WriteStatus.OK,
              doc,
              doc._id,
              doc._rev,
              null,
            );
          }
          return new WriteResult(
            WriteStatus.ERROR,
            doc,
            doc._id,
            doc._rev,
            workingResponse,
          );
        });
      } else {
        writeResults = docsArray.map((doc) => {
          const idRev = doc._id + doc._rev;
          const workingResponse = responsesByIdAndRev[idRev];
          if (!workingResponse) {
            // assume error when no response
            return new WriteResult(
              WriteStatus.ERROR,
              doc,
              doc._id,
              doc._rev,
              null,
            );
          } else if (workingResponse.ok) {
            return new WriteResult(
              WriteStatus.OK,
              doc,
              doc._id,
              doc._rev,
              null,
            );
          }
          return new WriteResult(
            WriteStatus.ERROR,
            doc,
            doc._id,
            doc._rev,
            workingResponse,
          );
        });
      }
      return callback(null, writeResults);
    },
  );
}
After the code executes, I expect the SGW to contain NO conflicts since the REVs are intact.
Inside the docsBulkWrite callback:
error is null
writeResults are all ok
HOWEVER, the database testing_1525110520386, which is the source database, shows conflicts, as you can see in the picture below:
Extra: Here is the raw document in question (parent = -1):
I would expect the document to have NO conflicts.
Why does it happen? Is it an SGW error?
NB, the documentation for SGW regarding replication is not-for-humans. Every time replication topic comes back, it makes me sick. Especially that meaningless nesting of variables/arrays/objects such as "ok" in JSON data structures.
I want to support offline-first functionality in my app.
What kind of app is this? I take it this isn't using a Couchbase Lite SDK? Is this a Phonegap-style app with "custom sync" to Sync Gateway?
Client A was offline for 1 hour.
Is Client A creating conflicting revisions with its own changes, or changes from another client?
From a high level design perspective, I would say that conflicts should only arise during concurrent offline updates to a doc by different devices:
At this point they will need to push revisions that are in conflict, since they diverged off of "doc1, rev1-a".
OTOH, if the updates are on a single device, then you would always base changes on the previous revision, and you'd get a linear (non-conflicting) history of updates.
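To make the distinction concrete, here is a minimal sketch (not Sync Gateway code) that models a revision tree as a map from each revision ID to its parent and lists the leaf revisions. The `findLeaves` helper and the revision IDs are hypothetical, chosen only for illustration. Under this model, a document is in conflict when it has more than one leaf.

```javascript
// Hypothetical helper: `parents` maps each revision ID to its parent
// revision ID (or null for a root). A leaf is a revision that no other
// revision names as its parent.
function findLeaves(parents) {
  const hasChild = new Set(Object.values(parents).filter((p) => p !== null));
  return Object.keys(parents)
    .filter((rev) => !hasChild.has(rev))
    .sort();
}

// Single device: every update is based on the previous revision.
const linear = { '1-a': null, '2-a': '1-a', '3-a': '2-a' };

// Two devices both updated '1-a' while offline.
const branched = { '1-a': null, '2-a': '1-a', '2-b': '1-a' };

const linearLeaves = findLeaves(linear);     // ['3-a']: one leaf, no conflict
const branchedLeaves = findLeaves(branched); // ['2-a', '2-b']: two leaves, conflict
```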
Just want to focus on the high level usage before trying to drill in on the lower level potential API bugs.
I think I see what's happening:
Extra: Here is the raw document in question (parent = -1):
So it's creating three separate root revisions. In other words, your revision 2-A2 is being specified as having no parent, so it ends up as a root revision.
In the documentation for the bulk_docs endpoint, it says:
_rev | string | Revision identifier of the parent revision the new one should replace. (Not used when creating a new document.)
What might be happening is that you are sending _rev: 2-A2 but Sync Gateway doesn't have an existing _rev: 2-A2, so it's probably creating a new root revision for it. Regardless though, I'm pretty sure that you aren't using the SG replication API the same way that Couchbase Lite does.
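For comparison, in the CouchDB replication protocol a document pushed with new_edits=false carries its ancestry in a `_revisions` field: `start` is the generation of the newest revision, and `ids` lists the revision suffixes newest first. The sketch below builds that field for a linear history. `withRevisionHistory` and the example `_id` are hypothetical, the history is assumed to be contiguous (generations 1, 2, 3, ...), and whether Sync Gateway handles `_revisions` exactly as CouchDB does should be verified against a running instance rather than taken from this example.

```javascript
// Hypothetical helper: attach CouchDB-style `_revisions` ancestry to a doc.
// `history` lists the doc's revision IDs oldest first,
// e.g. ['1-A1', '2-A2', '3-A3'], with no gaps between generations.
function withRevisionHistory(doc, history) {
  const latest = history[history.length - 1];
  const generation = Number(latest.slice(0, latest.indexOf('-')));
  const ids = history
    .map((rev) => rev.slice(rev.indexOf('-') + 1)) // drop the "N-" prefix
    .reverse();                                    // newest first
  return { ...doc, _rev: latest, _revisions: { start: generation, ids } };
}

const doc = withRevisionHistory(
  { _id: '[docX.mouse]-example-uuid', sound: 'roar' }, // illustrative ID
  ['1-A1', '2-A2', '3-A3'],
);
// doc._revisions is { start: 3, ids: ['A3', 'A2', 'A1'] }

// Body for a _bulk_docs request that pushes the leaf revision with its history.
const bulkDocsBody = { docs: [doc], new_edits: false };
```

With the ancestry attached, the receiving side can place 3-A3 on top of 2-A2 and 1-A1 instead of treating each pushed revision as a parentless root.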
Overall, it looks like you are recreating a lot of the functionality that Couchbase Lite provides rather than using Couchbase Lite, since maybe it doesn't suit your use case. If that's the case, you'll probably have to dig deeper into the Couchbase Lite code to see how it works (before it moved to the websocket based replication protocol, if you want to push over HTTP/REST).
Another approach that might be easier than reading the Couchbase Lite code is to run the Todo sample app and sniff the replication traffic using tools like:
That way you'll be able to see examples of the replication API usage.
Closing the issue because I don't see enough evidence of this being a Sync Gateway bug.
@episage one other thing that might interest you is the /{db}/_revtree/{doc} endpoint:
It returns a graphviz diagram of the revtree that can be visualized using graphviz (or webgraphviz.com if you don't want to install anything)
Some examples of what the rendered revtrees look like (some with multiple-roots): https://github.com/couchbase/sync_gateway/issues/2847#issuecomment-326402432
@tleyden I'm writing an SGW JS client (for browsers). Thanks for the links.
Have you checked out PouchDB?
You could probably leverage a lot of their existing work .. or at least use it as comparison for debugging (especially coupled w/ the network sniffing tools I posted).
There have been a few SGW tickets from people who are using PouchDB and ran into some minor compatibility issues.
@tleyden I did many times and had many attempts at it but it's not compatible with SGW. It used to be a couple of years ago.
and the minor compatibility issues are really major
Ok good to know
the documentation for SGW regarding replication is not-for-humans...Especially that meaningless nesting of variables/arrays/objects such as "ok" in JSON data structures.
@episage could you share a link to an example of that? It probably needs to be updated.
@jamiltz AFAIK Couchbase/Sync Gateway don't have any documentation regarding writing a replicator. Thus, it's very difficult to write one.
I made a lot of effort to combine the CouchDB documentation and my own findings to write a working replicator in JS (browser-compatible JS).
By working, I mean one that doesn't contain bugs. It's very easy to introduce one when implementing the replication protocol.
In terms of conflict resolution the only article that has some meaningful code and notes is this one:
https://developer.couchbase.com/documentation/mobile/current/guides/sync-gateway/resolving-conflicts/index.html
In terms of quality of code in the article, ehm... it's disputable. I had a hard time understanding how to resolve a conflict and how to create it.
Why is the article hard to comprehend?
callback-hell
_revisions parameter, which is crucial when creating/updating documents (that was my problem)
@episage Thanks for the detailed feedback! Those are all valid points and noted: https://issues.couchbase.com/browse/DOC-3575
Sync Gateway version
Operating system
Config file
default, nothing changed, look at command above
Log output
Expected behavior
SGW should not return conflicts for "rev": "2-A" and "rev": "1-A" when querying http://localhost:5002/testing_1525080771752/_changes?active_only=true&style=all_docs because each document replicated via post_bulk_docs has a unique revision and number.
Actual behavior
SGW returns conflicts on http://localhost:5002/testing_1525080771752/_changes?active_only=true&style=all_docs. Every single document is conflicted (probably because each document's parent is equal to -1).
Steps to reproduce