cloudant / java-cloudant

A Java client for Cloudant
Apache License 2.0

Design document synchronization doesn't work #307

Closed srinivasannanduri closed 7 years ago

srinivasannanduri commented 8 years ago

Cloudant Client: 2.5.0 onwards. The latest version also has the same issue.
Java Version: 1.7
OKHTTP: No
Code Sample:

ClientBuilder builder = ClientBuilder.account("ssss").username("ssss").password("sss");
CloudantClient client = builder.build();
Database db = client.database("ssss", true);
DesignDocumentManager designDocManager = db.getDesignDocumentManager();
InputStream designDocStream =
    this.getClassLoader().getResourceAsStream(designDocFileName.toString());
DesignDocument designDocument =
    gson.fromJson(
        new InputStreamReader(designDocStream, StandardCharsets.UTF_8), DesignDocument.class);
designDocManager.put(designDocument);

Problem: Regardless of whether a design document with the same content already exists, the design document is always updated with a new revision. When the Cloudant database is huge, this change to the design document triggers a re-index, and the database fails while the indexing is in progress. The cause is that the document comparison expects a revision before the revision has been set in the code. I will send out a PR - please verify and let me know.

The problem is reproducible with a simple design document.

srinivasannanduri commented 8 years ago

Unable to create a PR but here is the problem.

DesignDocumentManager has the following code:


if (!document.equals(documentFromDb)) {
    document.setRevision(documentFromDb.getRevision());
    return db.update(document);
}

The fix should be to set the revision before the equals check as in:

document.setRevision(documentFromDb.getRevision());
if (!document.equals(documentFromDb)) {
    return db.update(document);
}

Why? Because document.equals() also compares the revision, and the local revision is still null at that point, so the equals check fails even when the content is identical.
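To make the reported behaviour concrete, here is a minimal, self-contained sketch. `SketchDoc` and `RevisionEqualsSketch` are hypothetical stand-ins written for this illustration, not the java-cloudant classes. The sketch assumes, as described above, that equals() compares the revision along with the content.

```java
import java.util.Objects;

// Hypothetical stand-in for a design document; NOT the java-cloudant DesignDocument.
class SketchDoc {
    final String id;
    final String content;
    final String revision; // null when read from a file that carries no _rev

    SketchDoc(String id, String content, String revision) {
        this.id = id;
        this.content = content;
        this.revision = revision;
    }

    // Assumption: equality covers the revision as well as the content.
    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (!(o instanceof SketchDoc)) {
            return false;
        }
        SketchDoc other = (SketchDoc) o;
        return Objects.equals(id, other.id)
            && Objects.equals(content, other.content)
            && Objects.equals(revision, other.revision);
    }

    @Override
    public int hashCode() {
        return Objects.hash(id, content, revision);
    }
}

class RevisionEqualsSketch {
    // True when the two documents are unequal only because their revisions differ.
    static boolean differsOnlyByRevision(SketchDoc local, SketchDoc remote) {
        if (local.equals(remote)) {
            return false;
        }
        SketchDoc aligned = new SketchDoc(local.id, local.content, remote.revision);
        return aligned.equals(remote);
    }

    public static void main(String[] args) {
        SketchDoc local = new SketchDoc("_design/example", "{\"views\":{}}", null);
        SketchDoc remote = new SketchDoc("_design/example", "{\"views\":{}}", "1-abc");
        System.out.println(local.equals(remote));                 // false
        System.out.println(differsOnlyByRevision(local, remote)); // true
    }
}
```

With identical content and a null local revision, the comparison reports a difference, which is the condition that leads to the update being sent.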

ricellis commented 8 years ago

I think this function is working as designed, although clearly the documentation needs improving.

I believe we intended that the Response from the designDocManager.put() should be used to update your on-disk version of the design document with the appropriate revision after a successful write, otherwise the on-disk version doesn't truly represent the server version. e.g.

Response r = designDocManager.put(designDocument);
if (r != null && r.getRev() != null) {
    designDocument.setRevision(r.getRev());
    // Write file with new revision
    FileWriter writer = new FileWriter(designDocFileName);
    try {
        writer.write(gson.toJson(designDocument));
    } finally {
        writer.close();
    }
}

Then on subsequent compare operations the revision will be correct for the equality comparison.

IIRC we decided not to do this on-disk revision update automatically because we felt it was inappropriate for us to modify or overwrite files belonging to the application and we should treat them as read-only, leaving it up to the application to do any writes. Hence why we only have a reader from a File to a DesignDocument with DesignDocumentManager.fromFile().

The alternative behaviour you are suggesting is to manipulate the in-memory view of the design document with whatever revision ID is present on the server in order to perform the equality check. I think this is incorrect: if the local copy has, for example, revision 2-xyz but the server has revision 1-xyz, then even if the content is the same the update should be performed, because the local copy is apparently a different revision.

The line document.setRevision(documentFromDb.getRevision()) sets the revision ID so that the update happens on the correct revision in the remote and is unrelated to the comparison of the revision IDs.

srinivasannanduri commented 8 years ago

If the equality check is based on the revision, why does equals() also compare the views (including map, reduce and index)? It could well check the revision and return. Updating the design doc on the filesystem is fine as long as the views are on the native file system - what if the views are in a jar?

ricellis commented 8 years ago

> It could well check the revision and return.

That assumes that the revision ID has been generated from the document source and follows the same algorithm as the server (i.e. the way CouchDB generates the revision ID). However, we cannot enforce that on the Java object and it is possible the revision ID read from a file or applied by a call to the revision ID setter is any arbitrary value regardless of the document content. Under those circumstances DesignDocument instances with the same revision ID would not be equal so we felt it necessary to check the content.

> what if the views are in a jar?

So if I understand your use case correctly: you have a design document in a jar, you read it into a DesignDocument object, and you want to update the server version if the content is different, without checking the revision ID of the local document?

A workaround would be to apply your suggested fix in your application code and set the revision ID of the local DesignDocument to the same value as the server before calling put e.g.

try {
  // Get the remote DesignDocument
  DesignDocument remoteDesignDocument = designDocManager.get(designDocument.getId());
  // Set local revision to remote value
  designDocument.setRevision(remoteDesignDocument.getRevision());
} catch (NoDocumentException e) {
  // If the design document doesn't exist then need to put anyway
}
Response r = designDocManager.put(designDocument);

If the remote DesignDocument is large and you don't want to fetch the content twice it is possible to use a HEAD request to get the design document revision from the ETag header, similarly to this code.
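As a rough illustration of the HEAD approach, the sketch below uses only the JDK's `HttpURLConnection` rather than any java-cloudant API. `EtagRevisionSketch`, `revisionFromEtag` and `headRevision` are hypothetical names for this example. It assumes the server returns the document revision as a double-quoted ETag value (as CouchDB does for documents), and it omits authentication, which a real Cloudant account would need.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

// Hypothetical helper for illustration; not part of java-cloudant.
class EtagRevisionSketch {

    // Extracts the revision from an ETag header value such as "\"1-abc\"".
    // Returns null when the header is missing or empty.
    static String revisionFromEtag(String etag) {
        if (etag == null) {
            return null;
        }
        String value = etag.trim();
        if (value.startsWith("W/")) {
            value = value.substring(2); // drop a weak-validator prefix if present
        }
        if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
            value = value.substring(1, value.length() - 1);
        }
        return value.isEmpty() ? null : value;
    }

    // Sends a HEAD request to the document URL and returns its revision,
    // or null if the response is not 200 OK. Authentication is omitted here.
    static String headRevision(URL documentUrl) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) documentUrl.openConnection();
        try {
            conn.setRequestMethod("HEAD");
            if (conn.getResponseCode() != HttpURLConnection.HTTP_OK) {
                return null;
            }
            return revisionFromEtag(conn.getHeaderField("ETag"));
        } finally {
            conn.disconnect();
        }
    }
}
```

The value returned by `headRevision` could then be passed to `designDocument.setRevision(...)` before calling put, in place of the full get shown in the workaround.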

If there was enough interest in this item we could possibly consider an enhancement for an overloaded put(DesignDocument ddoc) that took an additional argument of an interface that could compare design documents and return a boolean, for example, to allow for customization of the comparison used before uploading.
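Purely as a sketch of what such an enhancement might look like: everything below is hypothetical, since no such interface or overload exists in the library. `DesignDocComparator`, `LocalDoc` and `ComparatorPutSketch` are invented names, and `LocalDoc` stands in for DesignDocument so that the example is self-contained.

```java
import java.util.Objects;

// Hypothetical comparison hook; this interface does NOT exist in java-cloudant.
interface DesignDocComparator<T> {
    // Returns true when the local and remote documents should be treated as the same.
    boolean sameDocument(T local, T remote);
}

// Hypothetical stand-in for a design document, for illustration only.
class LocalDoc {
    final String content;
    final String revision;

    LocalDoc(String content, String revision) {
        this.content = content;
        this.revision = revision;
    }
}

class ComparatorPutSketch {

    // Example comparator that ignores the revision and compares content only.
    static final DesignDocComparator<LocalDoc> CONTENT_ONLY =
        new DesignDocComparator<LocalDoc>() {
            @Override
            public boolean sameDocument(LocalDoc local, LocalDoc remote) {
                return Objects.equals(local.content, remote.content);
            }
        };

    // Decision an overloaded put might make before uploading:
    // upload when there is no remote copy or the comparator reports a difference.
    static <T> boolean shouldUpload(T local, T remote, DesignDocComparator<T> comparator) {
        if (remote == null) {
            return true;
        }
        return !comparator.sameDocument(local, remote);
    }
}
```

A content-only comparator like `CONTENT_ONLY` would cover the jar use case discussed above, while the default behaviour could continue to compare the revision as well.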