AllDocsRequest doesn't contain deleted

al-tr commented 6 years ago

CloudantClient (java-cloudant) version(s) that are affected by this issue - 2.12.0 (and earlier)
Java version (including vendor and platform) - Java 8 - IBM J9 VM
If you're using the optional okhttp dependency - yes
A small code sample that demonstrates the issue - below

There's a _deleted field for tombstones in _all_docs when you request a doc by key, and there's no way to get information about it via the current API:

al-tr commented 6 years ago

Use case: bulk accepts documents both with and without _rev.
(1) If there's no _rev, bulk "creates" the documents as new documents.
(2) If there's a _rev, bulk "updates" the documents as existing documents.

We don't know whether the documents exist or not.

For a successful "create" (1) we need to provide no _rev, which is easy. For a successful "update" (2) the documents should contain a _rev. With bulk we're dealing with huge quantities of documents, and the best way to get the _rev for all the documents we need is the fastest call: _all_docs with keys=["...", ..., "..."]&include_docs=false. So there's a method in the API, getIdsAndRevs().

But if a document is deleted it has a tombstone for some time, and _all_docs+keys will return the id+rev pair together with the deleted flag.

If we bulk a document with an id and it's _deleted... nothing will happen; it won't appear again. In this case, when there's a tombstone, we need to "create" (1) the document without a _rev, but there's no way to detect that via the current (2.12.0) AllDocsRequest API, where the _deleted flag is ignored.
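
To make the decision above concrete, here is a minimal sketch in plain Java. It assumes the id, rev and deleted flag of each _all_docs row have already been extracted somehow; the Row holder and the BulkRevPlanner class are hypothetical names for illustration and are not part of java-cloudant. It only shows the rule described in this comment: attach _rev for live documents, omit it for missing keys and tombstones.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class BulkRevPlanner {

    /** Hypothetical holder for one row of an _all_docs?keys=[...] response. */
    public static final class Row {
        final String id;
        final String rev;      // null when the key was not found at all
        final boolean deleted; // true when the row describes a tombstone

        public Row(String id, String rev, boolean deleted) {
            this.id = id;
            this.rev = rev;
            this.deleted = deleted;
        }
    }

    /**
     * Builds skeleton documents for a bulk call. A _rev is attached only
     * when the document currently exists; missing keys and tombstones are
     * left without _rev so that bulk treats them as "create" (1).
     */
    public static List<Map<String, Object>> plan(List<Row> rows) {
        List<Map<String, Object>> docs = new ArrayList<>();
        for (Row row : rows) {
            Map<String, Object> doc = new LinkedHashMap<>();
            doc.put("_id", row.id);
            if (row.rev != null && !row.deleted) {
                doc.put("_rev", row.rev); // "update" (2)
            }
            docs.add(doc);
        }
        return docs;
    }

    public static void main(String[] args) {
        List<Row> rows = Arrays.asList(
                new Row("live-doc", "3-abc", false),
                new Row("tombstone-doc", "2-def", true),
                new Row("unknown-doc", null, false));
        System.out.println(plan(rows));
    }
}
```

The point of the sketch is that the third field is required: with only id+rev pairs, the tombstone row is indistinguishable from the live one.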

emlaver commented 6 years ago

@al-tr The workaround is to use executeRequest. See the java-cloudant documentation for an example. e.g.

//Simple POJO
Foo foo = new Foo();
foo.set_id("foo-doc");
// Save the document
Response r1 = db.save(foo);
// Set rev from response to foo object
foo.set_rev(r1.getRev());
// _deleted=true
foo.set_deleted(true);
List<Object> deleteDoc = new ArrayList<Object>();
deleteDoc.add(foo);
// call _bulk_docs for deletion of foo
db.bulk(deleteDoc);
// request _all_docs containing _deleted fields
// e.g. http://account.cloudant.com/database/_all_docs?keys=["e831a35f60794804b21a6391001f90df"]&include_docs=true
HttpConnection resp = client.executeRequest(
        Http.GET(new URL(db.getDBUri() + "/_all_docs?keys=[\"foo-doc\"]&include_docs=true")));
List<String> docs = null;
//Handle response, extract a list of docs
try (BufferedReader buffer = new BufferedReader(new InputStreamReader(resp.getConnection().getInputStream(), "UTF-8"))) {
    docs = buffer.lines().collect(Collectors.toList());
}

Our team has plans to cover the API gaps in the next version.

al-tr commented 6 years ago

Hi @emlaver, thanks for your reply. In our case we need to get only the id-rev pairs for existing documents (by keys), but the method also returns them for _deleted ones. Using executeRequest automatically means building our own solution for #196 (too long URL). We did this with the first version of the java-cloudant library and formed multiple URLs shorter than 3000 characters to keep it stable, but I hope you get my point :)
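
For readers unfamiliar with the URL-splitting approach mentioned above, here is a rough sketch of how keys could be grouped so that the URL-encoded keys=[...] value stays under a budget. It uses only the JDK; KeyBatcher and its method names are hypothetical, and the budget passed in is for the encoded keys value alone, so the caller would subtract the length of the base URL and other parameters from the overall limit first.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class KeyBatcher {

    private static final int BRACKETS = 6; // "[" and "]" encode to %5B and %5D
    private static final int COMMA = 3;    // "," encodes to %2C

    /** Length of one key once JSON-quoted and URL-encoded. */
    static int encodedLength(String key) {
        String json = "\"" + key.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
        try {
            return URLEncoder.encode(json, "UTF-8").length();
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always available", e);
        }
    }

    /**
     * Splits keys into batches whose encoded JSON array fits within
     * maxEncodedLength. A single oversized key still gets its own batch.
     */
    public static List<List<String>> batch(List<String> keys, int maxEncodedLength) {
        List<List<String>> batches = new ArrayList<>();
        List<String> current = new ArrayList<>();
        int currentLength = BRACKETS;
        for (String key : keys) {
            int keyLength = encodedLength(key);
            int separator = current.isEmpty() ? 0 : COMMA;
            if (!current.isEmpty()
                    && currentLength + separator + keyLength > maxEncodedLength) {
                batches.add(current);          // close the full batch
                current = new ArrayList<>();
                currentLength = BRACKETS;
                separator = 0;
            }
            current.add(key);
            currentLength += separator + keyLength;
        }
        if (!current.isEmpty()) {
            batches.add(current);
        }
        return batches;
    }

    public static void main(String[] args) {
        System.out.println(batch(Arrays.asList("a", "b", "c"), 23));
    }
}
```

Each batch then needs its own request, and the results have to be merged afterwards, which is the extra work this comment is referring to.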

ricellis commented 6 years ago

If you're using executeRequest you can POST the keys instead of adding them to the query string, thereby avoiding the URL length limitation. e.g.

CloudantClient c = ClientBuilder.account("examples").build();
Database db = c.database("animaldb", false);
HttpConnection allDocsPost = Http.POST(new URL(db.getDBUri() + "/_all_docs"),"application/json");
allDocsPost.setRequestBody("{\"keys\": [\"cat\"]}");
System.out.println(c.executeRequest(allDocsPost).responseAsString());

{"total_rows":11,"offset":null,"rows":[ {"id":"cat","key":"cat","value":{"rev":"2-eec205a9d413992850a6e32678485900","deleted":true}} ]}
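
The request body above is hardcoded for a single key. As a small illustration, the body for an arbitrary list of keys could be assembled like this; KeysBody is a hypothetical helper using only the JDK, not part of java-cloudant, and in practice a JSON library would usually be the safer choice. The resulting string is what would be passed to setRequestBody.

```java
import java.util.Arrays;
import java.util.List;
import java.util.StringJoiner;

public class KeysBody {

    /** Quotes a string as a JSON string literal, escaping what JSON requires. */
    static String quote(String s) {
        StringBuilder sb = new StringBuilder("\"");
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '"' || c == '\\') {
                sb.append('\\').append(c);
            } else if (c < 0x20) {
                sb.append(String.format("\\u%04x", (int) c)); // control characters
            } else {
                sb.append(c);
            }
        }
        return sb.append('"').toString();
    }

    /** Builds the body {"keys":[...]} for a POST to _all_docs. */
    public static String build(List<String> keys) {
        StringJoiner joiner = new StringJoiner(",", "{\"keys\":[", "]}");
        for (String key : keys) {
            joiner.add(quote(key));
        }
        return joiner.toString();
    }

    public static void main(String[] args) {
        System.out.println(build(Arrays.asList("cat", "dog")));
    }
}
```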

tomblench commented 6 years ago

Merged to master; closing

cloudant / java-cloudant

AllDocsRequest doesn't contain deleted #427