lightblue-platform / lightblue-mongo

MongoDB backend for lightblue
GNU General Public License v3.0
3 stars 14 forks source link

When cfg.isReevaluateQueryForRetry is enabled, docs that no longer match the query are always considered concurrent update failures #365

Closed alechenninger closed 7 years ago

alechenninger commented 7 years ago

These shouldn't be considered failures I think, because the update stipulates it should only run if the query matches. If the query does not match, the update should not be run, so the update not running is not a failure, it is expected.

This happens with update-if-current algorithms, so for example if I update where timestamp=X, setting timestamp to Y, and two threads interleave in their find-then-write TXs, the second will always retry and eventually fail. There is no reason to fail, because without concurrent updates one of the queries would still not match (that is expected).

See here:

    private List<Integer> retryFailedDocs(List<Integer> failedDocs,Map<Integer,Error> results) {
        List<Integer> newFailedDocs=new ArrayList<>(failedDocs.size());
        for(Integer index:failedDocs) {            
            BatchDoc doc=batch.get(index);
            // Read the doc
            DBObject findQuery=new BasicDBObject("_id",doc.id);
            if(cfg.isReevaluateQueryForRetry()) {
                if(query!=null) {
                    List<DBObject> list=new ArrayList<>(2);
                    list.add(findQuery);
                    list.add(query);
                    findQuery=new BasicDBObject("$and",list);
                }
            }
            DBObject updatedDoc=collection.findOne(findQuery);
            if(updatedDoc!=null) {
                // if updatedDoc is null, doc is lost. Error remains
                DBObject newDoc=reapplyChanges(index,updatedDoc);
                // Make sure reapplyChanges does not insert references
                // of objects from the old document into the
                // updatedDoc. That updates both copies of
                // documents. Use deepCopy
                if(newDoc!=null) {
                    DBObject replaceQuery=writeReplaceQuery(updatedDoc);
                    // Update the doc ver to our doc ver. This doc is here
                    // because its docVer is not set to our docver, so
                    // this is ok
                    DocVerUtil.setDocVer(newDoc,docVer);
                    // Using bulkwrite here with one doc to use the
                    // findAndReplace API, which is lacking in
                    // DBCollection
                    BulkWriteOperation nestedBwo=collection.initializeUnorderedBulkOperation();
                    nestedBwo.find(replaceQuery).replaceOne(newDoc);
                    try {
                        if(nestedBwo.execute().getMatchedCount()==1) {
                            // Successful update
                            results.remove(index);
                        }
                    } catch(Exception e) {
                        newFailedDocs.add(index);
                    }
                } else {
                    // reapllyChanges removed the doc from the resultset
                    results.remove(index);
                }
            }
        }
        return newFailedDocs;
    }

When the updatedDoc is not found (still null), it is not removed from the results map of failures. So, the retry loop (in retryConcurrentUpdateErrorsIfNeeded) will always detect these, and always retry.