arangodb / arangodb

🥑 ArangoDB is a native multi-model database with flexible data models for documents, graphs, and key-values. Build high performance applications using a convenient SQL-like query language or JavaScript extensions.
https://www.arangodb.com

ArangoSearch: Failed to retrieve document from view after updating it in collection #16347

Open mkv opened 2 years ago

mkv commented 2 years ago

My Environment

Steps to reproduce

  1. Create collection
  2. Create view:
    {
      "writebufferIdle": 64,
      "type": "arangosearch",
      "writebufferSizeMax": 33554432,
      "consolidationPolicy": {
        "type": "tier",
        "segmentsBytesFloor": 2097152,
        "segmentsBytesMax": 5368709120,
        "segmentsMax": 10,
        "segmentsMin": 1,
        "minScore": 0
      },
      "primarySort": [],
      "globallyUniqueId": "c35357188/",
      "id": "35357188",
      "storedValues": [],
      "writebufferActive": 0,
      "consolidationIntervalMsec": 1000,
      "cleanupIntervalStep": 2,
      "commitIntervalMsec": 1000,
      "links": {
        "entities": {
          "analyzers": [
            "norm_en",
            "identity"
          ],
          "fields": {
            "title": {}
          },
          "includeAllFields": false,
          "primarySortCompression": "lz4",
          "storeValues": "id",
          "trackListPositions": false
        }
      },
      "primarySortCompression": "lz4"
    }

    or

    {
      "writebufferIdle": 64,
      "type": "arangosearch",
      "writebufferSizeMax": 33554432,
      "consolidationPolicy": {
        "type": "tier",
        "segmentsBytesFloor": 2097152,
        "segmentsBytesMax": 5368709120,
        "segmentsMax": 10,
        "segmentsMin": 1,
        "minScore": 0
      },
      "primarySort": [],
      "globallyUniqueId": "c18297111/",
      "id": "18297111",
      "storedValues": [],
      "writebufferActive": 0,
      "consolidationIntervalMsec": 1000,
      "cleanupIntervalStep": 2,
      "commitIntervalMsec": 1000,
      "links": {
        "entities": {
          "analyzers": [
            "norm_en",
            "identity"
          ],
          "fields": {},
          "includeAllFields": true,
          "primarySortCompression": "lz4",
          "storeValues": "id",
          "trackListPositions": false
        }
      },
      "primarySortCompression": "lz4"
    }

  3. Continuously run a query to fetch the document:

    FOR v IN search_view SEARCH v._key == 'some_key' RETURN v._key
  4. Update some field of a document in the collection. The field may or may not be linked to the view.

    UPDATE { _key: 'some_key', some_field:'value1' } IN entities

Problem: After the update, we cannot retrieve the document (looks like it is disappeared) from view for 2-3 seconds.

Expected result: The latest version of the document is available at all times, regardless of any updates.
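
The "continuously run query" step can be sketched as a small polling harness that records how long the document stays invisible. This is only a minimal sketch and is not tied to any driver: `run_query` is a hypothetical callable that would execute the AQL query from the steps above and return its result rows, and `measure_gaps`, `clock`, and `sleep` are illustrative names, not part of any ArangoDB API.

```python
import time
from typing import Callable, List, Optional, Tuple


def measure_gaps(
    run_query: Callable[[], list],
    duration_s: float,
    interval_s: float = 0.05,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> List[Tuple[float, float]]:
    """Poll run_query and return (start, end) offsets, in seconds,
    of every period in which the query returned no rows."""
    gaps: List[Tuple[float, float]] = []
    gap_start: Optional[float] = None
    t0 = clock()
    while True:
        now = clock() - t0
        if now >= duration_s:
            break
        rows = run_query()  # hypothetical: runs the SEARCH query on the view
        if not rows and gap_start is None:
            gap_start = now              # document just became invisible
        elif rows and gap_start is not None:
            gaps.append((gap_start, now))  # document is visible again
            gap_start = None
        sleep(interval_s)
    if gap_start is not None:
        # Still missing when polling stopped: close the gap at the deadline.
        gaps.append((gap_start, duration_s))
    return gaps
```

Running the UPDATE from another session while this loop polls should, if the problem reproduces, yield one gap per update; the `clock` and `sleep` parameters exist only so the loop can be exercised without a server.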

Delicious-Bacon commented 2 years ago

I'm having this issue as well. I use importDocuments with the onDuplicateUpdate option.

Here's the kind of results I get:

The expected result: [{"id":"348","t":"Item 348"}, {"id":"112","t":"Item 112"}, {"id":"225","t":"Item 225"}]

I'm running the latest version, 3.9.2.

In case the SDK matters: I use the Golang SDK and call the ImportDocuments function with the ImportOnDuplicateUpdate preference.

I grab document _keys in a subquery and then loop over the subquery's _keys array to "JOIN" the results with another collection. So it looks like the document first exists, because the subquery finds the _key, and then disappears when the second query looks it up.

I can't rely on Arango giving partial or incorrect results during updates.


You mentioned that it

(looks like it is disappeared) from view for 2-3 seconds.

Yes, it looks like it does a REPLACE, first DELETEing the document and then INSERTing a new copy, instead of doing an UPDATE.

Note: I first thought this might be an eventual consistency issue, but that makes no sense: in that case the query would just return the old version of the document, not null.
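
The distinction drawn here can be illustrated with a toy model. This is only a sketch of the hypothesis in this comment, not ArangoDB's actual implementation: `ToyView` and all of its methods are made-up names, and the assumption that a removal becomes visible before the matching insert is exactly the guess being illustrated.

```python
class ToyView:
    """Toy search view in which writes become visible only at commit()."""

    def __init__(self, update_as_remove_insert: bool):
        self.update_as_remove_insert = update_as_remove_insert
        self.visible = {}  # what queries currently see
        self.pending = {}  # writes waiting for the next commit

    def insert(self, key, doc):
        self.pending[key] = doc

    def update(self, key, doc):
        if self.update_as_remove_insert:
            # Hypothesis: the old entry is dropped right away,
            # while the new copy waits for the next commit.
            self.visible.pop(key, None)
        self.pending[key] = doc

    def commit(self):
        self.visible.update(self.pending)
        self.pending.clear()

    def search(self, key):
        return self.visible.get(key)


def state_between_update_and_commit(update_as_remove_insert: bool):
    view = ToyView(update_as_remove_insert)
    view.insert("some_key", {"some_field": "value0"})
    view.commit()
    view.update("some_key", {"some_field": "value1"})
    return view.search("some_key")  # queried before the next commit
```

With `update_as_remove_insert=False` the function returns the old document (plain eventual consistency); with `True` it returns `None`, which matches the symptom reported in this thread.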

MBkkt commented 1 year ago

Answered here: https://github.com/arangodb/arangodb/issues/17482

MBkkt commented 1 year ago

Fixed starting from 3.11