rwynn / monstache

a go daemon that syncs MongoDB to Elasticsearch in realtime. you know, for search.
https://rwynn.github.io/monstache-site/
MIT License

monstache - problem reading from MongoDB secondaries. #724

Open arekborucki opened 5 months ago

arekborucki commented 5 months ago

We have observed a problem when our Monstache instance uses the MongoDB read preference "secondary". When we insert a large number of documents into MongoDB in a short time, for example 5,000 documents, not all of them are replicated to Elasticsearch: only about 4,950 to 4,970 arrive. When we switch Monstache back to read preference "primary", everything is replicated correctly. All documents are also replicated correctly if the MongoDB connection string names only a secondary node and sets the parameter directConnection=true. In that case, however, Monstache cannot write its metadata to the MongoDB database.
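To make the three setups above easier to compare, here is a minimal sketch that builds the corresponding connection strings. The host names and the replica set name `rs0` are hypothetical (the actual connection strings are not shown in this issue); `readPreference`, `replicaSet` and `directConnection` are standard MongoDB URI options.

```python
from urllib.parse import urlencode

# Hypothetical hosts; the real ones are not shown in this issue.
HOSTS = ["mongo-0.example:27017", "mongo-1.example:27017", "mongo-2.example:27017"]


def mongo_uri(hosts, **options):
    """Build a mongodb:// URI from a host list and URI options."""
    return f"mongodb://{','.join(hosts)}/?{urlencode(options)}"


# 1. Whole replica set, reads routed to secondaries (documents go missing).
secondary_uri = mongo_uri(HOSTS, replicaSet="rs0", readPreference="secondary")

# 2. Whole replica set, reads routed to the primary (everything replicates).
primary_uri = mongo_uri(HOSTS, replicaSet="rs0", readPreference="primary")

# 3. A single secondary with directConnection=true (everything replicates,
#    but metadata writes fail).
direct_uri = mongo_uri(HOSTS[1:2], directConnection="true")

print(secondary_uri)
print(primary_uri)
print(direct_uri)
```

In the third variant the driver talks to that one node only, and a secondary rejects writes, which would be consistent with the metadata failure described above.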

Monstache uses a MongoDB view as the replication source in MongoDB. Here is our configuration:

elasticsearch-urls = ["${elastic_url}"]
relate-threads = 6000
relate-buffer = 15000
elasticsearch-max-seconds = 10
elasticsearch-max-bytes = 16777216
resume = true
resume-name = "bc2-${resume_id}-contacts"
change-stream-namespaces = [ "${mongodb_database}.contacts" ]
gzip = true
stats = true
elasticsearch-retry = true
prune-invalid-json = true
dropped-databases = false
dropped-collections = false
elasticsearch-client-timeout = 30
enable-http-server = true

[[mapping]]
namespace = "${mongodb_database}.contacts"
index = "contacts${es_suffix}"

[[mapping]]
namespace = "${mongodb_database}.contacts-view"
index = "contacts${es_suffix}"

[[relate]]
namespace = "${mongodb_database}.contacts"
with-namespace = "${mongodb_database}.contacts-view"
keep-src = false

Could the problem be that Monstache sees the document ID in the oplog, takes that ID, and sends a query to one of the secondaries, e.g., db.getCollection("contacts-view").find({"id":"xyz"}), but the document has not yet been applied on that secondary even though it is already in the oplog, so the query returns zero documents?
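The suspected race can be illustrated with a toy model. This is only a sketch of the hypothesis, not Monstache's actual code: the function name `lookup_indexed`, the dictionaries standing in for replica set members, and the size of the lag are all made up for illustration.

```python
def lookup_indexed(event_ids, node_docs):
    """Return the documents a lookup-by-id pass would send to the index.

    event_ids: _id values taken from change events.
    node_docs: documents visible on the node that serves the lookup.
    """
    indexed = []
    for doc_id in event_ids:
        doc = node_docs.get(doc_id)  # stands in for the find-by-id query
        if doc is not None:
            indexed.append(doc)
        # A missing document is silently skipped, so it never gets indexed.
    return indexed


# Five inserts, all present on the primary.
primary = {i: {"_id": i, "name": f"contact-{i}"} for i in range(5)}

# Hypothetical lagging secondary that has applied only the first three.
secondary = {i: primary[i] for i in range(3)}

# One change event per inserted _id.
events = list(primary)

from_primary = lookup_indexed(events, primary)
from_secondary = lookup_indexed(events, secondary)

print(len(from_primary), len(from_secondary))  # prints "5 3"
```

In this model, lookups served by the primary find every document, while lookups served by a lagging node lose exactly the documents it has not applied yet. That would match the pattern of a small shortfall under a burst of inserts.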

We use MongoDB v6.0 and Monstache v6.7.10.

db.adminCommand({ getDefaultRWConcern: 1 })
{
  defaultReadConcern: { level: 'local' },
  defaultWriteConcern: { w: 'majority', wtimeout: 0 },
  updateOpTime: Timestamp({ t: 1717599777, i: 6 }),
  updateWallClockTime: ISODate("2024-06-05T15:02:57.764Z"),
  defaultWriteConcernSource: 'global',
  defaultReadConcernSource: 'implicit',
  localUpdateWallClockTime: ISODate("2024-06-05T15:02:57.765Z"),
  ok: 1,
  '$clusterTime': {
    clusterTime: Timestamp({ t: 1717673598, i: 4 }),
    signature: {
      hash: Binary(Buffer.from("0000000000000000000000000000000000000000", "hex"), 0),
      keyId: Long("0")
    }
  },
  operationTime: Timestamp({ t: 1717673598, i: 4 })
}
hmsta commented 3 months ago

Interesting finding. I was just debugging a similar issue with missing documents, since I had been using readPreference=nearest so far.

I've switched to primary now, since your explanation makes total sense, especially since my default write concern only waits for the primary:

        "defaultWriteConcern" : {
                "w" : 1,
                "wtimeout" : 0
        },

Thanks for pointing that out, probably saved me a few hours of debugging :-)