Open makhdumi opened 7 years ago
The `'ns': { $in: [...] }` filter was not removed to support wildcards in namespaces; we can always use a `$regex` query for that. It was removed because, if the query filters out ignored collections, the oplog can fill up entirely with operations that don't match the query. mongo-connector would then abort with an error because its last seen oplog entry is no longer there.
It might be possible, though, to add back the `'ns': { $in: [...] }` filter if we periodically update the checkpoint to the latest ignored entry. Hmmmm, I'll have to give this a little more thought, but I'd like to add it back if it improves performance significantly.
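To make the checkpoint idea concrete, here is a minimal sketch. It is not mongo-connector's code: `poll_once`, `find_matching`, and `find_newest_ts` are hypothetical names, and the two callables stand in for a server-side filtered oplog query and a cheap "newest oplog entry" lookup. The newest timestamp is read first and used as an upper bound, so advancing the checkpoint past ignored entries should not skip a matching entry written in between.

```python
def poll_once(find_matching, find_newest_ts, checkpoint, namespaces):
    """One polling step; returns (matching_entries, new_checkpoint).

    find_matching(query) -> iterable of oplog entries matching `query`
    find_newest_ts()     -> timestamp of the newest oplog entry, or None
    Both callables are hypothetical stand-ins for real oplog queries.
    """
    # Read the newest timestamp BEFORE running the filtered query.
    upper = find_newest_ts()
    if upper is None or upper <= checkpoint:
        # Nothing new in the oplog; keep the old checkpoint.
        return [], checkpoint

    # The server does the namespace filtering; the window is bounded above
    # by `upper` so the checkpoint can safely jump to it afterwards.
    query = {
        "ts": {"$gt": checkpoint, "$lte": upper},
        "ns": {"$in": sorted(namespaces)},
    }
    entries = list(find_matching(query))

    # Advance even when every new entry was ignored, so the checkpoint
    # does not fall off the end of the oplog.
    return entries, upper
```

The caller would persist the returned checkpoint after processing `entries`, whether or not any entry matched.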
The oplog query used to have a `'ns': { $in: [...] }` filter. This was removed at some point, and filtering on namespace is now done in code (Python). But this can be quite slow, e.g. with databases containing many collections, or if other collections "pollute" the oplog far more than the collections the connector is interested in. It's much faster, at least on our cluster, to let the MongoDB server do the filtering. Without the cursor filter, the connector lags for me by 1,000 to 12,000 seconds, even with a pretty low update throughput to MongoDB: about 30 updates per minute.
I understand that namespaces are now configurable with wildcards/regex, but if no wildcard/regex is specified in the config, then could the connector go back to doing the filtering on the cursor?
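As a rough illustration of this request (a sketch only; `build_oplog_query` is a hypothetical helper, not an existing mongo-connector function, and it assumes wildcards are written with `*`), the query could include the `ns` filter only when every configured namespace is an exact name:

```python
def build_oplog_query(last_ts, namespaces):
    """Build an oplog query dict, filtering on `ns` server-side when possible.

    last_ts    -- last processed oplog timestamp (the checkpoint)
    namespaces -- configured namespaces, e.g. ["db.coll1", "db.coll2"]
    """
    query = {"ts": {"$gt": last_ts}}

    # Assumption: a namespace containing "*" is a wildcard pattern.
    has_wildcard = any("*" in ns for ns in namespaces)

    # Only exact names can go into $in; otherwise fall back to the
    # unfiltered query and leave the filtering to Python code.
    if namespaces and not has_wildcard:
        query["ns"] = {"$in": list(namespaces)}

    return query
```

The resulting dict could be passed to the oplog `find()` call; with a wildcard in the config, the query stays as it is today.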