Since the beginning of this tool, the '--oplog' flag has been passed to mongodump. Our Python code ALSO tails the oplog and merges the mongodump and Python oplogs at the end.
This was done because I didn't want to make any assumptions about how mongodump achieved consistency via the oplog. In hindsight, it isn't doing anything 'special' aside from dumping the oplog from just before collection dumping starts until the end of the backup.
I'm now confident that removing the '--oplog' dumping in mongodump will achieve the same consistency due to MongoDB's replication design. This also prepares us for backup methods that don't support oplog tailing, such as block snapshots.
This means we can disable oplog tailing in 'mongodump'.
This will require:
'--oplog' flag is NOT passed to mongodump.
Logic for merging the mongodump and Python oplog is removed from Oplog/Resolver/*.py.
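As a rough illustration of the first requirement, here is a minimal sketch of building the mongodump argument list with '--oplog' left out by default. The function name and the 'use_oplog' parameter are hypothetical and are not taken from the MCB code; only the mongodump flags themselves ('--host', '--port', '--out', '--oplog') are real.

```python
from typing import List


def mongodump_command(host: str, port: int, out_dir: str,
                      use_oplog: bool = False) -> List[str]:
    """Build a mongodump argument list (hypothetical helper, not MCB's own)."""
    cmd = ["mongodump", "--host", host, "--port", str(port), "--out", out_dir]
    # '--oplog' is only appended when explicitly requested; by default the
    # Python oplog tailer is the single source of oplog changes.
    if use_oplog:
        cmd.append("--oplog")
    return cmd
```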
Benefits:
The oplog is not dumped twice, which is lighter on the MongoDB nodes.
Fewer resources and less disk space are used during the backup on the host running MCB.
Faster and simpler Oplog Resolver stage - it only needs to trim the oplogs to a consistent time, not merge and trim.
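To make the trim-only idea concrete, here is a minimal sketch of what the simplified Resolver step could look like. It is not the existing Oplog/Resolver code: the function names are hypothetical, timestamps are modelled as plain (seconds, increment) tuples rather than BSON Timestamps, and it assumes every tailed oplog holds at least one entry, sorted by 'ts'.

```python
from typing import Dict, List, Tuple

# Stand-in for a BSON Timestamp: (seconds, increment). Tuples compare in order.
Timestamp = Tuple[int, int]


def consistent_end(oplogs: Dict[str, List[dict]]) -> Timestamp:
    """Return the latest timestamp that every tailed oplog has reached."""
    # The earliest of the per-oplog final timestamps is the furthest point
    # that all oplogs cover. Assumes each list is non-empty and sorted.
    return min(entries[-1]["ts"] for entries in oplogs.values())


def trim_oplogs(oplogs: Dict[str, List[dict]]) -> Dict[str, List[dict]]:
    """Drop entries newer than the consistent end time; no merging needed."""
    end = consistent_end(oplogs)
    return {
        name: [entry for entry in entries if entry["ts"] <= end]
        for name, entries in oplogs.items()
    }


# Example with two shards whose tailers stopped at different times.
sample = {
    "shard1": [{"ts": (100, 1)}, {"ts": (105, 1)}, {"ts": (110, 2)}],
    "shard2": [{"ts": (101, 1)}, {"ts": (107, 3)}],
}
trimmed = trim_oplogs(sample)
```

In this example the consistent end time is (107, 3), so the last 'shard1' entry is dropped and 'shard2' is kept whole.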