Closed AaronKalair closed 5 months ago
Hey,
I did some more debugging today and figured out what's going on here.
The collections I was transferring from DocDB -> Sharded MongoDB were not getting any updates
The script buffers updates from the Change Stream and then Bulk Writes them if either:
However, this check is only executed when an event arrives on the Change Stream.
So because these collections are not being actively written to, the final batch of documents is never written.
If I write a fake document to the collection, the script realises that more than 5 seconds have passed since the last write and flushes the buffer of documents
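For anyone hitting the same thing, here is a rough sketch of the behaviour described above and one way around it. This is not the migrator's actual code: `ChangeBuffer`, `maybe_flush`, the `flush` callback, and the `max_batch` size threshold are all hypothetical names and values I made up for illustration; only the 5 second age threshold comes from what I observed.

```python
import time
from typing import Any, Callable, List


class ChangeBuffer:
    """Hypothetical buffer: flushes on batch size or on time since the last flush."""

    def __init__(
        self,
        flush: Callable[[List[Any]], None],
        max_batch: int = 100,        # assumed size threshold, not taken from the tool
        max_age_secs: float = 5.0,   # the 5 seconds observed in this thread
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._flush = flush
        self._max_batch = max_batch
        self._max_age_secs = max_age_secs
        self._clock = clock
        self._pending: List[Any] = []
        self._last_flush = clock()

    def add(self, event: Any) -> None:
        """Called when an event arrives. On its own, this is the only place
        the flush conditions would be checked, which is the problem above."""
        self._pending.append(event)
        self.maybe_flush()

    def maybe_flush(self) -> bool:
        """Check the flush conditions. Calling this on idle polls as well
        lets the final batch drain even when no new events arrive."""
        if not self._pending:
            return False
        too_big = len(self._pending) >= self._max_batch
        too_old = self._clock() - self._last_flush >= self._max_age_secs
        if too_big or too_old:
            self._flush(self._pending)
            self._pending = []
            self._last_flush = self._clock()
            return True
        return False
```

The idea is that the polling loop calls `maybe_flush()` whenever a poll comes back empty, not only from `add()`. With pymongo that could mean polling with `ChangeStream.try_next()` and calling `maybe_flush()` when it returns `None`, though I haven't tried wiring that into the tool itself.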
I guess it's not really a bug, more a misunderstanding of how it's supposed to work, so closing this.
👋 Hey,
We're attempting to use the Migrator Tool with Change Streams to move data from DocumentDB to Sharded MongoDB (i.e. we're issuing the writes to a MongoS instance).
It appears to work for the most part, but it misses a few tens of documents, so not every document is transferred.
For example, our `exchanges.league_event` collection in DocumentDB has 681,653 documents. The migrator tool moves 681,601 documents to MongoDB, and then the remaining 52 documents are never transferred.
These are the logs from the tool:
The logs suggest it gets to 681,601 documents and then the `secs behind` value just keeps growing.
We start the tool to monitor the Change Streams and transfer documents minutes before we start inserting documents into the DocDB, so it shouldn't be the case that those 52 documents have expired from the Change Stream.
We see the exact same thing with another collection, where there are 340,375 documents in DocDB and the tool only moves 340,301 documents to MongoDB (missing 74).