yougov / mongo-connector

MongoDB data stream pipeline tools by YouGov (adopted from MongoDB)
Apache License 2.0

BinData subtype 4 (UUID) converted to subtype 3 (UUID old) #721

Open dsem opened 7 years ago

dsem commented 7 years ago

I am using mongo-connector to sync two MongoDB databases. The _ids in the source database are of type BinData subtype 4 (UUID), but in the destination database they are inserted as BinData subtype 3 (UUID old).

The subtype of the BinData should be preserved when using mongo-connector between two mongo databases.

ShaneHarvey commented 7 years ago

PyMongo automatically decodes both subtype 3 and subtype 4 into Python uuid.UUID objects. When it encodes a UUID back into a BSON binary, the subtype is configurable. I think you'll want to use uuidRepresentation=standard in your MongoDB connection strings:

$ mongo-connector -m 'mongodb://source:27017/?uuidRepresentation=standard' -t 'mongodb://target:27017/?uuidRepresentation=standard' -d mongo_doc_manager

In this configuration, mongo-connector will insert both BinData subtype 4 (UUID) and BinData subtype 3 (old UUID) as BinData subtype 4 (UUID). PyMongo does not provide a way to preserve both the old and new UUID subtypes at the same time.
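
To illustrate why the subtype gets lost, here is a toy model of the round trip in plain Python. It is only a sketch of the behaviour described above, not PyMongo's implementation: the function names (decode_binary, encode_uuid, sync) are hypothetical, and the mapping of pythonLegacy to subtype 3 is my assumption about what the connection was using when no uuidRepresentation was set.

```python
import uuid
from typing import Tuple

# Toy model only: these names are hypothetical, not PyMongo internals.
UUID_OLD_SUBTYPE = 3  # BinData subtype 3 (old UUID)
UUID_SUBTYPE = 4      # BinData subtype 4 (UUID)

# Assumed mapping from a uuidRepresentation value to the subtype written on encode.
SUBTYPE_FOR_REPRESENTATION = {
    "pythonLegacy": UUID_OLD_SUBTYPE,
    "standard": UUID_SUBTYPE,
}


def decode_binary(subtype: int, data: bytes) -> uuid.UUID:
    """Decode a 16-byte UUID binary. The subtype is discarded at this point."""
    if subtype not in (UUID_OLD_SUBTYPE, UUID_SUBTYPE):
        raise ValueError("not a UUID subtype: %d" % subtype)
    return uuid.UUID(bytes=data)


def encode_uuid(value: uuid.UUID, representation: str) -> Tuple[int, bytes]:
    """Encode a UUID. The subtype depends only on the configured representation."""
    return SUBTYPE_FOR_REPRESENTATION[representation], value.bytes


def sync(subtype: int, data: bytes, representation: str) -> Tuple[int, bytes]:
    """Model one document field travelling from the source to the target."""
    return encode_uuid(decode_binary(subtype, data), representation)


raw = uuid.UUID("12345678-1234-5678-1234-567812345678").bytes

# Source holds subtype 4, target encodes with the legacy representation.
legacy_target = sync(UUID_SUBTYPE, raw, "pythonLegacy")

# With "standard", subtype 4 and subtype 3 sources both end up as subtype 4.
standard_from_new = sync(UUID_SUBTYPE, raw, "standard")
standard_from_old = sync(UUID_OLD_SUBTYPE, raw, "standard")

print(legacy_target[0], standard_from_new[0], standard_from_old[0])
```

Because the decoded uuid.UUID carries no record of which subtype it came from, the encoder has nothing to go on except the configured representation, which is why one setting cannot keep both subtypes intact.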