alibaba / MongoShake

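MongoShake is a universal data replication platform based on MongoDB's oplog. Redundant replication and active-active replication are its two most important functions. It is an oplog-based cluster replication tool that covers migration and synchronization needs and can further support disaster recovery and active-active deployments.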
GNU General Public License v3.0

unique key conflict within the collection #413

Closed. hiep1097 closed this issue 3 years ago

hiep1097 commented 4 years ago

@vinllen The Conflict Detection feature is not supported in the open-source version of MongoShake. So what happens if a unique key conflict occurs within a collection? Can consistency still be guaranteed?

vinllen commented 4 years ago

Yes, it can be guaranteed. By default, the hash function is based on the collection (incr_sync.shard_key = collection in the configuration file collector.conf). The oplog order within one collection is therefore preserved, so a unique key conflict won't happen. However, if you change incr_sync.shard_key to id, the oplog order can't be guaranteed, even if two oplog entries relate to one document, so a unique key conflict may happen. Generally speaking, the id setting is used when the collection has no unique key other than _id. It allows finer-grained concurrency than hashing by collection, which improves sync performance.
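
To make the difference between the two settings concrete, here is a minimal Python sketch. It is not MongoShake's actual code: the helper names (`worker_for`, `dispatch`, `apply_ops`), the CRC32 hash, the worker count, and the toy `email` unique index are all assumptions chosen for illustration. It shows that hashing by collection puts every operation of a collection into one ordered queue, while hashing by id spreads documents across workers, so operations on different documents may be applied in a different order than on the source.

```python
import zlib
from collections import defaultdict


def worker_for(op, shard_key, num_workers):
    """Pick a worker index for an oplog entry (hypothetical helper)."""
    if shard_key == "collection":
        key = op["ns"]                      # all ops of a collection share a key
    elif shard_key == "id":
        key = f'{op["ns"]}:{op["_id"]}'     # key depends on the document
    else:
        raise ValueError(f"unsupported shard key: {shard_key}")
    # CRC32 is a stand-in hash, not necessarily what MongoShake uses.
    return zlib.crc32(key.encode("utf-8")) % num_workers


def dispatch(oplog, shard_key, num_workers):
    """Split the oplog into per-worker queues, keeping arrival order per queue."""
    queues = defaultdict(list)
    for op in oplog:
        queues[worker_for(op, shard_key, num_workers)].append(op)
    return dict(queues)


def apply_ops(ops, emails_by_id):
    """Apply ops to a toy target that enforces a unique index on 'email'."""
    for op in ops:
        taken = {e for i, e in emails_by_id.items() if i != op["_id"]}
        if op["email"] in taken:
            raise ValueError("duplicate key: " + op["email"])
        emails_by_id[op["_id"]] = op["email"]


# Source order: document 1 gives up its email, then document 2 takes it.
OPLOG = [
    {"ns": "db.users", "_id": 1, "email": "b@example.com"},  # update doc 1
    {"ns": "db.users", "_id": 2, "email": "a@example.com"},  # insert doc 2
]

if __name__ == "__main__":
    print("by collection:", dispatch(OPLOG, "collection", 4))
    print("by id:        ", dispatch(OPLOG, "id", 4))
```

In this toy model, replaying `OPLOG` in source order against a target that starts as `{1: "a@example.com"}` succeeds, while replaying it in reverse order raises the duplicate key error. Reordering of that kind becomes possible once the two documents are handled by independent workers.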

hiep1097 commented 4 years ago

Thanks. But I think the documentation should be clearer on this point. It says: "in the case of collection-level concurrency, it is necessary to be able to distribute the concurrency evenly and also to solve the case of a unique key conflict within the collection", which made me think that a unique key conflict may occur with collection-level concurrency.

vinllen commented 4 years ago

Thanks for your advice, I'll do it later.