Open adubovikov opened 2 years ago
How would you implement this with 100 shards? During insertion you would have to check with the other 99 servers that they do not already hold these ids. At the same time, the insertion rate can be 10 million rows per server per second.
Yeah, in the distributed model you should shard on the unique key, combined with something like a dictionary-lookup concept.
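A minimal sketch of that idea, assuming the shard is derived from a stable hash of the id: every occurrence of a given id then lands on the same shard, so the uniqueness check is a local lookup rather than a query to the other 99 servers. The names `shard_for`, `Shard`, and `insert` are hypothetical, and the in-memory set only stands in for whatever lookup structure a real shard would use.

```python
import hashlib

NUM_SHARDS = 100  # the 100-shard case from the question above


def shard_for(key: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a key to a shard with a stable hash (identical on every node)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


class Shard:
    """Hypothetical stand-in for one server: it only tracks the ids it owns."""

    def __init__(self) -> None:
        self.ids: set[str] = set()

    def insert_if_absent(self, key: str) -> bool:
        # Local dictionary-style lookup; no other shard is consulted.
        if key in self.ids:
            return False
        self.ids.add(key)
        return True


shards = [Shard() for _ in range(NUM_SHARDS)]


def insert(key: str) -> bool:
    """Route the key to its owning shard and insert it only if it is new."""
    return shards[shard_for(key)].insert_if_absent(key)
```

Because routing depends only on the key, a duplicate can only ever collide on its own shard. The sketch says nothing about whether such a per-row lookup can keep up with the insertion rate mentioned above.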
In a couple of scenarios you should check that your key is unique and avoid inserting the same values multiple times, even into ReplacingMergeTree. Ideally there would be a secondary/unique index that allows SQL syntax like ON CONFLICT, i.e. check whether the timestamp of the existing key is older and only then do the insert.
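To make the requested semantics concrete, here is a rough sketch in plain Python of "insert only if the existing row for this key is older". It is not ClickHouse syntax or an existing API; `Row`, `insert_if_newer`, and the dict standing in for a unique index are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Row:
    key: str
    ts: int      # timestamp used to decide which version wins
    value: str


def insert_if_newer(table: dict[str, Row], row: Row) -> bool:
    """Insert `row` unless the table already holds the same key with a
    timestamp that is equal or newer. Returns True if the row was written."""
    existing = table.get(row.key)
    if existing is not None and existing.ts >= row.ts:
        return False  # conflict: the stored row is not older, so skip
    table[row.key] = row
    return True


table: dict[str, Row] = {}
insert_if_newer(table, Row("a", ts=10, value="first"))   # written
insert_if_newer(table, Row("a", ts=5, value="stale"))    # skipped
insert_if_newer(table, Row("a", ts=20, value="newer"))   # replaces "first"
```

The difference from ReplacingMergeTree is the timing: here the conflict is resolved at insert time, whereas ReplacingMergeTree removes duplicates later, during background merges.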