ClickHouse / ClickHouse

ClickHouse® is a real-time analytics DBMS
https://clickhouse.com
Apache License 2.0

Multi-threaded writes of the same batch to a ClickHouse ReplicatedMergeTree: will the data be duplicated? #31245

Closed lifulong closed 1 week ago

lifulong commented 3 years ago

We need to write data to ClickHouse exactly once and want to use ReplicatedMergeTree to achieve this, but we are worried about duplicate data when a write is retried.

den-crane commented 3 years ago

Do you mean parallel inserts into different CH servers? If you insert data into the replicas of a shard, identical (binary identical) inserts will be deduplicated. https://clickhouse.com/docs/en/operations/settings/merge-tree-settings/#replicated-deduplication-window

lifulong commented 3 years ago

Parallel inserts into the same shard, possibly into different replicas. We already use the replicated_deduplication_window setting. Consider this case: a write to one replica times out because the server is busy, so we then retry the write on another replica.
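
For illustration, the retry scenario described here can be sketched as follows. This is a minimal sketch and not an official client: it assumes the batch is serialized once and the exact same bytes are resent on retry (so the insert stays binary identical), that the values contain no tabs or newlines, and that the earlier attempt is still inside the deduplication window. The names `encode_batch`, `insert_with_failover`, and `http_send`, the table `db.events`, and the replica URLs are hypothetical.

```python
import urllib.parse
import urllib.request
from typing import Callable, Iterable, Sequence


def encode_batch(rows: Iterable[Sequence[object]]) -> bytes:
    """Serialize rows to TabSeparated bytes exactly once.

    Assumption: values contain no tabs or newlines, so no escaping is done.
    """
    lines = ("\t".join(str(value) for value in row) + "\n" for row in rows)
    return "".join(lines).encode("utf-8")


def http_send(replica_url: str, payload: bytes, table: str = "db.events") -> None:
    """POST the payload to one replica over the ClickHouse HTTP interface.

    `table` and `replica_url` are placeholders for this sketch.
    """
    query = urllib.parse.urlencode(
        {"query": f"INSERT INTO {table} FORMAT TabSeparated"}
    )
    request = urllib.request.Request(
        f"{replica_url}/?{query}", data=payload, method="POST"
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        response.read()


def insert_with_failover(
    payload: bytes,
    replicas: Sequence[str],
    send: Callable[[str, bytes], None],
) -> str:
    """Try each replica in order, always resending the same bytes.

    Returns the replica that accepted the insert. Timeouts and connection
    errors (both subclasses of OSError) move on to the next replica.
    """
    last_error = None
    for replica in replicas:
        try:
            send(replica, payload)  # same payload object on every attempt
            return replica
        except OSError as exc:
            last_error = exc
    raise RuntimeError("insert failed on all replicas") from last_error
```

A call might look like `insert_with_failover(encode_batch(rows), ["http://replica-1:8123", "http://replica-2:8123"], http_send)`. The point of the design is that the batch is never rebuilt between attempts: if the first replica actually committed the block before timing out, the retry on the second replica carries identical data and can be recognized as a duplicate.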