PeerDB-io / peerdb

A fast, simple, and cost-effective tool to replicate data from Postgres to data warehouses, queues, and storage
https://peerdb.io

Clickhouse: change defaults for batch sizes #1988

Closed Amogh-Bharadwaj closed 2 months ago

Amogh-Bharadwaj commented 3 months ago

Change the UI defaults for snapshot number of rows per partition and max batch size from 1 million to 100K

iskakaushik commented 3 months ago
  1. Change CDC batch size to 500K
  2. Change QRep batch size to 250L
  3. Add a dynamic config variable PEERDB_CLICKHOUSE_EXPERIMENTAL_S3_IIS_PARTS that defaults to 1. If it is 1, keep the current behavior; if > 2, use that value as numParts for both CDC and QRep.

heavycrystal commented 3 months ago

probably better to use negative values/0 to indicate disabled and positive values to indicate numParts
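
A minimal Go sketch of the convention suggested here, where zero or negative values mean disabled and a positive value is taken as numParts. The variable name comes from this thread; the `parseNumParts` helper and the choice to treat an unset or unparsable value as disabled are assumptions for illustration, not PeerDB's actual implementation.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// envVar is the setting name proposed in this thread.
const envVar = "PEERDB_CLICKHOUSE_EXPERIMENTAL_S3_IIS_PARTS"

// parseNumParts is a hypothetical helper that interprets the raw setting.
// Zero and negative values mean "disabled"; positive values are numParts.
// Assumption: an empty or non-integer value is also treated as disabled.
func parseNumParts(raw string) (numParts int, enabled bool) {
	n, err := strconv.Atoi(raw)
	if err != nil || n <= 0 {
		return 0, false
	}
	return n, true
}

func main() {
	numParts, enabled := parseNumParts(os.Getenv(envVar))
	if !enabled {
		fmt.Println("partitioned inserts disabled, keeping current behavior")
		return
	}
	fmt.Printf("using numParts=%d for both CDC and QRep\n", numParts)
}
```

With this convention there is no special-cased value such as 1 meaning "current behavior", so every positive setting has the same meaning.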

Amogh-Bharadwaj commented 3 months ago

QRep batch size is already 100K @iskakaushik

serprex commented 3 months ago

https://github.com/ClickHouse/data-plane-application/pull/12971 I'll make a PR. Edit: #1999

heavycrystal commented 2 months ago

not necessary right now