pingcap / tiflow

This repo maintains DM (a data migration platform) and TiCDC (change data capture for TiDB)
Apache License 2.0

Use disk to buffer data before writing to sink #1096

Open liuzix opened 3 years ago

liuzix commented 3 years ago

Feature Request

Is your feature request related to a problem? Please describe: When the downstream cannot keep up with writes, usually because of high throughput from the upstream, TiCDC can run out of memory (OOM) because the sink buffers data only in memory.
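
To make the request concrete, here is a minimal sketch of a FIFO buffer that keeps a bounded number of bytes in memory and appends the overflow to a temporary file. It is not TiCDC code: `spillQueue`, `newSpillQueue`, `memLimit` and the `sink-buffer-*` file name are hypothetical names chosen for illustration, and the sketch is single-goroutine only.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"os"
)

// spillQueue (hypothetical) is a FIFO of byte records. It holds at most
// memLimit bytes in memory and writes any overflow to a temporary file.
type spillQueue struct {
	memLimit int
	memBytes int
	mem      [][]byte
	file     *os.File
	readOff  int64 // offset of the next unread record in file
	writeOff int64 // offset at which the next record is written
}

func newSpillQueue(memLimit int) (*spillQueue, error) {
	f, err := os.CreateTemp("", "sink-buffer-*")
	if err != nil {
		return nil, err
	}
	return &spillQueue{memLimit: memLimit, file: f}, nil
}

// Push stores rec in memory if it fits and nothing is waiting on disk.
// Once records are on disk, later records also go to disk to keep FIFO order.
func (q *spillQueue) Push(rec []byte) error {
	onDisk := q.writeOff > q.readOff
	if !onDisk && q.memBytes+len(rec) <= q.memLimit {
		q.mem = append(q.mem, rec)
		q.memBytes += len(rec)
		return nil
	}
	buf := make([]byte, 4+len(rec)) // frame: [length uint32][payload]
	binary.LittleEndian.PutUint32(buf, uint32(len(rec)))
	copy(buf[4:], rec)
	if _, err := q.file.WriteAt(buf, q.writeOff); err != nil {
		return err
	}
	q.writeOff += int64(len(buf))
	return nil
}

// Pop returns the oldest record; ok is false when the queue is empty.
func (q *spillQueue) Pop() (rec []byte, ok bool, err error) {
	if len(q.mem) > 0 { // in-memory records are always older than on-disk ones
		rec = q.mem[0]
		q.mem = q.mem[1:]
		q.memBytes -= len(rec)
		return rec, true, nil
	}
	if q.readOff == q.writeOff {
		return nil, false, nil
	}
	var hdr [4]byte
	if _, err = q.file.ReadAt(hdr[:], q.readOff); err != nil {
		return nil, false, err
	}
	rec = make([]byte, binary.LittleEndian.Uint32(hdr[:]))
	if _, err = q.file.ReadAt(rec, q.readOff+4); err != nil {
		return nil, false, err
	}
	q.readOff += int64(4 + len(rec))
	if q.readOff == q.writeOff { // disk part drained: reclaim the space
		if err = q.file.Truncate(0); err != nil {
			return nil, false, err
		}
		q.readOff, q.writeOff = 0, 0
	}
	return rec, true, nil
}

// Close deletes the temporary file; the buffered data is disposable.
func (q *spillQueue) Close() error {
	name := q.file.Name()
	if err := q.file.Close(); err != nil {
		return err
	}
	return os.Remove(name)
}

func main() {
	q, err := newSpillQueue(8) // tiny limit so the example spills quickly
	if err != nil {
		panic(err)
	}
	defer q.Close()
	for _, s := range []string{"aaaa", "bbbb", "cccc", "dddd"} {
		if err := q.Push([]byte(s)); err != nil {
			panic(err)
		}
	}
	for {
		rec, ok, err := q.Pop()
		if err != nil {
			panic(err)
		}
		if !ok {
			break
		}
		fmt.Println(string(rec)) // records come back in insertion order
	}
}
```

With this layout memory use is capped by `memLimit` regardless of how far the downstream falls behind; the cost is extra disk I/O for every record that overflows.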

Describe the feature you'd like:

Describe alternatives you've considered:

zhangjinpeng87 commented 3 years ago

How do we guarantee the durability of this data? What happens if a TiCDC node crashes and the buffered data is lost?

overvenus commented 3 years ago

> How do we guarantee the durability of this data? What happens if a TiCDC node crashes and the buffered data is lost?

TiCDC does not guarantee the durability of data buffered on disk. If a node crashes, other nodes can pull the data from TiKV again.

However, we do need to guarantee data integrity. A checksum per record or block may be enough, as RocksDB does.