itinycheng / flink-connector-clickhouse

Flink SQL connector for ClickHouse. Supports ClickHouseCatalog and reading/writing primary data, maps, and arrays to ClickHouse.
Apache License 2.0

Writing CDC logs (update operations) to ClickHouse in real time produces incorrect data #54

Closed LINhunger closed 11 months ago

LINhunger commented 1 year ago

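Suppose we have the following pipeline: mysql-cdc (debezium-json) -> clickhouse

debezium-json converts each update operation in the CDC changelog into two operations (-Delete, +Insert).

When this event stream is written to ClickHouse in real time through flink-clickhouse-connector, the delete triggered by the -Delete event runs asynchronously, so the execution order can get scrambled: for example, the +Insert may be executed first and the -Delete afterwards. In this scenario the data ends up wrong: an update operation ultimately shows up in ClickHouse as a delete.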
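
To make the reported failure mode concrete, here is a minimal sketch that models the target table as a dictionary keyed by primary key. The event tuples, the `apply_events` helper, and the sample values are hypothetical and only illustrate the ordering effect; this is not the connector's code.

```python
from typing import Dict, List, Tuple

# Hypothetical event shape: (op, primary_key, value).
# "-D" removes the row with that key, "+I" writes the row.
Event = Tuple[str, int, str]


def apply_events(events: List[Event], table: Dict[int, str]) -> Dict[int, str]:
    """Apply changelog events to a copy of the table, in the order given."""
    result = dict(table)
    for op, key, value in events:
        if op == "-D":
            result.pop(key, None)
        elif op == "+I":
            result[key] = value
    return result


initial = {1: "old"}
# One update, expressed as the (-Delete, +Insert) pair described above.
update_as_pair = [("-D", 1, "old"), ("+I", 1, "new")]

# Events applied in the order they were emitted.
in_order = apply_events(update_as_pair, initial)
# Assumed failure case: the delete takes effect after the insert.
reordered = apply_events(list(reversed(update_as_pair)), initial)

print(in_order)   # {1: 'new'}
print(reordered)  # {}
```

In the reordered run the late delete removes the freshly inserted row, which matches the symptom in the report: the update looks like a delete.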

itinycheng commented 1 year ago

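@LINhunger Right, this issue is similar to https://github.com/itinycheng/flink-connector-clickhouse/issues/24. Once sink.ignore-delete = true is set, no delete statements should be generated. If your tests show there really is a problem, please share the details of the scenario and I'll see whether I can reproduce it.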
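
As a rough illustration of what ignoring deletes would mean, the sketch below maps changelog events to statement strings and skips delete events when a flag is set. The `to_statements` function, the table `t`, its columns, and the choice of an `ALTER TABLE ... DELETE` statement for the delete case are all assumptions made for this example; the connector's actual statement generation may differ.

```python
from typing import List, Tuple

# Hypothetical event shape: (op, primary_key, value).
Event = Tuple[str, int, str]


def to_statements(events: List[Event], ignore_delete: bool) -> List[str]:
    """Translate changelog events into SQL strings for a toy table t(id, v)."""
    statements = []
    for op, key, value in events:
        if op == "+I":
            statements.append(f"INSERT INTO t (id, v) VALUES ({key}, '{value}')")
        elif op == "-D" and not ignore_delete:
            # Assumed delete form; skipped entirely when ignore_delete is True.
            statements.append(f"ALTER TABLE t DELETE WHERE id = {key}")
    return statements


update_as_pair = [("-D", 1, "old"), ("+I", 1, "new")]

with_deletes = to_statements(update_as_pair, ignore_delete=False)
without_deletes = to_statements(update_as_pair, ignore_delete=True)

for stmt in without_deletes:
    print(stmt)
```

With the flag set, only the insert is emitted, so there is no delete left that could run after it. In this toy model nothing removes the old row either; how the table then treats the older version of the row is outside what this thread covers.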

itinycheng commented 1 year ago

I fixed a bug where the sink.ignore-delete option did not take effect, which may be related to this issue: https://github.com/itinycheng/flink-connector-clickhouse/commit/06d0aedb1cb606d85cb328f4bbe6befd0892d4ca