memiiso / debezium-server-iceberg

Replicates any database (CDC events) to Apache Iceberg (To Cloud Storage)
Apache License 2.0

Deduplicator for exactly-once delivery #290

Closed: julianpark90 closed this issue 6 months ago

julianpark90 commented 6 months ago

When a Debezium server crashes and restarts, or when multiple Debezium servers read from the same data source in a high-availability (HA) setup, the same change events can be delivered more than once, which risks duplicate data in the destination tables. Exactly-once delivery support is needed to mitigate this. An integration with the Iceberg deduplicator would help ensure data consistency and integrity.
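To make the request concrete, here is a minimal sketch of the kind of deduplication meant, not how this project or Iceberg implements it. It assumes each change event carries a primary key and a monotonically increasing source position; the field names `key` and `lsn` and the function `deduplicate` are hypothetical and chosen only for illustration.

```python
from typing import Dict, Iterable, List


def deduplicate(events: Iterable[Dict]) -> List[Dict]:
    """Drop events whose (key, lsn) pair was already seen, keeping first-seen order.

    `key` and `lsn` are hypothetical field names standing in for the row's
    primary key and the source log position of the change.
    """
    seen = set()
    unique = []
    for event in events:
        marker = (event["key"], event["lsn"])
        if marker in seen:
            # Same change delivered again, e.g. replayed after a restart.
            continue
        seen.add(marker)
        unique.append(event)
    return unique


# Example batch in which one update is delivered twice.
events = [
    {"key": 1, "lsn": 100, "op": "c"},
    {"key": 1, "lsn": 101, "op": "u"},
    {"key": 1, "lsn": 101, "op": "u"},  # duplicate of the previous event
    {"key": 2, "lsn": 102, "op": "c"},
]
unique_events = deduplicate(events)
```

This only removes duplicates within a single in-memory batch. Handling duplicates across restarts or across several servers would additionally need the seen positions (or an equivalent idempotent write) to be tracked durably, which is the part I am unsure how to approach.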

As I'm new to this field, any guidance or suggestions on how to address this issue would be greatly appreciated.