apache / iceberg

Apache Iceberg
https://iceberg.apache.org/
Apache License 2.0

Flink Merge On Read Behavior? Equality & Positional Deletes #11535

Open FranMorilloAWS opened 1 day ago

FranMorilloAWS commented 1 day ago

Query engine

Apache Flink

Question

Can somebody explain how delete files are implemented with Apache Flink? Spark only makes use of positional deletes, but Apache Flink seems to use both. I'm not sure why Flink would need both.

pvary commented 1 day ago

The discussion here might help you: https://github.com/apache/iceberg/pull/10935#issuecomment-2294754336

FranMorilloAWS commented 13 hours ago

Hi @pvary, it is still not clear in which scenarios Flink decides to write a positional delete versus an equality delete when committing the snapshot. I have also seen snapshot commits that contain both. There is an Alibaba blog post that mentions this is done to avoid inconsistencies, but again it is not clear and not documented anywhere: https://www.alibabacloud.com/blog/how-to-analyze-cdc-data-in-iceberg-data-lake-using-flink_597838

pvary commented 11 hours ago

Equality delete:

Positional delete:

FranMorilloAWS commented 10 hours ago

Why do we need to use both? Is there an example scenario we can go over? Thanks in advance for answering :)

pvary commented 9 hours ago

Imagine a scenario where a specific ID is updated twice. An equality-based delete alone is not enough in this case, because it cannot remove the outdated first record while keeping the second one. A positional delete alone is not enough either, since we would first need to find the data file and the specific row number of the record to delete. In edge cases this would require a full table scan for every record...
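
To make the scenario concrete, here is a toy model in Python. It is only a sketch and not Iceberg or Flink code: `ToyWriter`, `Commit`, and `read` are hypothetical names invented for illustration. It assumes that the two updates of the same ID land in the same checkpoint, that an equality delete applies only to data files with a strictly lower sequence number, and that a positional delete also applies to data files with the same sequence number (this is my reading of the Iceberg table spec). Under those assumptions, the writer uses a positional delete when the previous version of the row sits in the file it is currently writing (the position is already known in memory), and an equality delete otherwise (no lookup of older files is needed).

```python
from dataclasses import dataclass, field


@dataclass
class Commit:
    """One committed checkpoint: a data file plus its delete files (toy model)."""
    seq: int
    data: list = field(default_factory=list)        # rows as (key, value); position = list index
    eq_deletes: set = field(default_factory=set)    # keys deleted by value
    pos_deletes: set = field(default_factory=set)   # (data file seq, row position)


class ToyWriter:
    """Hypothetical upsert writer; not the real Flink sink."""

    def __init__(self):
        self.commits = []
        self._open()

    def _open(self):
        self.current = Commit(seq=len(self.commits) + 1)
        self.inserted = {}  # key -> row position in the file being written

    def upsert(self, key, value):
        if key in self.inserted:
            # Previous version is in the current file: its position is known cheaply.
            self.current.pos_deletes.add((self.current.seq, self.inserted[key]))
        else:
            # Previous version (if any) is in an older file: delete by key, no scan needed.
            self.current.eq_deletes.add(key)
        self.inserted[key] = len(self.current.data)
        self.current.data.append((key, value))

    def checkpoint(self):
        self.commits.append(self.current)
        self._open()


def read(commits, use_pos_deletes=True):
    """Merge-on-read: apply delete files to data files while scanning."""
    rows = []
    for c in commits:
        for pos, (key, value) in enumerate(c.data):
            # Equality deletes only hit data files with a strictly lower sequence number.
            eq_deleted = any(key in d.eq_deletes for d in commits if d.seq > c.seq)
            # Positional deletes also hit data files with the same sequence number.
            pos_deleted = use_pos_deletes and any(
                (c.seq, pos) in d.pos_deletes for d in commits if d.seq >= c.seq
            )
            if not (eq_deleted or pos_deleted):
                rows.append((key, value))
    return rows


w = ToyWriter()
w.upsert(1, "a")
w.checkpoint()
w.upsert(1, "b")  # first update in checkpoint 2: equality delete removes "a"
w.upsert(1, "c")  # second update in checkpoint 2: positional delete removes "b"
w.checkpoint()

print(read(w.commits))                         # [(1, 'c')]
print(read(w.commits, use_pos_deletes=False))  # [(1, 'b'), (1, 'c')]
```

Without the positional delete, the stale row `"b"` survives, because the equality delete written in the same checkpoint cannot target a row in its own commit. If it could, it would remove `"c"` as well, since both rows share the same key.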