apache / paimon

Apache Paimon is a lake format that enables building a Realtime Lakehouse Architecture with Flink and Spark for both streaming and batch operations.
https://paimon.apache.org/
Apache License 2.0

[core] Support delete by default in partial updates #3602

Open yunfengzhou-hub opened 1 week ago

yunfengzhou-hub commented 1 week ago

Purpose

Linked issue: close #3048

This pull request adds support for handling deletion records in the partial-update merge engine.

Tests

Unit tests are added in PrimaryKeyFileStoreTableTest to verify the changes in this PR.

API and Format

This pull request changes the default behavior of the partial-update merge engine when deletion records are present. Because the existing behavior in this case is to throw an exception, no job that currently runs successfully is affected, so the change is backward compatible.
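To illustrate the behavior change, here is a minimal conceptual sketch in Python. It is not Paimon code: the record shape, the `partial_update_merge` function, and the `support_delete` flag are all hypothetical, and the assumption that a deletion record removes the whole row for its key is mine, since the exact semantics are not spelled out in this description.

```python
from typing import Dict, Iterable, Optional, Tuple

# Hypothetical record shape: (kind, primary_key, fields).
# kind is "+I" for an insert/update or "-D" for a deletion record.
Record = Tuple[str, str, Dict[str, Optional[object]]]
Row = Dict[str, Optional[object]]


def partial_update_merge(records: Iterable[Record], support_delete: bool) -> Dict[str, Row]:
    """Merge records per key, keeping the latest non-null value of each field."""
    table: Dict[str, Row] = {}
    for kind, key, fields in records:
        if kind == "-D":
            if not support_delete:
                # Models the existing behavior described above: fail on deletion records.
                raise ValueError("partial-update cannot accept delete records")
            # Assumed new behavior: drop the row for this key.
            table.pop(key, None)
            continue
        row = table.setdefault(key, {})
        for name, value in fields.items():
            if value is not None:
                row[name] = value          # a non-null value overwrites the field
            else:
                row.setdefault(name, None)  # a null never overwrites an existing value
    return table


records = [
    ("+I", "k1", {"a": 1, "b": None}),
    ("+I", "k1", {"a": None, "b": 2}),
    ("-D", "k1", {"a": None, "b": None}),
    ("+I", "k1", {"a": None, "b": 3}),
]
merged = partial_update_merge(records, support_delete=True)
```

With `support_delete=False` the same input raises at the third record, which is why any job that runs successfully today never relied on deletion records being accepted.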

Documentation

This PR requires an update to the merge engine documentation, which is included in the commit.

yunfengzhou-hub commented 5 days ago

Hi @JingsongLi, could you please take a look at this PR?