apache / paimon

Apache Paimon is a lake format that enables building a Realtime Lakehouse Architecture with Flink and Spark for both streaming and batch operations.
https://paimon.apache.org/
Apache License 2.0

[core] Deletion Vectors mode supports lookup async #3510

Closed JingsongLi closed 1 month ago

JingsongLi commented 1 month ago

Purpose

In the previous implementation, asynchronous lookup could not be enabled in deletion vectors mode because of a thread-safety issue: the asynchronous compaction thread could be writing deletion vectors while the main thread was materializing them.

We can avoid this situation by handing the materialization of the deletion vectors over to the asynchronous compaction thread. After each compaction completes, that thread materializes the deletion vectors as a deletion file, so only one thread ever touches them.
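The idea can be illustrated with a minimal, self-contained sketch. It is not Paimon code: the class `AsyncDeletionVectorSketch`, the method `submitCompaction`, and the in-memory map standing in for a deletion file are all hypothetical names chosen for illustration. The only point it shows is thread confinement: the mutable deletion vectors are updated and materialized on the same single compaction thread, and the caller only receives an immutable snapshot.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Hypothetical sketch; none of these names are Paimon APIs. */
class AsyncDeletionVectorSketch implements AutoCloseable {

    // The single compaction thread. It is the only thread that reads or writes `pending`.
    private final ExecutorService compactionThread = Executors.newSingleThreadExecutor();

    // Data file name -> deleted row positions. Confined to compactionThread, so no locking.
    private final Map<String, TreeSet<Long>> pending = new HashMap<>();

    /**
     * Runs one (simulated) compaction asynchronously. The compaction records the rows it
     * deleted, then materializes the deletion vectors on the same thread. The returned
     * immutable map stands in for the deletion file (an assumption of this sketch).
     */
    CompletableFuture<Map<String, List<Long>>> submitCompaction(
            Map<String, List<Long>> newlyDeleted) {
        return CompletableFuture.supplyAsync(
                () -> {
                    // Step 1: the compaction updates the deletion vectors.
                    newlyDeleted.forEach(
                            (file, rows) ->
                                    pending.computeIfAbsent(file, f -> new TreeSet<>())
                                            .addAll(rows));

                    // Step 2: materialize right after the compaction, still on this thread.
                    Map<String, List<Long>> snapshot = new HashMap<>();
                    pending.forEach(
                            (file, rows) ->
                                    snapshot.put(
                                            file,
                                            Collections.unmodifiableList(new ArrayList<>(rows))));
                    return Collections.unmodifiableMap(snapshot);
                },
                compactionThread);
    }

    @Override
    public void close() {
        compactionThread.shutdown();
    }
}
```

In this sketch the main thread never reads `pending` directly. It only calls `submitCompaction(...)` and later consumes the completed future, which is why no lock is needed around the deletion vectors.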

Tests

PrimaryKeyFileStoreTableITCase already covers this.

API and Format

Documentation

/primary-key-table/deletion-vectors modified.