apache / paimon

Apache Paimon is a lake format that enables building a Realtime Lakehouse Architecture with Flink and Spark for both streaming and batch operations.
https://paimon.apache.org/
Apache License 2.0

[Feature] Decouple PartitionMarkDone from Flink to enable marking partitions as done in Spark batch processing #4361

Open Aitozi opened 3 hours ago

Aitozi commented 3 hours ago

Motivation

Currently, the partition mark-done logic is coupled to the Flink engine. I think we should move the related code into paimon-core. After that, we could support mark-done for Spark batch processing.

Solution

No response

Anything else?

No response

Aitozi commented 2 hours ago

We could directly use PartitionTriggerAction in the Spark writer.
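
To make the proposal concrete, here is a minimal sketch of what an engine-agnostic mark-done hook might look like once it no longer depends on Flink classes. Every name in it (`MarkDoneAction`, `SuccessFileAction`, `PartitionMarkDoneRunner`) and the `_SUCCESS` marker file are hypothetical and chosen only for illustration; this is not Paimon's actual API, and it uses plain `java.nio` rather than Paimon's file abstractions.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

/** Hypothetical engine-agnostic hook; it has no Flink or Spark dependency. */
interface MarkDoneAction {
    void markDone(String partition);
}

/** Illustrative action: writes an empty marker file into the partition directory. */
final class SuccessFileAction implements MarkDoneAction {
    private final Path tableRoot;

    SuccessFileAction(Path tableRoot) {
        this.tableRoot = tableRoot;
    }

    @Override
    public void markDone(String partition) {
        try {
            Path dir = tableRoot.resolve(partition);
            Files.createDirectories(dir);
            Path marker = dir.resolve("_SUCCESS"); // assumed marker name
            if (!Files.exists(marker)) {
                Files.createFile(marker); // skipped on repeat calls, so marking is idempotent
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}

/** Engine-neutral entry point that any writer could call after a successful commit. */
final class PartitionMarkDoneRunner {
    private final List<MarkDoneAction> actions;

    PartitionMarkDoneRunner(List<MarkDoneAction> actions) {
        this.actions = actions;
    }

    void markDone(Iterable<String> partitions) {
        for (String partition : partitions) {
            for (MarkDoneAction action : actions) {
                action.markDone(partition);
            }
        }
    }
}
```

Under these assumptions, a Spark batch writer would only need to collect the partitions it wrote and call `runner.markDone(partitions)` once the commit succeeds, while the Flink sink could keep its own trigger logic and delegate to the same actions.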