apache / paimon

Apache Paimon is a lake format that enables building a Realtime Lakehouse Architecture with Flink and Spark for both streaming and batch operations.
https://paimon.apache.org/
Apache License 2.0

[Feature] Decouple the PartitionMarkDone from Flink to enable marking as done for Spark batch processing. #4361

Closed by Aitozi 1 month ago

Aitozi commented 1 month ago

Search before asking

Motivation

Currently, the partition mark-done logic is coupled to the Flink engine. I think we should move the related code into paimon-core. After that, we could support mark done for Spark batch processing.
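
To illustrate the kind of decoupling meant here, below is a minimal sketch of an engine-agnostic mark-done hook. Every name in it (`MarkDoneAction`, `RecordingMarkDoneAction`, `BatchMarkDone`) is hypothetical and is not Paimon's actual API. The only assumption is that a batch writer knows which partitions it wrote and can call into shared code after its commit succeeds.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical engine-agnostic hook; not Paimon's actual interface. */
interface MarkDoneAction {
    void markDone(String partition);
}

/** Illustrative action that only records which partitions were marked done. */
class RecordingMarkDoneAction implements MarkDoneAction {
    final List<String> marked = new ArrayList<>();

    @Override
    public void markDone(String partition) {
        marked.add(partition);
    }
}

/** Hypothetical helper that any engine's batch writer could call after a successful commit. */
class BatchMarkDone {
    static void markWrittenPartitions(List<String> writtenPartitions, List<MarkDoneAction> actions) {
        // De-duplicate while keeping the order in which partitions were first written.
        Set<String> distinct = new LinkedHashSet<>(writtenPartitions);
        for (String partition : distinct) {
            for (MarkDoneAction action : actions) {
                action.markDone(partition);
            }
        }
    }
}
```

Because nothing in the sketch references Flink or Spark types, the same helper could in principle be invoked from either engine's writer once the code lives in a shared module.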

Solution

No response

Anything else?

No response

Are you willing to submit a PR?

Aitozi commented 1 month ago

We could use PartitionTriggerAction directly in the Spark writer.

wwj6591812 commented 1 month ago

Good idea!