apache / paimon

Apache Paimon is a lake format that enables building a Realtime Lakehouse Architecture with Flink and Spark for both streaming and batch operations.
https://paimon.apache.org/
Apache License 2.0

[Feature] Delete old schema file #3449

Open eric666666 opened 5 months ago

eric666666 commented 5 months ago

Search before asking

Motivation

Currently, Paimon cannot delete schema files, even when an old schema version file is not referenced by any snapshot. This does not suit our use case: we store a time attribute field in the table properties so that other servers can track data synchronization progress, and every property update creates a new Paimon schema file.

Solution

I found an issue describing the same problem: https://github.com/apache/paimon/pull/2662. It suggests using a procedure to delete schema files manually, but I don't see any progress on it.

Anything else?

No response

Are you willing to submit a PR?

tsreaper commented 3 months ago

Even if a schema file is not the current schema of any snapshot, some data files might still be using this schema version. I see that you're willing to submit a PR. Could you share your solution to this problem?
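
To make the constraint above concrete, here is a minimal sketch of the check a cleanup would need. It is not Paimon's API: the function name `deletable_schema_ids` and its three inputs are hypothetical, and it assumes the schema ids referenced by live snapshots and by live data files have already been collected by some other means. It also assumes the newest schema id is the table's current schema and must always be kept.

```python
from typing import Iterable, Set


def deletable_schema_ids(
    all_schema_ids: Iterable[int],
    snapshot_schema_ids: Iterable[int],
    data_file_schema_ids: Iterable[int],
) -> Set[int]:
    """Return schema ids referenced by neither a snapshot nor a data file.

    Hypothetical helper for illustration only; not part of Paimon.
    """
    # A schema is in use if any live snapshot OR any live data file refers to it.
    in_use = set(snapshot_schema_ids) | set(data_file_schema_ids)
    candidates = set(all_schema_ids)
    if candidates:
        # Assumption: the newest schema is the table's current schema, so keep it.
        in_use.add(max(candidates))
    return candidates - in_use


# Hypothetical table with schema files 0..4 on disk.
all_ids = range(5)
# Live snapshots only reference schemas 3 and 4 ...
snapshot_ids = [3, 4]
# ... but some live data files were written with schema 1.
data_file_ids = [1, 3, 4]

print(sorted(deletable_schema_ids(all_ids, snapshot_ids, data_file_ids)))
# prints [0, 2]
```

Checking snapshots alone would also mark schema 1 as deletable here, which is exactly the case described above: the data files written with schema 1 would lose the schema needed to read them.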