apache / iceberg-python

Apache PyIceberg
https://py.iceberg.apache.org/
Apache License 2.0

Support metadata compaction #270

Open Fokko opened 8 months ago

Fokko commented 8 months ago

Feature Request / Improvement

Add support for metadata compaction. This rewrites the existing manifests into a single one, which reduces the number of calls to the object store. This should follow the Java configuration keys:

HonahX commented 7 months ago

I am interested in taking this if no one has started working on it.

HonahX commented 6 months ago

Based on an offline discussion with @Fokko, I will first focus on implementing MergeAppend, which supports these keys

MergeAppend will become the default append method, since commit.manifest-merge.enabled defaults to True. The PR for MergeAppend is https://github.com/apache/iceberg-python/pull/363

BTW, it seems rewrite_manifest operations only depend on commit.manifest.target-size-bytes. Shall we update the description to reflect this?
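
To make the discussion concrete, here is a rough sketch of how size-based manifest grouping could work. It is not the PyIceberg or Java implementation: the function name `plan_manifest_groups` is hypothetical, manifests are represented only by their sizes in bytes, and the 8 MB fallback for the target size is an assumption for illustration. Only the two property names come from this thread.

```python
from typing import Dict, List

# Property names as mentioned in this thread. The fallback values used
# below are assumptions for illustration only.
MERGE_ENABLED = "commit.manifest-merge.enabled"
TARGET_SIZE_BYTES = "commit.manifest.target-size-bytes"


def plan_manifest_groups(
    manifest_sizes: List[int], properties: Dict[str, str]
) -> List[List[int]]:
    """Greedily group manifest sizes (bytes) so each group fits the target size.

    Hypothetical helper: each returned group stands for the manifests that
    would be rewritten into one merged manifest.
    """
    enabled = properties.get(MERGE_ENABLED, "true").lower() == "true"
    if not enabled:
        # Merging disabled: every manifest stays in its own group.
        return [[size] for size in manifest_sizes]

    target = int(properties.get(TARGET_SIZE_BYTES, str(8 * 1024 * 1024)))
    groups: List[List[int]] = []
    current: List[int] = []
    current_size = 0
    for size in manifest_sizes:
        # Close the current group once adding this manifest would exceed the target.
        if current and current_size + size > target:
            groups.append(current)
            current, current_size = [], 0
        current.append(size)
        current_size += size
    if current:
        groups.append(current)
    return groups
```

With a target of 8 bytes, `plan_manifest_groups([4, 4, 4, 10, 1], {TARGET_SIZE_BYTES: "8"})` yields four groups, so five manifest reads become four. A manifest larger than the target simply ends up alone in its group.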

github-actions[bot] commented 2 weeks ago

This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs. To permanently prevent this issue from being considered stale, add the label 'not-stale', but commenting on the issue is preferred when possible.

github-actions[bot] commented 4 days ago

This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'