apache / iceberg-python

Apache PyIceberg
https://py.iceberg.apache.org/
Apache License 2.0

Support to optimize, analyze tables and expire snapshots, remove orphan files #31

Open Fokko opened 1 year ago

Fokko commented 1 year ago

Feature Request / Improvement

Migrated from https://github.com/apache/iceberg/issues/8183

jayceslesar commented 1 year ago

Removing orphaned files should include a change that allows building the fs directly from the table, no? It seems odd to have that separate and contained only in the project_table.

carcmarc commented 9 months ago

Any progress on this issue? Seems core to table manipulation!

github-actions[bot] commented 3 months ago

This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in the next 14 days if no further activity occurs. To permanently prevent this issue from being considered stale, add the label 'not-stale', but commenting on the issue is preferred when possible.

eedduuar commented 2 months ago

Hello, any progress?

Samreay commented 3 weeks ago

Ah, it's me again! Just wondering if this is planned at all, or if the recommendation for table maintenance is to spin up a Spark cluster or similar to expire snapshots and clean things up?

ndrluis commented 1 week ago

Hello @eedduuar @Samreay, the recommendation is to use Spark, Trino, or another engine that provides support. There is ongoing work on expiring snapshots, but there is no ETA yet.

We are open to receiving help, and you can track the progress through issue #1065.
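
For anyone following the Spark recommendation in the meantime, here is a rough sketch of what that maintenance pass might look like from PySpark. It relies on Iceberg's Spark stored procedures (`rewrite_data_files`, `expire_snapshots`, `remove_orphan_files`). The catalog name `my_catalog`, the table `db.events`, the helper names, and the retention values are all hypothetical, and it assumes a `SparkSession` already configured with the Iceberg runtime and SQL extensions and a session time zone of UTC.

```python
from datetime import datetime, timedelta, timezone


def maintenance_statements(catalog, table, older_than, retain_last=5):
    """Build the Iceberg Spark procedure calls for one maintenance pass.

    `catalog` and `table` are placeholders for your own names; `older_than`
    is a datetime used as the snapshot expiry cutoff.
    """
    ts = older_than.strftime("%Y-%m-%d %H:%M:%S.000")
    return [
        # Compact small data files.
        f"CALL {catalog}.system.rewrite_data_files(table => '{table}')",
        # Drop snapshots older than the cutoff, keeping the most recent ones.
        f"CALL {catalog}.system.expire_snapshots("
        f"table => '{table}', older_than => TIMESTAMP '{ts}', "
        f"retain_last => {retain_last})",
        # Delete files no longer referenced by any table metadata.
        f"CALL {catalog}.system.remove_orphan_files(table => '{table}')",
    ]


def run_maintenance(spark, catalog, table, days=7):
    """Run the statements on an existing, Iceberg-enabled SparkSession."""
    # Assumption: the Spark session time zone is UTC, so the literal matches.
    cutoff = datetime.now(timezone.utc) - timedelta(days=days)
    for stmt in maintenance_statements(catalog, table, cutoff):
        spark.sql(stmt).show()
```

Usage would be something like `run_maintenance(spark, "my_catalog", "db.events")`. Check the retention values against your own time-travel and rollback needs before running this on a real table, since expired snapshots and removed files cannot be recovered.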