Open KingLommel opened 7 months ago
@KingLommel: Did you try querying actual table data? Does that work? (Edit: I see that this works.)
AFAIK, <table-name>.all_data_files is a synthetic table that Iceberg produces on the fly. It does not actually represent the data in the Iceberg table itself. It is unfortunate that this information cannot be retrieved after a Nessie GC. We'll look into this.
Issue description
What is the problem:
After a successful nessie-gc run, the Iceberg metadata tables listed in https://iceberg.apache.org/docs/nightly/spark-queries/#all-metadata-tables are corrupted.
What did I do:
For my tests:
What I expect:
I would expect the gc-tool to take care of all metadata in https://iceberg.apache.org/docs/nightly/spark-queries/#all-metadata-tables. I would also expect the number of snapshots from https://iceberg.apache.org/docs/nightly/spark-queries/#snapshots to be reduced, but the number of snapshots is the same as before running nessie-gc.
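To make the comparison reproducible, something like the following sketch could be run once before and once after nessie-gc. It is not part of the original report: the helper name metadata_counts, the run_scalar callback, and the table identifier nessie.db.my_table are all hypothetical. Only the <table>.snapshots and <table>.all_data_files metadata table names come from the Iceberg docs linked above.

```python
from typing import Callable, Dict, Iterable

# Iceberg metadata tables to probe (names as listed in the Iceberg Spark docs).
METADATA_TABLES = ("snapshots", "all_data_files")


def metadata_counts(
    run_scalar: Callable[[str], int],
    table: str,
    metadata_tables: Iterable[str] = METADATA_TABLES,
) -> Dict[str, object]:
    """Return the row count per metadata table, or the error text if the read fails.

    run_scalar is a hypothetical callback that executes one SQL statement and
    returns its single value. With an existing SparkSession named spark it
    could be: lambda q: spark.sql(q).collect()[0][0]
    """
    results: Dict[str, object] = {}
    for name in metadata_tables:
        query = f"SELECT count(*) FROM {table}.{name}"
        try:
            results[name] = run_scalar(query)
        except Exception as exc:
            # A failed read after GC is the symptom worth recording, so keep going.
            results[name] = f"FAILED: {exc}"
    return results
```

Calling metadata_counts(run_scalar, "nessie.db.my_table") before and after the GC run and comparing the two dictionaries would show both symptoms side by side: an unchanged snapshots count and a failing all_data_files read.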
Versions: