Open osscm opened 2 years ago
cc @findepi @RussellSpitzer
This probably would map to $all_manifest_entries.
@osscm what would be the use-case?
@findepi
All_Entries exposes all the entries of the manifest files, including both the valid and the deleted data files in the table manifests.
All_Datafiles exposes the valid data files in the table manifests.
So I find these tables very handy. They can typically be used to understand how the data is laid out in the Iceberg table, including file sizes and locations. They are also handy when a user is trying to understand split planning and parallelism. And if a query does not return data that it should, the user can find the individual data files and even validate them manually.
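To make the difference between the two tables concrete, here is a minimal toy model in plain Python. It is only a sketch of the description above, not Iceberg or Trino code: `ManifestEntry`, `all_entries`, `all_data_files`, and the sample paths are hypothetical names made up for illustration. The status codes (0 = existing, 1 = added, 2 = deleted) follow the manifest entry status values in the Iceberg spec.

```python
from dataclasses import dataclass

# Manifest entry status codes, as defined in the Iceberg table spec.
EXISTING, ADDED, DELETED = 0, 1, 2


@dataclass(frozen=True)
class ManifestEntry:
    """Hypothetical stand-in for one row of a manifest file."""
    status: int
    file_path: str
    file_size_in_bytes: int


def all_entries(manifests):
    """Every entry from every manifest, whether live or deleted."""
    return [entry for manifest in manifests for entry in manifest]


def all_data_files(manifests):
    """Only the live entries (EXISTING or ADDED); no de-duplication."""
    return [e for e in all_entries(manifests) if e.status != DELETED]


# Two made-up manifests: the second one carries a.parquet forward
# and marks b.parquet as deleted.
manifests = [
    [ManifestEntry(ADDED, "s3://bucket/t/a.parquet", 1024),
     ManifestEntry(ADDED, "s3://bucket/t/b.parquet", 2048)],
    [ManifestEntry(EXISTING, "s3://bucket/t/a.parquet", 1024),
     ManifestEntry(DELETED, "s3://bucket/t/b.parquet", 2048)],
]

if __name__ == "__main__":
    for e in all_entries(manifests):
        print(e.status, e.file_path, e.file_size_in_bytes)
```

In this toy data the same file shows up more than once because it is referenced by more than one manifest, which is the kind of detail these tables help with when checking file sizes, locations, or a missing file by hand.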
Spark supports Iceberg's All_* metadata tables. This issue is to add the All_Entries metadata table. We can create separate issues for the other tables.
Reference: https://github.com/apache/iceberg/blob/e146d812f251f1ee5b54edd7dc696034c5ff75f4/core/src/main/java/org/apache/iceberg/MetadataTableUtils.java#L71
Also wondering whether we can reuse the metadata table classes that Iceberg already has, such as AllEntries, instead of doing it in the Trino APIs.