trinodb / trino

Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)
Apache License 2.0

Use Iceberg's metadata table APIs for `ManifestsTable` #12647

Open osscm opened 2 years ago

osscm commented 2 years ago

Use the Iceberg metadata table APIs when implementing Iceberg metadata tables in Trino. Right now, Trino uses lower-level Iceberg APIs directly: `icebergTable.newScan()` for the `$files` metadata table and `snapshot.allManifests()` for the `$manifests` metadata table.

Instead of these lower-level APIs, we can use `org.apache.iceberg.MetadataTableUtils.createMetadataTableInstance` to get the relevant metadata table object and then scan that table. This way Trino benefits from Iceberg's own metadata table implementation, and any change made there is reflected in Trino automatically, which reduces the risk of regressions. Spark also uses a similar metadata table implementation/API to support Iceberg's metadata tables (reference).
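To make the proposal concrete, here is a rough sketch of what scanning the manifests metadata table through `MetadataTableUtils` could look like. It is not the Trino implementation: the class name `ManifestsTableSketch` and the helper `manifestPaths` are hypothetical, and the sketch assumes the Iceberg library is on the classpath and that the manifests table exposes a `path` column.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.iceberg.FileScanTask;
import org.apache.iceberg.MetadataTableType;
import org.apache.iceberg.MetadataTableUtils;
import org.apache.iceberg.StructLike;
import org.apache.iceberg.Table;
import org.apache.iceberg.io.CloseableIterable;
import org.apache.iceberg.types.Types;

// Hypothetical helper, for illustration only.
public final class ManifestsTableSketch
{
    private ManifestsTableSketch() {}

    // Returns the manifest file paths by scanning Iceberg's own manifests metadata table.
    public static List<String> manifestPaths(Table icebergTable)
    {
        // Ask Iceberg for its metadata table instead of calling snapshot.allManifests()
        Table manifests = MetadataTableUtils.createMetadataTableInstance(
                icebergTable, MetadataTableType.MANIFESTS);

        // Locate the "path" column by name (assumed to exist in the metadata table schema)
        List<Types.NestedField> columns = manifests.schema().columns();
        int pathIndex = -1;
        for (int i = 0; i < columns.size(); i++) {
            if (columns.get(i).name().equals("path")) {
                pathIndex = i;
            }
        }
        if (pathIndex < 0) {
            throw new IllegalStateException("No 'path' column in manifests metadata table");
        }

        List<String> paths = new ArrayList<>();
        // Metadata table scans produce data tasks whose rows are already materialized
        try (CloseableIterable<FileScanTask> tasks = manifests.newScan().planFiles()) {
            for (FileScanTask task : tasks) {
                try (CloseableIterable<StructLike> rows = task.asDataTask().rows()) {
                    for (StructLike row : rows) {
                        paths.add(String.valueOf(row.get(pathIndex, Object.class)));
                    }
                }
            }
        }
        catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return paths;
    }
}
```

The same pattern should apply to other metadata tables by swapping the `MetadataTableType` value (for example `MetadataTableType.FILES`), which is the main appeal of the approach: the row layout comes from Iceberg rather than being rebuilt in Trino.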

related issue: https://github.com/trinodb/trino/issues/11172

osscm commented 2 years ago

cc @findinpath @RussellSpitzer