trinodb / trino

Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)
Apache License 2.0

Add iceberg_bucket UDF to iceberg plugin #24200

Open posulliv opened 1 day ago

posulliv commented 1 day ago

Description

Add a UDF that exposes the Iceberg bucket transform via SQL. It can be used to determine which bucket a given value would be placed in, and could be combined with the optimize procedure to optimize only specific buckets.

Spark has a similar UDF - https://iceberg.apache.org/javadoc/1.5.1/org/apache/iceberg/spark/functions/BucketFunction.html
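
For readers unfamiliar with the transform, here is a rough Python sketch of what a bucket computation looks like according to the Iceberg table spec: a 32-bit Murmur3 (x86 variant, seed 0) hash of the value, with the sign bit masked off, taken modulo the bucket count. This is not the code in this PR. The helper names `murmur3_32`, `bucket_long`, and `bucket_string` are hypothetical, and only integer/long and string inputs are covered.

```python
import struct

MASK32 = 0xFFFFFFFF


def _rotl32(x: int, r: int) -> int:
    """Rotate a 32-bit value left by r bits."""
    return ((x << r) | (x >> (32 - r))) & MASK32


def murmur3_32(data: bytes, seed: int = 0) -> int:
    """32-bit Murmur3 (x86 variant), returned as an unsigned integer."""
    c1, c2 = 0xCC9E2D51, 0x1B873593
    h = seed & MASK32
    n = len(data)
    body_len = n - (n % 4)

    # Process the input in 4-byte little-endian blocks.
    for i in range(0, body_len, 4):
        k = struct.unpack_from("<I", data, i)[0]
        k = (k * c1) & MASK32
        k = _rotl32(k, 15)
        k = (k * c2) & MASK32
        h ^= k
        h = _rotl32(h, 13)
        h = (h * 5 + 0xE6546B64) & MASK32

    # Mix in the remaining 1 to 3 bytes, if any.
    tail = data[body_len:]
    if tail:
        k = 0
        for j, b in enumerate(tail):
            k |= b << (8 * j)
        k = (k * c1) & MASK32
        k = _rotl32(k, 15)
        k = (k * c2) & MASK32
        h ^= k

    # Finalization (avalanche) step.
    h ^= n
    h ^= h >> 16
    h = (h * 0x85EBCA6B) & MASK32
    h ^= h >> 13
    h = (h * 0xC2B2AE35) & MASK32
    h ^= h >> 16
    return h


def bucket_long(value: int, num_buckets: int) -> int:
    """Bucket for an int/long value: hash its 8-byte little-endian form."""
    h = murmur3_32(struct.pack("<q", value))
    return (h & 0x7FFFFFFF) % num_buckets


def bucket_string(value: str, num_buckets: int) -> int:
    """Bucket for a string value: hash its UTF-8 bytes."""
    h = murmur3_32(value.encode("utf-8"))
    return (h & 0x7FFFFFFF) % num_buckets


if __name__ == "__main__":
    print(bucket_long(34, 16))
    print(bucket_string("iceberg", 16))
```

Because the bucket depends only on the value and the bucket count, a SQL function exposing the same computation lets a query name the bucket a row belongs to without reading table metadata, which is what makes targeting specific buckets possible.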

Additional context and related issues

Release notes

( ) This is not user-visible or is docs only, and no release notes are required.
( ) Release notes are required. Please propose a release note for me.
(x) Release notes are required, with the following suggested text:

## Section
* Add `iceberg_bucket` UDF to expose the Iceberg bucket transform via SQL.

ebyhr commented 1 day ago

@martint Could you please review the syntax?