trinodb / trino

Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)
Apache License 2.0

stream function/type to process array elements one by one #24148

Open jinyangli34 opened 5 days ago

jinyangli34 commented 5 days ago

Description

Proposing a new syntax to aggregate array elements as a stream. Currently, an array is treated as a single value unless CROSS JOIN UNNEST is used to process its elements individually.

Additional context and related issues

Usage could be:

sum(stream(arr))
max(stream(arr))
approx_distinct(stream(arr))

Or, for a nested array:

max(stream(stream(arr2)))
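For comparison, here is a minimal sketch of how the same aggregations can be written today with CROSS JOIN UNNEST. The `events` and `nested` tables and their columns are hypothetical, made up for illustration. The sketch assumes the aggregation runs across all rows; the proposal does not say whether `stream` would aggregate per row or across rows.

```sql
-- Hypothetical inline table: events(id, arr), where arr is an array of integers.
WITH events (id, arr) AS (
    VALUES
        (1, ARRAY[1, 2, 3]),
        (2, ARRAY[4, 5])
)
SELECT
    sum(x)             AS total,          -- proposed form: sum(stream(arr))
    max(x)             AS largest,        -- proposed form: max(stream(arr))
    approx_distinct(x) AS distinct_count  -- proposed form: approx_distinct(stream(arr))
FROM events
CROSS JOIN UNNEST(arr) AS t (x);          -- expands each array element into its own row

-- Nested arrays need one UNNEST per nesting level today.
-- Hypothetical inline table: nested(id, arr2), where arr2 is an array of arrays.
WITH nested (id, arr2) AS (
    VALUES
        (1, ARRAY[ARRAY[1, 2], ARRAY[3]]),
        (2, ARRAY[ARRAY[7]])
)
SELECT max(x) AS largest                  -- proposed form: max(stream(stream(arr2)))
FROM nested
CROSS JOIN UNNEST(arr2) AS outer_t (inner_arr)
CROSS JOIN UNNEST(inner_arr) AS inner_t (x);
```

The proposed `stream(...)` form would presumably let the aggregate consume the elements directly, without the extra join clause for each nesting level.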

https://github.com/trinodb/trino/issues/22445

Release notes

( ) This is not user-visible or is docs only, and no release notes are required.
( ) Release notes are required. Please propose a release note for me.
( ) Release notes are required, with the following suggested text:

## Section
* Fix some things. ({issue}`22445`)

cploonker commented 1 day ago

Love the nested array support: max(stream(stream(arr2)))