trinodb / trino

Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)
Apache License 2.0

Support GROUP BY predicate pushdown when using json_extract in aggregation #23996

Closed. Opened by tontinton, closed 2 days ago.

tontinton commented 3 weeks ago

The following queries don't hit my connector's applyAggregation breakpoint:

SELECT max(timestamp) FROM catalog.default.table GROUP BY json_extract(some_str, '$.some_number');

SELECT max(json_extract(some_str, '$.some_number')) FROM catalog.default.table GROUP BY timestamp;

hashhar commented 2 days ago

For the second query, the reason is that there is a plan node between the aggregation and the table scan, which you'll notice if you look at the plan.

And the aggregation pushdown rule requires no intermediate plan nodes to be present.
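
One way to check this is to print the plan with EXPLAIN and see what sits between the aggregation and the table scan. A minimal sketch, reusing the placeholder catalog, schema, table, and column names from the queries in this issue (they are not real objects, and some of them may need double-quoting or renaming on an actual cluster):

```sql
-- Print the plan for the second query from this issue.
-- All identifiers are placeholders copied from the issue text.
EXPLAIN
SELECT max(json_extract(some_str, '$.some_number'))
FROM catalog.default.table
GROUP BY timestamp;
```

If the output lists any node between the aggregation and the table scan, that is the situation described here. Which node appears depends on the query and the Trino version, so the exact plan text is not reproduced.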

For the first query, pushdown is not going to happen anyway, because the grouping key is an expression, not a column reference.
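
For contrast, here is a sketch of a query in which both the grouping key and the aggregation argument are plain column references, so neither of the two obstacles mentioned in this comment applies. The names are the same placeholders as in the issue, and whether the aggregation is actually pushed down still depends on what the connector's applyAggregation accepts:

```sql
-- Grouping key (some_str) and aggregation argument (timestamp)
-- are both plain columns; identifiers are placeholders from the issue.
EXPLAIN
SELECT max(timestamp)
FROM catalog.default.table
GROUP BY some_str;
```

Setting a breakpoint in applyAggregation while running a query of this shape is a simple way to confirm that the connector is being consulted at all.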

hashhar commented 2 days ago

I'm closing this as a duplicate.

Duplicates https://github.com/trinodb/trino/issues/4171.