Open PChou opened 3 years ago
BTW, I have recently been trying to implement this feature.
The following simple test runs against an index of more than 40,000 records. It shows the difference in query efficiency with aggregation pushdown enabled and disabled.
trino:default> select hostname, avg("values") from elasticsearch.default.slmday60 group by hostname;
   hostname    |       _col1
---------------+-------------------
 192.168.21.58 | 4992.663530635401
 192.168.21.59 | 4989.727731732876
(2 rows)

Query 20210225_091409_00005_rb8ni, FINISHED, 1 node
Splits: 17 total, 17 done (100.00%)
0.53 [2 rows, 0B] [3 rows/s, 0B/s]
trino:default> set session elasticsearch.aggregation_pushdown_enabled=false;
SET SESSION
trino:default> select hostname, avg("values") from elasticsearch.default.slmday60 group by hostname;
   hostname    |       _col1
---------------+-------------------
 192.168.21.58 | 4992.663530635401
 192.168.21.59 | 4989.727731732876
(2 rows)

Query 20210225_091431_00007_rb8ni, FINISHED, 1 node
Splits: 50 total, 50 done (100.00%)
2.80 [42.1K rows, 1.68MB] [15.1K rows/s, 617KB/s]
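
For illustration only: with pushdown, a GROUP BY plus avg() like the one above could in principle be answered by a single Elasticsearch aggregation request instead of scanning all 42.1K rows into Trino. The sketch below builds such a request body in Python. It is an assumption about what an equivalent request might look like, not the code of this implementation or of the Trino connector; the helper name `build_pushdown_body`, the bucket names, and the `max_groups` limit are hypothetical, and it assumes `hostname` is mapped as a keyword field.

```python
import json


def build_pushdown_body(group_field, avg_field, max_groups=1000):
    """Build a search body roughly equivalent to
    SELECT group_field, avg(avg_field) ... GROUP BY group_field.

    Hypothetical helper for illustration; not part of Trino.
    """
    return {
        # Return no raw documents, only the aggregation buckets.
        "size": 0,
        "aggs": {
            "by_group": {
                # One bucket per distinct value of the grouping field.
                "terms": {"field": group_field, "size": max_groups},
                # Average of the value field inside each bucket.
                "aggs": {"avg_value": {"avg": {"field": avg_field}}},
            }
        },
    }


body = build_pushdown_body("hostname", "values")
print(json.dumps(body, indent=2))
```

A terms aggregation only returns up to `size` buckets, so a real pushdown would probably need something like a composite aggregation to page through all groups.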
Hi team,
I am very excited to see that Trino supports aggregation pushdown, because few SQL engines currently on the market support this feature. However, I found that only a few connectors support it so far. We are building a query platform based on Trino, and our data sources include Elasticsearch, so we hope Trino can support aggregation pushdown for Elasticsearch, which would greatly improve performance. Is this planned?