trinodb / trino

Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)
Apache License 2.0

Support dynamic filtering partition pruning in Cassandra connector #4650

Open skycow00 opened 4 years ago

skycow00 commented 4 years ago

I have a few large tables, each with more than 1 billion rows. The tables are related to each other, and I'd like to join them on their keys, for example:

SELECT * FROM big_one o JOIN big_two t ON o.r_key = t.r_key WHERE t.pk = 'AA'

The query returns just a few rows, but my Presto engine loaded the data of the whole table. Running it as two separate queries was much faster:

SELECT * FROM big_two WHERE pk = 'AA'

Then,

SELECT * FROM big_one WHERE r_key = 'RESULT_OF_ROWS'
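The two-step workaround above can be sketched in a small client-side script. This is only an illustration under stated assumptions: it uses Python's built-in `sqlite3` module as a stand-in for the query engine, the `payload` column and the sample rows are hypothetical, and the helper names (`make_demo_connection`, `two_step_join`) are made up for this sketch. It is not how Trino or the Cassandra connector implements dynamic filtering.

```python
import sqlite3


def make_demo_connection():
    """Create an in-memory database with tiny stand-ins for big_one and big_two."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE big_two (pk TEXT, r_key TEXT);
        CREATE TABLE big_one (r_key TEXT, payload TEXT);  -- payload is hypothetical
        INSERT INTO big_two VALUES ('AA', 'k1'), ('AA', 'k2'), ('BB', 'k3');
        INSERT INTO big_one VALUES ('k1', 'a'), ('k2', 'b'), ('k3', 'c'), ('k4', 'd');
        """
    )
    return conn


def two_step_join(conn, pk_value):
    """Emulate the manual two-step approach: filter first, then probe by key."""
    # Step 1: run the selective query and collect the distinct join keys.
    keys = [
        row[0]
        for row in conn.execute(
            "SELECT DISTINCT r_key FROM big_two WHERE pk = ?", (pk_value,)
        )
    ]
    if not keys:
        return []  # nothing matched, so the second table is never read

    # Step 2: read the other table only for the keys found in step 1.
    placeholders = ", ".join("?" for _ in keys)
    query = (
        "SELECT r_key, payload FROM big_one "
        f"WHERE r_key IN ({placeholders}) ORDER BY r_key, payload"
    )
    return conn.execute(query, keys).fetchall()


if __name__ == "__main__":
    print(two_step_join(make_demo_connection(), "AA"))
    # prints [('k1', 'a'), ('k2', 'b')]
```

The key list from step 1 plays the role of the filter that the engine would otherwise have to derive at runtime. The sketch only makes sense when step 1 returns a small number of keys, as in the case described here.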

I think this join optimization is similar to Dynamic Filtering. Does it not work with the Cassandra connector?

This is my first time asking a question on GitHub, so it may not include enough information. I hope you understand. Thank you.

findepi commented 4 years ago

Dynamic filtering (DF) has some common parts in the engine, and each connector may also opt in to integrate with it more closely, in a connector-specific manner. For Cassandra, this has not been done yet.

skycow00 commented 4 years ago

Thanks for your reply.