risingwavelabs / risingwave


feat(optimizer): support better predicate pushdown for table scan #19525

Open chenzl25 opened 20 hours ago

chenzl25 commented 20 hours ago

Is your feature request related to a problem? Please describe.

If a predicate contains OR, our optimizer currently fails to push it down to storage, which leads to poor performance. We can improve performance by pushing the predicate to storage and keeping the original predicate on top of the scan. PS: please note that we need to ensure the arms of the OR condition do not overlap.

create table t(id int primary key, name varchar);
-- We should push predicate id = 1 or id = 2 to storage.
explain select * from t where id = 1 or (id = 2 and name = 'x');

                                            QUERY PLAN
--------------------------------------------------------------------------------------------------
 BatchExchange { order: [], dist: Single }
 └─BatchFilter { predicate: ((t.id = 1:Int32) OR ((t.id = 2:Int32) AND (t.name = 'x':Varchar))) }
   └─BatchScan { table: t, columns: [id, name] }
(3 rows)

-- We should push predicate id > 10001 or id = 10000 to storage.
explain select * from t where id > 10001 or (id = 10000 and name = 'x');
----------------------------------------------------------------------------------------------------------
 BatchExchange { order: [], dist: Single }
 └─BatchFilter { predicate: ((t.id > 10001:Int32) OR ((t.id = 10000:Int32) AND (t.name = 'x':Varchar))) }
   └─BatchScan { table: t, columns: [id, name] }
(3 rows)
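
To make the idea in the two examples above concrete, here is a minimal sketch in Python (not RisingWave's actual optimizer code) of how the scan ranges could be derived. Everything in it is an assumption for illustration: the predicate is already split into OR arms, each arm is a list of hypothetical (column, operator, value) tuples joined by AND, the primary key is a single integer column, and the helper names arm_to_range, overlaps and pushable_ranges are made up. When two arms overlap, the sketch simply gives up on pushdown; merging the ranges would be another option.

```python
from typing import List, Optional, Tuple

# One condition is (column, operator, value); an arm is a list of conditions joined by AND.
Cond = Tuple[str, str, object]
# An inclusive range over an integer primary key; None means unbounded on that side.
Range = Tuple[Optional[int], Optional[int]]


def arm_to_range(arm: List[Cond], pk: str) -> Optional[Range]:
    """Return the pk range implied by one AND arm, or None if the arm does not bound the pk."""
    lo: Optional[int] = None
    hi: Optional[int] = None
    constrained = False
    for col, op, val in arm:
        if col != pk:
            continue  # non-pk conditions stay in the filter above the scan
        if op == "=":
            new_lo, new_hi = val, val
        elif op == ">":
            new_lo, new_hi = val + 1, None  # integer key: "> v" is ">= v + 1"
        elif op == ">=":
            new_lo, new_hi = val, None
        elif op == "<":
            new_lo, new_hi = None, val - 1
        elif op == "<=":
            new_lo, new_hi = None, val
        else:
            continue  # unsupported operator: ignore it, the filter still applies it
        constrained = True
        if new_lo is not None:
            lo = new_lo if lo is None else max(lo, new_lo)
        if new_hi is not None:
            hi = new_hi if hi is None else min(hi, new_hi)
    return (lo, hi) if constrained else None


def overlaps(a: Range, b: Range) -> bool:
    """Two inclusive ranges overlap unless one ends strictly before the other starts."""
    a_lo, a_hi = a
    b_lo, b_hi = b
    if a_hi is not None and b_lo is not None and a_hi < b_lo:
        return False
    if b_hi is not None and a_lo is not None and b_hi < a_lo:
        return False
    return True


def pushable_ranges(arms: List[List[Cond]], pk: str) -> Optional[List[Range]]:
    """Return one pk range per OR arm, or None if pushdown is not safe in this sketch."""
    ranges: List[Range] = []
    for arm in arms:
        r = arm_to_range(arm, pk)
        if r is None:
            return None  # an arm without a pk bound would need a full scan anyway
        lo, hi = r
        if lo is not None and hi is not None and lo > hi:
            continue  # the arm can never match, so it contributes no range
        ranges.append(r)
    for i in range(len(ranges)):
        for j in range(i + 1, len(ranges)):
            if overlaps(ranges[i], ranges[j]):
                return None  # overlapping arms: this sketch refuses to push down
    return ranges


# The two queries from this issue, written as OR arms.
example_1 = [[("id", "=", 1)], [("id", "=", 2), ("name", "=", "x")]]
example_2 = [[("id", ">", 10001)], [("id", "=", 10000), ("name", "=", "x")]]
print(pushable_ranges(example_1, "id"))
print(pushable_ranges(example_2, "id"))
```

In both cases the original predicate would still be evaluated in the BatchFilter on top of the scan, so the pushed-down ranges only need to be a superset of the matching rows, never an exact match.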
