sundy-li / databend

FuseQuery is a Distributed SQL Query Engine at scale
Other
2 stars 0 forks source link

bug: Join filter pushdown incomplete #89

Open sundy-li opened 2 weeks ago

sundy-li commented 2 weeks ago

Summary

https://github.com/sundy-li/databend/blob/master/tests/sqllogictests/suites/mode/standalone/explain/filter.test#L48-L67

filter a>1 AND a IS NOT should be pushed down into both tables t1 and t.

duckdb is correct in this query, and it's result examples:

D insert into t select number from  range(50) t(number);
D insert into t1 select number from  range(50) t(number);
D explain select t1.a, t2.action as action from (select a,'INSERT' as action from t where a>1) t2 left join t1 on t2.a=t1.a where action >= 'INSERT';

┌─────────────────────────────┐
│┌───────────────────────────┐│
││       Physical Plan       ││
│└───────────────────────────┘│
└─────────────────────────────┘
┌───────────────────────────┐
│         PROJECTION        │
│   ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─   │
│             a             │
│           action          │
└─────────────┬─────────────┘
┌─────────────┴─────────────┐
│         HASH_JOIN         │
│   ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─   │
│            LEFT           │
│           a = a           ├──────────────┐
│   ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─   │              │
│           EC: 10          │              │
└─────────────┬─────────────┘              │
┌─────────────┴─────────────┐┌─────────────┴─────────────┐
│         PROJECTION        ││         SEQ_SCAN          │
│   ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─   ││   ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─   │
│             a             ││             t1            │
│           action          ││   ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─   │
│                           ││             a             │
│                           ││   ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─   │
│                           ││ Filters: a>1 AND a IS NOT │
│                           ││            NULL           │
│                           ││   ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─   │
│                           ││           EC: 10          │
└─────────────┬─────────────┘└───────────────────────────┘
┌─────────────┴─────────────┐
│         SEQ_SCAN          │
│   ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─   │
│             t             │
│   ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─   │
│             a             │
│   ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─   │
│ Filters: a>1 AND a IS NOT │
│            NULL           │
│   ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─   │
│           EC: 10          │
└───────────────────────────┘
dosubot[bot] commented 2 weeks ago

The filter a > 1 AND a IS NOT NULL is not being pushed down into both tables t1 and t in the Databend query execution plan because the current implementation applies the filters after the join operation, not during the table scan phase. This behavior is evident from the provided execution plans, which show that the filters are applied post-join.

For example, in the query:

explain select * from t left join t t1 on t.a = t1.a where t1.a <= 1 or (t.a > 1 and t1.a > 1)

The execution plan shows:

Filter
├── output columns: [t.a (#0), t1.a (#1)]
├── filters: [is_true(t1.a (#1) <= 1 OR t.a (#0) > 1 AND t1.a (#1) > 1)]
├── estimated rows: 6.13
└── HashJoin
    ├── output columns: [t.a (#0), t1.a (#1)]
    ├── join type: INNER
    ├── build keys: [t1.a (#1)]
    ├── probe keys: [t.a (#0)]
    ├── filters: []
    ├── estimated rows: 7.47
    ├── Filter(Build)
    │   ├── output columns: [t1.a (#1)]
    │   ├── filters: [is_true(t1.a (#1) <= 1 OR t1.a (#1) > 1)]
    │   ├── estimated rows: 8.20
    │   └── TableScan
    │       ├── table: default.eliminate_outer_join.t
    │       ├── output columns: [a (#1)]
    │       ├── read rows: 10
    │       ├── read size: < 1 KiB
    │       ├── partitions total: 1
    │       ├── partitions scanned: 1
    │       ├── pruning stats: [segments: <range pruning: 1 to 1>, blocks: <range pruning: 1 to 1>]
    │       ├── push downs: [filters: [is_true(t.a (#1) <= 1 OR t.a (#1) > 1)], limit: NONE]
    │       └── estimated rows: 10.00
    └── Filter(Probe)
        ├── output columns: [t.a (#0)]
        ├── filters: [is_true(t.a (#0) <= 1 OR t.a (#0) > 1)]
        ├── estimated rows: 8.20
        └── TableScan
            ├── table: default.eliminate_outer_join.t
            ├── output columns: [a (#0)]
            ├── read rows: 10
            ├── read size: < 1 KiB
            ├── partitions total: 1
            ├── partitions scanned: 1
            ├── pruning stats: [segments: <range pruning: 1 to 1>, blocks: <range pruning: 1 to 1>]
            ├── push downs: [filters: [is_true(t.a (#0) <= 1 OR t.a (#0) > 1)], limit: NONE]
            └── estimated rows: 10.00

The filters are applied after the join operation, and the push downs during the table scan phase do not include the filter a > 1 AND a IS NOT NULL [1].

The logic for filter pushdown is implemented in several files, including rule_push_down_filter_join.rs, rule_push_down_filter_project_set.rs, dphyp.rs, and physical_table_scan.rs [2][3][4][5].

There are also existing unit tests that cover filter pushdown functionality, such as those in push_down_filter_eval_scalar.test and infer_filter.test [6][7][8].

However, there are known limitations and TODOs related to filter pushdown, such as the need to fix limit pushdown during tests [9][4].

To continue talking to Dosu, mention @dosu.