apache / doris

Apache Doris is an easy-to-use, high performance and unified analytics database.
https://doris.apache.org
Apache License 2.0
12.3k stars 3.21k forks source link

[Enhancement] multi left join do not push down #39740

Open cambyzju opened 3 weeks ago

cambyzju commented 3 weeks ago

Search before asking

Description

Version: doris-2.1.5-rc02-a04583c9a1

Step 1: create three tables with same schema, and insert some example data:

CREATE TABLE `a` (
  `k1` INT NOT NULL,
  `k2` INT NOT NULL,
  `v` INT NULL
) ENGINE=OLAP
DUPLICATE KEY(`k1`, `k2`)
COMMENT 'OLAP'
DISTRIBUTED BY HASH(`k1`) BUCKETS 1
PROPERTIES ("replication_num"="1");

CREATE TABLE `b` (
  `k1` INT NOT NULL,
  `k2` INT NOT NULL,
  `v` INT NULL
) ENGINE=OLAP
DUPLICATE KEY(`k1`, `k2`)
COMMENT 'OLAP'
DISTRIBUTED BY HASH(`k1`) BUCKETS 1
PROPERTIES ("replication_num"="1");

CREATE TABLE `c` (
  `k1` INT NOT NULL,
  `k2` INT NOT NULL,
  `v` INT NULL
) ENGINE=OLAP
DUPLICATE KEY(`k1`, `k2`)
COMMENT 'OLAP'
DISTRIBUTED BY HASH(`k1`) BUCKETS 1
PROPERTIES ("replication_num"="1");

INSERT INTO a VALUES(1,2,3),(4,5,6);
INSERT INTO b VALUES(1,2,3),(4,5,6);
INSERT INTO c VALUES(1,2,3),(4,5,6);

Step 2: Use one left join like bellow, we found y.k1=1 generated and pushed down to table b

> explain shape plan select * from a as x left join b as y on x.k1 = y.k1 where y.k1=1;
+--------------------------------------------------------------------------------------------------------------+
| Explain String(Nereids Planner)                                                                              |
+--------------------------------------------------------------------------------------------------------------+
| PhysicalResultSink                                                                                           |
| --PhysicalDistribute[DistributionSpecGather]                                                                 |
| ----hashJoin[INNER_JOIN] hashCondition=((x.k1 = y.k1)) otherCondition=() build RFs:RF0 k1->[k1];RF1 k1->[k1] |
| ------filter((x.k1 = 1))                                                                                     |
| --------PhysicalOlapScan[a] apply RFs: RF0 RF1                                                               |
| ------PhysicalDistribute[DistributionSpecHash]                                                               |
| --------filter((y.k1 = 1))                                                                                   |
| ----------PhysicalOlapScan[b]                                                                                |
+--------------------------------------------------------------------------------------------------------------+

Problem 1: But if we replace table y with another left join, we found y.k1=1 only pushed to join node, not pushed to table b and c

> explain shape plan select * from a as x left join (select c.k1 as k1 from b left join c on b.k1=c.k1) as y on x.k1 = y.k1 where x.k1=1;
+-------------------------------------------------------------------------------------+
| Explain String(Nereids Planner)                                                     |
+-------------------------------------------------------------------------------------+
| PhysicalResultSink                                                                  |
| --PhysicalDistribute[DistributionSpecGather]                                        |
| ----hashJoin[LEFT_OUTER_JOIN] hashCondition=((x.k1 = y.k1)) otherCondition=()       |
| ------filter((x.k1 = 1))                                                            |
| --------PhysicalOlapScan[a]                                                         |
| ------PhysicalDistribute[DistributionSpecHash]                                      |
| --------PhysicalProject                                                             |
| ----------filter((y.k1 = 1))                                                        |
| ------------hashJoin[LEFT_OUTER_JOIN] hashCondition=((b.k1 = k1)) otherCondition=() |
| --------------PhysicalProject                                                       |
| ----------------PhysicalOlapScan[b]                                                 |
| --------------PhysicalDistribute[DistributionSpecHash]                              |
| ----------------PhysicalProject                                                     |
| ------------------PhysicalOlapScan[c]                                               |
+-------------------------------------------------------------------------------------+

Problem 2: If left table is come from UNION, the predict do not push down to table c

> explain shape plan with x as(select * from a union select * from b) select * from x left join c on x.k1=c.k1 where x.k1=1;
+-------------------------------------------------------------------------------+
| Explain String(Nereids Planner)                                               |
+-------------------------------------------------------------------------------+
| PhysicalResultSink                                                            |
| --PhysicalDistribute[DistributionSpecGather]                                  |
| ----hashJoin[LEFT_OUTER_JOIN] hashCondition=((x.k1 = c.k1)) otherCondition=() |
| ------hashAgg[GLOBAL]                                                         |
| --------PhysicalDistribute[DistributionSpecHash]                              |
| ----------hashAgg[LOCAL]                                                      |
| ------------PhysicalUnion                                                     |
| --------------PhysicalDistribute[DistributionSpecExecutionAny]                |
| ----------------PhysicalProject                                               |
| ------------------filter((a.k1 = 1))                                          |
| --------------------PhysicalOlapScan[a]                                       |
| --------------PhysicalDistribute[DistributionSpecExecutionAny]                |
| ----------------PhysicalProject                                               |
| ------------------filter((b.k1 = 1))                                          |
| --------------------PhysicalOlapScan[b]                                       |
| ------PhysicalDistribute[DistributionSpecReplicated]                          |
| --------PhysicalOlapScan[c]                                                   |
+-------------------------------------------------------------------------------+

Solution

No response

Are you willing to submit PR?

Code of Conduct

cambyzju commented 2 weeks ago

Problem2 will fixed after:https://github.com/apache/doris/pull/39450/