pingcap / tidb

TiDB is an open-source, cloud-native, distributed, MySQL-Compatible database for elastic scale and real-time analytics. Try AI-powered Chat2Query free at : https://www.pingcap.com/tidb-serverless/
https://pingcap.com
Apache License 2.0
36.88k stars 5.8k forks source link

planner: IndexJoin's inner TableScan's range should not be decided by columns in `OtherCondition` when EXPLAIN #14822

Closed francis0407 closed 4 years ago

francis0407 commented 4 years ago

Feature Request

Is your feature request related to a problem? Please describe:

mysql> explain select /*+ TIDB_INLJ(t2)*/ * from t1, t2 where t1.a = t2.a and t1.b = t2.b;
+--------------------------+-------+-----------+--------------------------------------------------------------------------------------------------------------------------------+
| id                       | count | task      | operator info                                                                                                                  |
+--------------------------+-------+-----------+--------------------------------------------------------------------------------------------------------------------------------+
| IndexJoin_10             | 1.25  | root      | inner join, inner:TableReader_9, outer key:jointest.t1.a, inner key:jointest.t2.a, other cond:eq(jointest.t1.b, jointest.t2.b) |
| ├─TableReader_20         | 1.00  | root      | data:Selection_19                                                                                                              |
| │ └─Selection_19         | 1.00  | cop[tikv] | not(isnull(jointest.t1.a)), not(isnull(jointest.t1.b))                                                                         |
| │   └─TableScan_18       | 1.00  | cop[tikv] | table:t1, range:[-inf,+inf], keep order:false, stats:pseudo                                                                    |
| └─TableReader_9          | 1.00  | root      | data:Selection_8                                                                                                               |
|   └─Selection_8          | 1.00  | cop[tikv] | not(isnull(jointest.t2.b))                                                                                                     |
|     └─TableScan_7        | 1.00  | cop[tikv] | table:t2, range: decided by [jointest.t1.a jointest.t1.b], keep order:false, stats:pseudo                                      |
+--------------------------+-------+-----------+--------------------------------------------------------------------------------------------------------------------------------+
7 rows in set (0.01 sec)

Describe the feature you'd like:

In the query above, the inner side TableScan is only affected by t1.a. But the EXPLAIN shows that it is decided by both t1.a and t1.b. So I suggest the explain result should be:

mysql> explain select /*+ TIDB_INLJ(t2)*/ * from t1, t2 where t1.a = t2.a and t1.b = t2.b;
+--------------------------+-------+-----------+--------------------------------------------------------------------------------------------------------------------------------+
| id                       | count | task      | operator info                                                                                                                  |
+--------------------------+-------+-----------+--------------------------------------------------------------------------------------------------------------------------------+
| IndexJoin_10             | 1.25  | root      | inner join, inner:TableReader_9, outer key:jointest.t1.a, inner key:jointest.t2.a, other cond:eq(jointest.t1.b, jointest.t2.b) |
| ├─TableReader_20         | 1.00  | root      | data:Selection_19                                                                                                              |
| │ └─Selection_19         | 1.00  | cop[tikv] | not(isnull(jointest.t1.a)), not(isnull(jointest.t1.b))                                                                         |
| │   └─TableScan_18       | 1.00  | cop[tikv] | table:t1, range:[-inf,+inf], keep order:false, stats:pseudo                                                                    |
| └─TableReader_9          | 1.00  | root      | data:Selection_8                                                                                                               |
|   └─Selection_8          | 1.00  | cop[tikv] | not(isnull(jointest.t2.b))                                                                                                     |
|     └─TableScan_7        | 1.00  | cop[tikv] | table:t2, range: decided by [jointest.t1.a], keep order:false, stats:pseudo                                      |
+--------------------------+-------+-----------+--------------------------------------------------------------------------------------------------------------------------------+
7 rows in set (0.01 sec)

Describe alternatives you've considered:

Teachability, Documentation, Adoption, Migration Strategy:

francis0407 commented 4 years ago

I think this could be part of #14603.

gauss1314 commented 4 years ago

Let me fix this.

francis0407 commented 4 years ago

I just found that I make some mistake of the issue. Actually IndexJoin's inner TableScan's range could be decided by more than one columns. For example:

mysql> explain select /*+TIDB_INLJ(t2)*/ * from t1, t2 where t1.a = t2.a and t1.b = t2.a;
+--------------------------+-------+-----------+-----------------------------------------------------------------------------------------------------------------+
| id                       | count | task      | operator info                                                                                                   |
+--------------------------+-------+-----------+-----------------------------------------------------------------------------------------------------------------+
| IndexJoin_9              | 1.25  | root      | inner join, inner:TableReader_8, outer key:jointest.t1.a, jointest.t1.b, inner key:jointest.t2.a, jointest.t2.a |
| ├─TableReader_18         | 1.00  | root      | data:Selection_17                                                                                               |
| │ └─Selection_17         | 1.00  | cop[tikv] | not(isnull(jointest.t1.a)), not(isnull(jointest.t1.b))                                                          |
| │   └─TableScan_16       | 1.00  | cop[tikv] | table:t1, range:[-inf,+inf], keep order:false, stats:pseudo                                                     |
| └─TableReader_8          | 1.00  | root      | data:TableScan_7                                                                                                |
|   └─TableScan_7          | 1.00  | cop[tikv] | table:t2, range: decided by [jointest.t1.a jointest.t1.b], keep order:false, stats:pseudo                       |
+--------------------------+-------+-----------+-----------------------------------------------------------------------------------------------------------------+
6 rows in set (0.00 sec)

In the query above, both t1.a and t1.b have an equal condition with the primary key t2.a. So that TableScan_7 could be decide by these two columns. But if an EqualCondition will be moved to OtherCondition( in other words, it does not contain the inner side primary key), the outer join keys should not be in range decided by:[...], like the example in the PR discription:

mysql> explain select /*+ TIDB_INLJ(t2)*/ * from t1, t2 where t1.a = t2.a and t1.b = t2.b;
+--------------------------+-------+-----------+--------------------------------------------------------------------------------------------------------------------------------+
| id                       | count | task      | operator info                                                                                                                  |
+--------------------------+-------+-----------+--------------------------------------------------------------------------------------------------------------------------------+
| IndexJoin_10             | 1.25  | root      | inner join, inner:TableReader_9, outer key:jointest.t1.a, inner key:jointest.t2.a, other cond:eq(jointest.t1.b, jointest.t2.b) |
| ├─TableReader_20         | 1.00  | root      | data:Selection_19                                                                                                              |
| │ └─Selection_19         | 1.00  | cop[tikv] | not(isnull(jointest.t1.a)), not(isnull(jointest.t1.b))                                                                         |
| │   └─TableScan_18       | 1.00  | cop[tikv] | table:t1, range:[-inf,+inf], keep order:false, stats:pseudo                                                                    |
| └─TableReader_9          | 1.00  | root      | data:Selection_8                                                                                                               |
|   └─Selection_8          | 1.00  | cop[tikv] | not(isnull(jointest.t2.b))                                                                                                     |
|     └─TableScan_7        | 1.00  | cop[tikv] | table:t2, range: decided by [jointest.t1.a jointest.t1.b](Not right here, should only contain t1.a), keep order:false, stats:pseudo                                      |
+--------------------------+-------+-----------+--------------------------------------------------------------------------------------------------------------------------------+
7 rows in set (0.01 sec)