pingcap / tidb

TiDB - the open-source, cloud-native, distributed SQL database designed for modern applications.
https://pingcap.com
Apache License 2.0
37.21k stars 5.84k forks source link

redundant selection in index join inner child #38294

Open xuyifangreeneyes opened 2 years ago

xuyifangreeneyes commented 2 years ago

Enhancement

mysql> create table t1(a int, b int);
Query OK, 0 rows affected (0.02 sec)

mysql> create table t2(a int, b int, c int, primary key (a, b) clustered);
Query OK, 0 rows affected (0.02 sec)

mysql> explain select /*+ inl_join(t2) */ * from t1 join t2 on t1.a = t2.a where t2.b in (10, 20);
+------------------------------+----------+-----------+---------------+-----------------------------------------------------------------------------------------------------------------+
| id                           | estRows  | task      | access object | operator info                                                                                                   |
+------------------------------+----------+-----------+---------------+-----------------------------------------------------------------------------------------------------------------+
| IndexJoin_14                 | 25.00    | root      |               | inner join, inner:TableReader_10, outer key:test.t1.a, inner key:test.t2.a, equal cond:eq(test.t1.a, test.t2.a) |
| ├─TableReader_21(Build)      | 9990.00  | root      |               | data:Selection_20                                                                                               |
| │ └─Selection_20             | 9990.00  | cop[tikv] |               | not(isnull(test.t1.a))                                                                                          |
| │   └─TableFullScan_19       | 10000.00 | cop[tikv] | table:t1      | keep order:false, stats:pseudo                                                                                  |
| └─TableReader_10(Probe)      | 0.00     | root      |               | data:Selection_9                                                                                                |
|   └─Selection_9              | 0.00     | cop[tikv] |               | in(test.t2.b, 10, 20)                                                                                           |
|     └─TableRangeScan_8       | 1.00     | cop[tikv] | table:t2      | range: decided by [eq(test.t2.a, test.t1.a) in(test.t2.b, 10, 20)], keep order:false, stats:pseudo              |
+------------------------------+----------+-----------+---------------+-----------------------------------------------------------------------------------------------------------------+
7 rows in set (0.00 sec)

Selection_9 is redundant because t2.b in (10, 20) is already used to build ranges.

xuyifangreeneyes commented 2 years ago

Besides, estRows of TableRangeScan_8 is 1, which is decided by the following code: https://github.com/pingcap/tidb/blob/51a6684fb767cfd86a8312e0e6447963b927c791/planner/core/exhaust_physical_plans.go#L1018-L1023 However, TableScan as inner child of IndexJoin can return more than 1 tuple for each outer row when it uses CommonHandle(just like the above example).