pingcap / tidb

TiDB is an open-source, cloud-native, distributed, MySQL-Compatible database for elastic scale and real-time analytics. Try AI-powered Chat2Query free at : https://www.pingcap.com/tidb-serverless/
https://pingcap.com
Apache License 2.0
36.92k stars 5.81k forks source link

planner: sort merge join is missed when two tables are in desc order #13167

Open fzhedu opened 4 years ago

fzhedu commented 4 years ago

Bug Report

Please answer these questions before submitting your issue. Thanks!

  1. What did you do? If possible, provide a recipe for reproducing the error.
    create table t1 (c1 int primary key, c2 int);
    insert into t1 values (1,1),(2,3);
    create table t2 (c1 int primary key, c2 int);
    insert into t2 values (1,1),(2,3),(3,2);
  2. What did you expect to see?
    
    mysql> explain select /*+ TIDB_SMJ(t3,t4) */ * from (select * from t1 order by c1) as t3 join (select * from t2 order by c1)as t4 on t3.c1=t4.c1;
    +------------------------+----------+-----------+------------------------------------------------------------+
    | id                     | count    | task      | operator info                                              |
    +------------------------+----------+-----------+------------------------------------------------------------+
    | MergeJoin_11           | 2.50     | root      | inner join, left key:Column#1, right key:Column#5          |
    | ├─TableReader_19       | 2.00     | root      | data:TableScan_18                                          |
    | │ └─TableScan_18       | 2.00     | cop[tikv] | table:t1, range:[-inf,+inf], keep order:true, stats:pseudo |
    | └─TableReader_25       | 10000.00 | root      | data:TableScan_24                                          |
    |   └─TableScan_24       | 10000.00 | cop[tikv] | table:t2, range:[-inf,+inf], keep order:true, stats:pseudo |
    +------------------------+----------+-----------+------------------------------------------------------------+
    5 rows in set, 1 warning (0.00 sec)

mysql> explain select /+ TIDB_SMJ(t3,t4) / from (select from t1 order by c1 desc) as t3 join (select * from t2 order by c1 desc)as t4 on t3.c1=t4.c1; This result should keek the same as the above.

3. What did you see instead?

mysql> explain select /+ TIDB_SMJ(t3,t4) / from (select from t1 order by c1 desc) as t3 join (select * from t2 order by c1 desc)as t4 on t3.c1=t4.c1; +------------------------+----------+-----------+------------------------------------------------------------------+ | id | count | task | operator info | +------------------------+----------+-----------+------------------------------------------------------------------+ | HashRightJoin_13 | 2.50 | root | inner join, inner:TableReader_19, equal:[eq(Column#1, Column#5)] | | ├─TableReader_19 | 2.00 | root | data:TableScan_18 | | │ └─TableScan_18 | 2.00 | cop[tikv] | table:t1, range:[-inf,+inf], keep order:true, desc, stats:pseudo | | └─TableReader_25 | 10000.00 | root | data:TableScan_24 | | └─TableScan_24 | 10000.00 | cop[tikv] | table:t2, range:[-inf,+inf], keep order:true, desc, stats:pseudo | +------------------------+----------+-----------+------------------------------------------------------------------+ 5 rows in set, 1 warning (0.00 sec)



4. What version of TiDB are you using (`tidb-server -V` or run `select tidb_version();` on TiDB)?
Release Version: v4.0.0-alpha-672-g87acbcddf
ghost commented 4 years ago

The behavior has changed slightly from described - but the issue exists. SMJ will apply to descending order, but it requires a sort operation (versus just reading the rows in reverse order). Here is an example from master:

drop table if exists t1,t2;
create table t1 (c1 int primary key, c2 int);
insert into t1 values (1,1),(2,3);
create table t2 (c1 int primary key, c2 int);
insert into t2 values (1,1),(2,3),(3,2);
explain select * from (select * from t1 order by c1) as t3 join (select * from t2 order by c1)as t4 on t3.c1=t4.c1;
explain select * from (select * from t1 order by c1 desc) as t3 join (select * from t2 order by c1 desc)as t4 on t3.c1=t4.c1;
explain select /*+ TIDB_SMJ(t3,t4) */ * from (select * from t1 order by c1) as t3 join (select * from t2 order by c1)as t4 on t3.c1=t4.c1;
explain select /*+ TIDB_SMJ(t3,t4) */ * from (select * from t1 order by c1 desc) as t3 join (select * from t2 order by c1 desc)as t4 on t3.c1=t4.c1;

..

mysql> explain select * from (select * from t1 order by c1) as t3 join (select * from t2 order by c1)as t4 on t3.c1=t4.c1;
+-----------------------------+----------+-----------+---------------+-------------------------------------------------------+
| id                          | estRows  | task      | access object | operator info                                         |
+-----------------------------+----------+-----------+---------------+-------------------------------------------------------+
| MergeJoin_11                | 12500.00 | root      |               | inner join, left key:test.t1.c1, right key:test.t2.c1 |
| ├─TableReader_25(Build)     | 10000.00 | root      |               | data:TableFullScan_24                                 |
| │ └─TableFullScan_24        | 10000.00 | cop[tikv] | table:t2      | keep order:true, stats:pseudo                         |
| └─TableReader_19(Probe)     | 10000.00 | root      |               | data:TableFullScan_18                                 |
|   └─TableFullScan_18        | 10000.00 | cop[tikv] | table:t1      | keep order:true, stats:pseudo                         |
+-----------------------------+----------+-----------+---------------+-------------------------------------------------------+
5 rows in set (0.00 sec)

mysql> explain select * from (select * from t1 order by c1 desc) as t3 join (select * from t2 order by c1 desc)as t4 on t3.c1=t4.c1;
+-----------------------------+----------+-----------+---------------+------------------------------------------------+
| id                          | estRows  | task      | access object | operator info                                  |
+-----------------------------+----------+-----------+---------------+------------------------------------------------+
| HashJoin_12                 | 12500.00 | root      |               | inner join, equal:[eq(test.t1.c1, test.t2.c1)] |
| ├─TableReader_25(Build)     | 10000.00 | root      |               | data:TableFullScan_24                          |
| │ └─TableFullScan_24        | 10000.00 | cop[tikv] | table:t2      | keep order:true, desc, stats:pseudo            |
| └─TableReader_19(Probe)     | 10000.00 | root      |               | data:TableFullScan_18                          |
|   └─TableFullScan_18        | 10000.00 | cop[tikv] | table:t1      | keep order:true, desc, stats:pseudo            |
+-----------------------------+----------+-----------+---------------+------------------------------------------------+
5 rows in set (0.00 sec)

mysql> explain select /*+ TIDB_SMJ(t3,t4) */ * from (select * from t1 order by c1) as t3 join (select * from t2 order by c1)as t4 on t3.c1=t4.c1;
+-----------------------------+----------+-----------+---------------+-------------------------------------------------------+
| id                          | estRows  | task      | access object | operator info                                         |
+-----------------------------+----------+-----------+---------------+-------------------------------------------------------+
| MergeJoin_10                | 12500.00 | root      |               | inner join, left key:test.t1.c1, right key:test.t2.c1 |
| ├─TableReader_23(Build)     | 10000.00 | root      |               | data:TableFullScan_22                                 |
| │ └─TableFullScan_22        | 10000.00 | cop[tikv] | table:t2      | keep order:true, stats:pseudo                         |
| └─TableReader_17(Probe)     | 10000.00 | root      |               | data:TableFullScan_16                                 |
|   └─TableFullScan_16        | 10000.00 | cop[tikv] | table:t1      | keep order:true, stats:pseudo                         |
+-----------------------------+----------+-----------+---------------+-------------------------------------------------------+
5 rows in set (0.00 sec)

mysql> explain select /*+ TIDB_SMJ(t3,t4) */ * from (select * from t1 order by c1 desc) as t3 join (select * from t2 order by c1 desc)as t4 on t3.c1=t4.c1;
+------------------------------+----------+-----------+---------------+-------------------------------------------------------+
| id                           | estRows  | task      | access object | operator info                                         |
+------------------------------+----------+-----------+---------------+-------------------------------------------------------+
| MergeJoin_11                 | 12500.00 | root      |               | inner join, left key:test.t1.c1, right key:test.t2.c1 |
| ├─Sort_27(Build)             | 10000.00 | root      |               | test.t2.c1                                            |
| │ └─TableReader_26           | 10000.00 | root      |               | data:TableFullScan_25                                 |
| │   └─TableFullScan_25       | 10000.00 | cop[tikv] | table:t2      | keep order:true, desc, stats:pseudo                   |
| └─Sort_19(Probe)             | 10000.00 | root      |               | test.t1.c1                                            |
|   └─TableReader_18           | 10000.00 | root      |               | data:TableFullScan_17                                 |
|     └─TableFullScan_17       | 10000.00 | cop[tikv] | table:t1      | keep order:true, desc, stats:pseudo                   |
+------------------------------+----------+-----------+---------------+-------------------------------------------------------+
7 rows in set (0.00 sec)

mysql> SELECT tidb_version()\G
*************************** 1. row ***************************
tidb_version(): Release Version: v4.0.0-beta.2-771-gca41972fb
Edition: Community
Git Commit Hash: ca41972fbac068c8a5de107d9075f09ac68842ac
Git Branch: master
UTC Build Time: 2020-07-14 02:41:21
GoVersion: go1.13
Race Enabled: false
TiKV Min Version: v3.0.0-60965b006877ca7234adaced7890d7b029ed1306
Check Table Before Drop: false
1 row in set (0.00 sec)
yudongusa commented 3 years ago

@fzhedu is there any update on this issue? It seems being awhile since it was assigned. Please free free to let me know if you'd prefer continue working on this or assign to someone from planner team.

chrysan commented 2 years ago

When generating request properties for children, we don't provide alternatives, this part could be smarter: https://github.com/pingcap/tidb/blob/master/planner/core/exhaust_physical_plans.go#L112