pingcap / tidb

TiDB is an open-source, cloud-native, distributed, MySQL-Compatible database for elastic scale and real-time analytics. Try AI-powered Chat2Query free at : https://www.pingcap.com/tidb-serverless/
https://pingcap.com
Apache License 2.0
36.88k stars 5.8k forks source link

range info for the inner child of index join in the "explain" result is not correct #8058

Closed zz-jason closed 4 years ago

zz-jason commented 5 years ago

Bug Report

drop table if exists t_outer, t_inner;
create table t_outer(a bigint, b bigint);
create table t_inner(a bigint, b bigint, index idx(a, b));

TiDB(localhost:4000) > desc select /*+ TIDB_INLJ(t_outer) */ * from t_outer left join t_inner on t_outer.b = t_inner.b and t_inner.a = 1;
+-----------------------+----------+------+-----------------------------------------------------------------------------------------------+
| id                    | count    | task | operator info                                                                                 |
+-----------------------+----------+------+-----------------------------------------------------------------------------------------------+
| IndexJoin_8           | 10000.00 | root | left outer join, inner:IndexReader_7, outer key:test.t_outer.b, inner key:test.t_inner.b      |
| ├─TableReader_10      | 10000.00 | root | data:TableScan_9                                                                              |
| │ └─TableScan_9       | 10000.00 | cop  | table:t_outer, range:[-inf,+inf], keep order:false, stats:pseudo                              |
| └─IndexReader_7       | 10.00    | root | index:IndexScan_6                                                                             |
|   └─IndexScan_6       | 10.00    | cop  | table:t_inner, index:a, b, range: decided by [test.t_outer.b], keep order:false, stats:pseudo |
+-----------------------+----------+------+-----------------------------------------------------------------------------------------------+
5 rows in set (0.00 sec)

For now, the range info for IndexScan_6 is:

range: decided by [test.t_outer.b]

It's not completely correct. It's better to be displayed like:

range: decided by [1, test.t_outer.b]

tidb version:

TiDB(localhost:4000) > select tidb_version()\G
*************************** 1. row ***************************
tidb_version(): Release Version: None
Git Commit Hash: 82c9f2c8a8825c4bb2da46f7418c24b93bb1936d
Git Branch: master
UTC Build Time: 2018-10-25 11:14:44
GoVersion: go version go1.11 darwin/amd64
Race Enabled: false
TiKV Min Version: 2.1.0-alpha.1-ff3dd160846b7d1aed9079c389fc188f7f5ea13e
Check Table Before Drop: false
1 row in set (0.00 sec)

discovered in this issue: https://github.com/pingcap/tidb/issues/8039

haplone commented 5 years ago

How about let me fix it? @zz-jason @eurekaka

zz-jason commented 5 years ago

How about let me fix it? @zz-jason @eurekaka

Sure! Thanks for your contribution!

winoros commented 5 years ago

@haplone You can refer to buildRangeDecidedByInformation in https://github.com/pingcap/tidb/pull/8471. This part of that pr solved this problem.

winoros commented 5 years ago

Also note that, You need to use TIDB_INLJ(t_inner) instead of TIDB_INLJ(t_outer) in master.

jarvys commented 5 years ago

请教个问题,我对这个 TIDB_INLJ 有些疑问。根据文档介绍,TIDB_INLJ 的作用是提示优化器内表的候选表,但是通过上面的例子还有 #8039 中的例子,貌似它的代码逻辑写反了。

zz-jason commented 5 years ago

请教个问题,我对这个 TIDB_INLJ 有些疑问。根据文档介绍,TIDB_INLJ 的作用是提示优化器内表的候选表,但是通过上面的例子还有 #8039 中的例子,貌似它的代码逻辑写反了。

@jarvys What's the version of TiDB that you are using? In old versions, the hint specifies the outer table, not the inner table.

jarvys commented 5 years ago

@zz-jason I got it. In new version TIDB_INLJ hint behavior has been modified. I am using the 2.x version now, but reading the latest documentation, so I am confused.

zz-jason commented 4 years ago

It has been fixed in the latest master branch:

TiDB(root@127.0.0.1:test) > desc select /*+ TIDB_INLJ(t_inner) */ * from t_outer left join t_inner on t_outer.b = t_inner.b and t_inner.a = 1;
+-----------------------------+----------+-----------+-----------------------------------------------------------------------------------------------------------------------------------------+
| id                          | estRows  | task      | operator info                                                                                                                           |
+-----------------------------+----------+-----------+-----------------------------------------------------------------------------------------------------------------------------------------+
| IndexJoin_9                 | 10000.00 | root      | left outer join, inner:IndexReader_8, outer key:test.t_outer.b, inner key:test.t_inner.b                                                |
| ├─TableReader_18(Build)     | 10000.00 | root      | data:TableFullScan_17                                                                                                                   |
| │ └─TableFullScan_17        | 10000.00 | cop[tikv] | table:t_outer, keep order:false, stats:pseudo                                                                                           |
| └─IndexReader_8(Probe)      | 0.01     | root      | index:Selection_7                                                                                                                       |
|   └─Selection_7             | 0.01     | cop[tikv] | not(isnull(test.t_inner.b))                                                                                                             |
|     └─IndexRangeScan_6      | 0.01     | cop[tikv] | table:t_inner, index:a, b, range: decided by [eq(test.t_inner.b, test.t_outer.b) eq(test.t_inner.a, 1)], keep order:false, stats:pseudo |
+-----------------------------+----------+-----------+-----------------------------------------------------------------------------------------------------------------------------------------+
6 rows in set (0.00 sec)