pingcap / tidb

TiDB - the open-source, cloud-native, distributed SQL database designed for modern applications.
https://pingcap.com
Apache License 2.0
37.41k stars 5.85k forks source link

remove the impact of ignored indexes on cardinality estimation #21661

Open qw4990 opened 3 years ago

qw4990 commented 3 years ago

Development Task

create table t1 (a int, b int);
create table t2 (a int, b int, key(a, b));
insert into t1 values (1, 10);
insert into t1 values (2, 9);
insert into t1 values (3, 8);
insert into t1 values (4, 7);
insert into t1 values (5, 6);
insert into t2 values (1, 10);
insert into t2 values (2, 9);
insert into t2 values (3, 8);
insert into t2 values (4, 7);
insert into t2 values (5, 6);
analyze table t1;
analyze table t2;

mysql> explain select * from t1 where a=1 and b=10;
+-------------------------+---------+-----------+---------------+---------------------------------------+
| id                      | estRows | task      | access object | operator info                         |
+-------------------------+---------+-----------+---------------+---------------------------------------+
| TableReader_7           | 0.20    | root      |               | data:Selection_6                      |
| └─Selection_6           | 0.20    | cop[tikv] |               | eq(zipfx.t1.a, 1), eq(zipfx.t1.b, 10) |
|   └─TableFullScan_5     | 5.00    | cop[tikv] | table:t1      | keep order:false                      |
+-------------------------+---------+-----------+---------------+---------------------------------------+

mysql> explain select * from t2 ignore index(a) where a=1 and b=10;
+-------------------------+---------+-----------+---------------+---------------------------------------+
| id                      | estRows | task      | access object | operator info                         |
+-------------------------+---------+-----------+---------------+---------------------------------------+
| TableReader_7           | 1.00    | root      |               | data:Selection_6                      |
| └─Selection_6           | 1.00    | cop[tikv] |               | eq(zipfx.t2.a, 1), eq(zipfx.t2.b, 10) |
|   └─TableFullScan_5     | 5.00    | cop[tikv] | table:t2      | keep order:false                      |
+-------------------------+---------+-----------+---------------+---------------------------------------+

As shown in the above case, although the index (a, b) is ignored in the second SQL, it also affects the estimation rows, since it has a different estRows than the first SQL. How about removing the effect of ignored indexes on estimation? It seems more rational and has no side effects.

In addition, it may be useful when testing the optimizer, for example, we can use the ignore hint to test indexes' impact on estimation.

qw4990 commented 3 years ago

What're your options? @zz-jason @winoros @eurekaka @time-and-fate

morgo commented 3 years ago

Interesting! I can't reproduce this locally (with or without TiKV). I agree it makes sense that ignore should prevent usage for row estimation, but it's not a strong preference. However, if the index is invisible then it should never be used.