pingcap / tidb

TiDB is an open-source, cloud-native, distributed, MySQL-Compatible database for elastic scale and real-time analytics. Try AI-powered Chat2Query free at : https://www.pingcap.com/tidb-serverless/
https://pingcap.com
Apache License 2.0
36.97k stars 5.82k forks source link

range columns partition with like can't be pruning #32026

Open aytrack opened 2 years ago

aytrack commented 2 years ago

Bug Report

Please answer these questions before submitting your issue. Thanks!

1. Minimal reproduce step (Required)

create table t2 (a varchar(20), b blob, index (a(5))) partition by range columns (a) (
    partition p0 values less than ('aaaaa'),
    partition p1 values less than ('bbbbb'),
    partition p2 values less than ('ccccc')
);

insert into t2 values ('aaaaab', '0'), ('bbbbba', 1);
select * from t2 where a like 'aaaaa%';
explain select * from t2 where a like 'aaaaa%';
select * from t2 where a like 'bbbbb%';
explain select * from t2 where a like 'bbbbb%';

2. What did you expect to see? (Required)

explain select * from t2 where a like 'aaaaa%'; using partition p1; explain select * from t2 where a like 'bbbbb%'; using partition p2;

3. What did you see instead (Required)

MySQL root@127.0.0.1:test> explain select * from t2 where a like 'aaaaa%';
                       ->  explain select * from t2 where a like 'bbbbb%';
+---------------------+---------+-----------+---------------+--------------------------------+
| id                  | estRows | task      | access object | operator info                  |
+---------------------+---------+-----------+---------------+--------------------------------+
| TableReader_7       | 0.05    | root      | partition:all | data:Selection_6               |
| └─Selection_6       | 0.05    | cop[tikv] |               | like(test.t2.a, "aaaaa%", 92)  |
|   └─TableFullScan_5 | 2.00    | cop[tikv] | table:t2      | keep order:false, stats:pseudo |
+---------------------+---------+-----------+---------------+--------------------------------+

3 rows in set
Time: 0.006s

+---------------------+---------+-----------+---------------+--------------------------------+
| id                  | estRows | task      | access object | operator info                  |
+---------------------+---------+-----------+---------------+--------------------------------+
| TableReader_7       | 0.05    | root      | partition:all | data:Selection_6               |
| └─Selection_6       | 0.05    | cop[tikv] |               | like(test.t2.a, "bbbbb%", 92)  |
|   └─TableFullScan_5 | 2.00    | cop[tikv] | table:t2      | keep order:false, stats:pseudo |
+---------------------+---------+-----------+---------------+--------------------------------+

4. What is your TiDB version? (Required)

MySQL root@127.0.0.1:test> select tidb_version()\G
***************************[ 1. row ]***************************
tidb_version() | Release Version: v5.5.0-alpha-234-g19a020c59-dirty
Edition: Community
Git Commit Hash: 19a020c59956d1f430c41ca9d168a7417dc33490
Git Branch: master
UTC Build Time: 2022-01-27 03:45:36
GoVersion: go1.17.2
Race Enabled: false
TiKV Min Version: v3.0.0-60965b006877ca7234adaced7890d7b029ed1306
Check Table Before Drop: false
mjonss commented 2 years ago

MySQL supports this, so we should plan for adding support for this too.

And internally it seems like it should be possible to add support, since like 'aaaa%' is already rewritten as range ('aaaaa','aaaab'):

tidb> explain select * from t2 where a like 'aaaaa%';
+-------------------------------+---------+-----------+----------------------+---------------------------------------------------------+
| id                            | estRows | task      | access object        | operator info                                           |
+-------------------------------+---------+-----------+----------------------+---------------------------------------------------------+
| IndexLookUp_11                | 250.00  | root      | partition:all        |                                                         |
| ├─IndexRangeScan_8(Build)     | 250.00  | cop[tikv] | table:t2, index:a(a) | range:["aaaaa","aaaab"), keep order:false, stats:pseudo |
| └─Selection_10(Probe)         | 250.00  | cop[tikv] |                      | like(test.t2.a, "aaaaa%", 92)                           |
|   └─TableRowIDScan_9          | 250.00  | cop[tikv] | table:t2             | keep order:false, stats:pseudo                          |
+-------------------------------+---------+-----------+----------------------+---------------------------------------------------------+
tidb> explain select * from t2 where a between "aaaaa" and "aaaab";
+-------------------------------+---------+-----------+----------------------+---------------------------------------------------------+
| id                            | estRows | task      | access object        | operator info                                           |
+-------------------------------+---------+-----------+----------------------+---------------------------------------------------------+
| IndexLookUp_11                | 250.00  | root      | partition:p1         |                                                         |
| ├─IndexRangeScan_8(Build)     | 250.00  | cop[tikv] | table:t2, index:a(a) | range:["aaaaa","aaaab"], keep order:false, stats:pseudo |
| └─Selection_10(Probe)         | 250.00  | cop[tikv] |                      | ge(test.t2.a, "aaaaa"), le(test.t2.a, "aaaab")          |
|   └─TableRowIDScan_9          | 250.00  | cop[tikv] | table:t2             | keep order:false, stats:pseudo                          |
+-------------------------------+---------+-----------+----------------------+---------------------------------------------------------+
4 rows in set (0,00 sec)

But we also need to check collation, so the like operator does works with case insensitive matches as well.