Open kolbe opened 5 years ago
@kolbe Could you please attach the stats of the table trips
?
This appears to be fixed with the cascades planner. I have fast analyze + cascades planner enabled, and imported the sample data set:
mysql> explain analyze select avg(duration) from trips where start_date>='2016-01-01' and start_date<'2017-01-01';
+--------------------------+----------+-----------+------------------------------------------------------------------------------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+------------+------+
| id | count | task | operator info | execution info | memory | disk |
+--------------------------+----------+-----------+------------------------------------------------------------------------------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+------------+------+
| HashAgg_15 | 1.00 | root | funcs:avg(Column#12, Column#13)->Column#11 | time:1.841535657s, loops:2, rows:1, PartialConcurrency:4, FinalConcurrency:4 | 14.0625 KB | N/A |
| └─TableReader_16 | 1.00 | root | data:HashAgg_17 | time:1.841406081s, loops:2, rows:29, rpc num: 29, rpc max:1.264943385s, min:354.322958ms, avg:759.777797ms, p80:938.843255ms, p95:1.089895684s, proc keys max:925094, p95:871529 | 474 Bytes | N/A |
| └─HashAgg_17 | 1.00 | cop[tikv] | funcs:avg(bikeshare.trips.duration)->Column#12 | proc max:1s, min:256ms, p80:724ms, p95:872ms, rows:29, iters:18685, tasks:29 | N/A | N/A |
| └─Selection_12 | 8000.00 | cop[tikv] | ge(bikeshare.trips.start_date, 2016-01-01 00:00:00.000000), lt(bikeshare.trips.start_date, 2017-01-01 00:00:00.000000) | proc max:1s, min:236ms, p80:712ms, p95:804ms, rows:3333994, iters:18685, tasks:29 | N/A | N/A |
| └─TableScan_13 | 10000.00 | cop[tikv] | table:trips, range:[-inf,+inf], keep order:false | proc max:908ms, min:208ms, p80:648ms, p95:708ms, rows:19117643, iters:18685, tasks:29 | N/A | N/A |
+--------------------------+----------+-----------+------------------------------------------------------------------------------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+------------+------+
5 rows in set (1.85 sec)
mysql> ALTER TABLE trips ADD INDEX (start_date);
Query OK, 0 rows affected (2 min 46.68 sec)
mysql> explain analyze select avg(duration) from trips where start_date>='2016-01-01' and start_date<'2017-01-01';
+--------------------------+----------+-----------+------------------------------------------------------------------------------------------------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------------+------+
| id | count | task | operator info | execution info | memory | disk |
+--------------------------+----------+-----------+------------------------------------------------------------------------------------------------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------------+------+
| HashAgg_15 | 1.00 | root | funcs:avg(Column#12, Column#13)->Column#11 | time:2.543039021s, loops:2, rows:1, PartialConcurrency:4, FinalConcurrency:4 | 14.8203125 KB | N/A |
| └─TableReader_16 | 1.00 | root | data:HashAgg_17 | time:2.542872937s, loops:2, rows:35, rpc num: 35, rpc max:1.83105932s, min:196.368754ms, avg:938.167875ms, p80:1.146154839s, p95:1.653919286s, proc keys max:1477252, p95:1469882 | 467 Bytes | N/A |
| └─HashAgg_17 | 1.00 | cop[tikv] | funcs:avg(bikeshare.trips.duration)->Column#12 | proc max:1.2s, min:196ms, p80:952ms, p95:1.088s, rows:35, iters:18687, tasks:35 | N/A | N/A |
| └─Selection_12 | 8000.00 | cop[tikv] | ge(bikeshare.trips.start_date, 2016-01-01 00:00:00.000000), lt(bikeshare.trips.start_date, 2017-01-01 00:00:00.000000) | proc max:1.176s, min:196ms, p80:952ms, p95:1.072s, rows:3333994, iters:18687, tasks:35 | N/A | N/A |
| └─TableScan_13 | 10000.00 | cop[tikv] | table:trips, range:[-inf,+inf], keep order:false | proc max:1.06s, min:188ms, p80:884ms, p95:1.012s, rows:19117643, iters:18687, tasks:35 | N/A | N/A |
+--------------------------+----------+-----------+------------------------------------------------------------------------------------------------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------------+------+
5 rows in set (2.54 sec)
mysql> set tidb_enable_cascades_planner = 0;
Query OK, 0 rows affected (0.00 sec)
mysql> explain analyze select avg(duration) from trips where start_date>='2016-01-01' and start_date<'2017-01-01';
+------------------------+------------+-----------+--------------------------------------------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------------------+------+
| id | count | task | operator info | execution info | memory | disk |
+------------------------+------------+-----------+--------------------------------------------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------------------+------+
| HashAgg_17 | 1.00 | root | funcs:avg(Column#15, Column#16)->Column#11 | time:5.327479477s, loops:2, rows:1, PartialConcurrency:4, FinalConcurrency:4 | 23.6171875 KB | N/A |
| └─IndexLookUp_18 | 1.00 | root | | time:5.327223884s, loops:2, rows:187, rpc num: 4, rpc max:582.530494ms, min:212.205879ms, avg:463.661851ms, p80:582.530494ms, p95:582.530494ms, proc keys max:1166376, p95:1166376 | 30.2153263092041 MB | N/A |
| ├─IndexScan_15 | 3337375.73 | cop[tikv] | table:trips, index:start_date, range:[2016-01-01 00:00:00,2017-01-01 00:00:00), keep order:false | proc max:556ms, min:196ms, p80:556ms, p95:556ms, rows:3333994, iters:3275, tasks:4 | N/A | N/A |
| └─HashAgg_7 | 1.00 | cop[tikv] | funcs:avg(bikeshare.trips.duration)->Column#15 | proc max:220ms, min:4ms, p80:152ms, p95:172ms, rows:187, iters:3422, tasks:187 | N/A | N/A |
| └─TableScan_16 | 3337375.73 | cop[tikv] | table:trips, keep order:false | proc max:216ms, min:4ms, p80:148ms, p95:172ms, rows:3333994, iters:3422, tasks:187 | N/A | N/A |
+------------------------+------------+-----------+--------------------------------------------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------------------+------+
5 rows in set (5.35 sec)
Confirming that this can still be reproduced in master, but not with the cascades planner. This is using the bikeshare example database, but with the additional index on start_date. I have also disabled co-processor cache to reduce skew.
mysql> ALTER TABLE trips ADD INDEX (start_date);
Query OK, 0 rows affected (5 min 8.05 sec)
...
mysql> analyze table trips;
Query OK, 0 rows affected (15.48 sec)
mysql> explain analyze select avg(duration) from trips where start_date>='2016-01-01' and start_date<'2017-01-01';
+----------------------------------+------------+---------+-----------+-------------------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------+-----------------------+------+
| id | estRows | actRows | task | access object | execution info | operator info | memory | disk |
+----------------------------------+------------+---------+-----------+-------------------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------+-----------------------+------+
| HashAgg_17 | 1.00 | 1 | root | | time:3.483734572s, loops:2, PartialConcurrency:5, FinalConcurrency:5 | funcs:avg(Column#15, Column#16)->Column#11 | 27.1328125 KB | N/A |
| └─IndexLookUp_18 | 1.00 | 207 | root | | time:3.483619053s, loops:2, rpc num: 5, rpc max:339.88513ms, min:192.009673ms, avg:246.555191ms, p80:339.88513ms, p95:339.88513ms, proc keys max:960000, p95:960000 | | 27.571990966796875 MB | N/A |
| ├─IndexRangeScan_15(Build) | 3286023.42 | 3333994 | cop[tikv] | table:trips, index:start_date(start_date) | proc max:328ms, min:180ms, p80:328ms, p95:328ms, iters:3278, tasks:5 | range:[2016-01-01 00:00:00,2017-01-01 00:00:00), keep order:false | N/A | N/A |
| └─HashAgg_7(Probe) | 1.00 | 207 | cop[tikv] | | proc max:144ms, min:0s, p80:108ms, p95:116ms, iters:3418, tasks:207 | funcs:count(bikeshare.trips.duration)->Column#15, funcs:sum(bikeshare.trips.duration)->Column#16 | N/A | N/A |
| └─TableRowIDScan_16 | 3286023.42 | 3333994 | cop[tikv] | table:trips | proc max:140ms, min:0s, p80:104ms, p95:116ms, iters:3418, tasks:207 | keep order:false | N/A | N/A |
+----------------------------------+------------+---------+-----------+-------------------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------+-----------------------+------+
5 rows in set (3.49 sec)
mysql> explain analyze select avg(duration) from trips ignore index (start_date) where start_date>='2016-01-01' and start_date<'2017-01-01';
+------------------------------+-------------+----------+-----------+---------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------+-------------+------+
| id | estRows | actRows | task | access object | execution info | operator info | memory | disk |
+------------------------------+-------------+----------+-----------+---------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------+-------------+------+
| StreamAgg_20 | 1.00 | 1 | root | | time:1.661849031s, loops:2 | funcs:avg(Column#14, Column#15)->Column#11 | 2.515625 KB | N/A |
| └─TableReader_21 | 1.00 | 38 | root | | time:1.661828471s, loops:2, rpc num: 38, rpc max:991.591064ms, min:165.664441ms, avg:552.493074ms, p80:721.669958ms, p95:917.433199ms, proc keys max:1398724, p95:1396252 | data:StreamAgg_9 | 401 Bytes | N/A |
| └─StreamAgg_9 | 1.00 | 38 | cop[tikv] | | proc max:780ms, min:148ms, p80:580ms, p95:732ms, iters:18686, tasks:38 | funcs:count(bikeshare.trips.duration)->Column#14, funcs:sum(bikeshare.trips.duration)->Column#15 | N/A | N/A |
| └─Selection_19 | 3286023.42 | 3333994 | cop[tikv] | | proc max:760ms, min:148ms, p80:568ms, p95:692ms, iters:18686, tasks:38 | ge(bikeshare.trips.start_date, 2016-01-01 00:00:00.000000), lt(bikeshare.trips.start_date, 2017-01-01 00:00:00.000000) | N/A | N/A |
| └─TableFullScan_18 | 19117643.00 | 19117643 | cop[tikv] | table:trips | proc max:740ms, min:140ms, p80:544ms, p95:668ms, iters:18686, tasks:38 | keep order:false | N/A | N/A |
+------------------------------+-------------+----------+-----------+---------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------+-------------+------+
5 rows in set (1.66 sec)
mysql> set tidb_enable_cascades_planner=1;
Query OK, 0 rows affected (0.01 sec)
mysql> explain analyze select avg(duration) from trips ignore index (start_date) where start_date>='2016-01-01' and start_date<'2017-01-01';
+------------------------------+-------------+----------+-----------+---------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------+---------------+------+
| id | estRows | actRows | task | access object | execution info | operator info | memory | disk |
+------------------------------+-------------+----------+-----------+---------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------+---------------+------+
| HashAgg_17 | 1.00 | 1 | root | | time:1.611738936s, loops:2, PartialConcurrency:5, FinalConcurrency:5 | funcs:avg(Column#12, Column#13)->Column#11 | 18.3359375 KB | N/A |
| └─TableReader_18 | 1.00 | 38 | root | | time:1.611550167s, loops:2, rpc num: 38, rpc max:1.080270293s, min:147.026102ms, avg:536.039127ms, p80:697.547841ms, p95:988.63395ms, proc keys max:1398724, p95:1396252 | data:HashAgg_19 | 401 Bytes | N/A |
| └─HashAgg_19 | 1.00 | 38 | cop[tikv] | | proc max:760ms, min:140ms, p80:564ms, p95:732ms, iters:18686, tasks:38 | funcs:count(bikeshare.trips.duration)->Column#12, funcs:sum(bikeshare.trips.duration)->Column#13 | N/A | N/A |
| └─Selection_15 | 15294114.40 | 3333994 | cop[tikv] | | proc max:740ms, min:140ms, p80:560ms, p95:704ms, iters:18686, tasks:38 | ge(bikeshare.trips.start_date, 2016-01-01 00:00:00.000000), lt(bikeshare.trips.start_date, 2017-01-01 00:00:00.000000) | N/A | N/A |
| └─TableFullScan_16 | 19117643.00 | 19117643 | cop[tikv] | table:trips | proc max:704ms, min:120ms, p80:540ms, p95:640ms, iters:18686, tasks:38 | keep order:false | N/A | N/A |
+------------------------------+-------------+----------+-----------+---------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------+---------------+------+
5 rows in set (1.61 sec)
mysql> explain analyze select avg(duration) from trips where start_date>='2016-01-01' and start_date<'2017-01-01';
+------------------------------+-------------+----------+-----------+---------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------+---------------+------+
| id | estRows | actRows | task | access object | execution info | operator info | memory | disk |
+------------------------------+-------------+----------+-----------+---------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------+---------------+------+
| HashAgg_17 | 1.00 | 1 | root | | time:1.595075064s, loops:2, PartialConcurrency:5, FinalConcurrency:5 | funcs:avg(Column#12, Column#13)->Column#11 | 18.3359375 KB | N/A |
| └─TableReader_18 | 1.00 | 38 | root | | time:1.594943344s, loops:2, rpc num: 38, rpc max:1.288971104s, min:176.540044ms, avg:526.175674ms, p80:730.86379ms, p95:897.410466ms, proc keys max:1398724, p95:1396252 | data:HashAgg_19 | 401 Bytes | N/A |
| └─HashAgg_19 | 1.00 | 38 | cop[tikv] | | proc max:744ms, min:136ms, p80:576ms, p95:704ms, iters:18686, tasks:38 | funcs:count(bikeshare.trips.duration)->Column#12, funcs:sum(bikeshare.trips.duration)->Column#13 | N/A | N/A |
| └─Selection_15 | 15294114.40 | 3333994 | cop[tikv] | | proc max:732ms, min:136ms, p80:572ms, p95:688ms, iters:18686, tasks:38 | ge(bikeshare.trips.start_date, 2016-01-01 00:00:00.000000), lt(bikeshare.trips.start_date, 2017-01-01 00:00:00.000000) | N/A | N/A |
| └─TableFullScan_16 | 19117643.00 | 19117643 | cop[tikv] | table:trips | proc max:704ms, min:124ms, p80:548ms, p95:628ms, iters:18686, tasks:38 | keep order:false | N/A | N/A |
+------------------------------+-------------+----------+-----------+---------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------+---------------+------+
5 rows in set (1.60 sec)
mysql> select tidb_version()\G
*************************** 1. row ***************************
tidb_version(): Release Version: v4.0.0-beta.2-807-g824df7675
Edition: Community
Git Commit Hash: 824df767559b8544af0cb71509135d322de1cd47
Git Branch: master
UTC Build Time: 2020-07-20 04:47:45
GoVersion: go1.13
Race Enabled: false
TiKV Min Version: v3.0.0-60965b006877ca7234adaced7890d7b029ed1306
Check Table Before Drop: false
1 row in set (0.00 sec)
Description
Bug Report
Please answer these questions before submitting your issue. Thanks!
TiDB should choose the most efficient query execution plan.
Using the index is about twice as expensive (time) as not using the index.
tidb-server -V
or runselect tidb_version();
on TiDB)?SIG slack channel
#sig-planner
Score
300
Mentor