pingcap / tidb

TiDB - the open-source, cloud-native, distributed SQL database designed for modern applications.
https://pingcap.com
Apache License 2.0
37.21k stars 5.84k forks source link

`select` very slow on an empty table from `delete from xx` #52905

Open D3Hunter opened 6 months ago

D3Hunter commented 6 months ago

Bug Report

Please answer these questions before submitting your issue. Thanks!

1. Minimal reproduce step (Required)

.... insert data

mysql> delete from t; Query OK, 65536 rows affected (0.38 sec)

mysql> select count(1) from t; +----------+ | count(1) | +----------+ | 0 | +----------+ 1 row in set (0.06 sec)

mysql> show table t regions; +-----------+-----------+--------------------+-----------+-----------------+-------+------------+---------------+------------+----------------------+------------------+------------------------+------------------+ | REGION_ID | START_KEY | END_KEY | LEADER_ID | LEADER_STORE_ID | PEERS | SCATTERING | WRITTEN_BYTES | READ_BYTES | APPROXIMATE_SIZE(MB) | APPROXIMATE_KEYS | SCHEDULING_CONSTRAINTS | SCHEDULING_STATE | +-----------+-----------+--------------------+-----------+-----------------+-------+------------+---------------+------------+----------------------+------------------+------------------------+------------------+ | 14 | t107 | t281474976710649 | 15 | 1 | 15 | 0 | 23118061 | 16450774 | 24 | 133569 | | | +-----------+-----------+--------------------+-----------+-----------------+-------+------------+---------------+------------+----------------------+------------------+------------------------+------------------+

mysql> select now(); +---------------------+ | now() | +---------------------+ | 2024-04-26 11:34:15 | +---------------------+ 1 row in set (0.00 sec)

- at first, i thought it related to MVCC, so wait GC and query again, still very slow
```sql
...... wait gc finished on table t

mysql> select count(1) from t;
+----------+
| count(1) |
+----------+
|        0 |
+----------+
1 row in set (0.07 sec)

mysql> select now();
+---------------------+
| now()               |
+---------------------+
| 2024-04-26 14:18:50 |
+---------------------+
1 row in set (0.00 sec)

mysql> select count(1) from tt; +----------+ | count(1) | +----------+ | 0 | +----------+ 1 row in set (0.01 sec)



<!-- a step by step guide for reproducing the bug. -->

### 2. What did you expect to see? (Required)
very fast to query
### 3. What did you see instead (Required)
slow compared to a new created empty table
### 4. What is your TiDB version? (Required)
master
<!-- Paste the output of SELECT tidb_version() -->
D3Hunter commented 6 months ago

the cause is after GC, the kv is not compacted, so it's still scanned, but skipped, see delete_skipped_count

for tidb_ddl_job, we will insert and delete after process, if we have a lot of ddl, such as create 1M tables, query from it will be slow even it's almost empty.

mysql> trace select count(1) from t;
+----------------------------------------------------------------------+-----------------+-------------+
| operation                                                            | startTS         | duration    |
+----------------------------------------------------------------------+-----------------+-------------+
| trace                                                                | 14:39:08.630002 | 61.555167ms |
|   ├─session.ExecuteStmt                                              | 14:39:08.630402 | 2.309916ms  |
|   │ ├─executor.Compile                                               | 14:39:08.630419 | 284.125µs   |
|   │ └─session.runStmt                                                | 14:39:08.631917 | 571.583µs   |
|   │   └─TableReaderExecutor.Open                                     | 14:39:08.632030 | 338.958µs   |
|   │     └─distsql.Select                                             | 14:39:08.632054 | 34.375µs    |
|   │       └─regionRequest.SendReqCtx                                 | 14:39:08.632165 | 59.293666ms |
|   │         └─rpcClient.SendRequest, region ID: 126, type: Cop       | 14:39:08.632446 | 59.000209ms |
|   │           └─tikv.RPC                                             | 14:39:08.632448 | 58.287417ms |
|   │             ├─tikv.Wait                                          | 14:39:08.632448 | 96.75µs     |
|   │             │ └─tikv.GetSnapshot                                 | 14:39:08.632448 | 50.958µs    |
|   │             └─tikv.Process                                       | 14:39:08.632544 | 58.037041ms |
|   ├─*aggregate.HashAggExec.Next                                      | 14:39:08.632722 | 58.802167ms |
|   │ └─*executor.TableReaderExecutor.Next                             | 14:39:08.632795 | 58.700167ms |
|   └─*aggregate.HashAggExec.Next                                      | 14:39:08.691530 | 875ns       |
+----------------------------------------------------------------------+-----------------+-------------+

mysql> explain analyze select count(1) from t;
+----------------------------+----------+---------+-----------+---------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------------------------------+-----------+---------+
| id                         | estRows  | actRows | task      | access object | execution info                                                                                                                                                                                                                                                                                                                                                 | operator info                   | memory    | disk    |
+----------------------------+----------+---------+-----------+---------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------------------------------+-----------+---------+
| HashAgg_11                 | 1.00     | 1       | root      |               | time:99.1ms, loops:2, RU:33.283250, partial_worker:{wall_time:99.112917ms, concurrency:5, task_num:0, tot_wait:495.484584ms, tot_exec:0s, tot_time:495.489583ms, max:99.100916ms, p95:99.100916ms}, final_worker:{wall_time:99.120083ms, concurrency:5, task_num:5, tot_wait:1.542µs, tot_exec:41ns, tot_time:495.515123ms, max:99.104291ms, p95:99.104291ms}  | funcs:count(Column#4)->Column#3 | 17.9 KB   | 0 Bytes |
| └─TableReader_12           | 1.00     | 0       | root      |               | time:99.1ms, loops:1, cop_task: {num: 1, max: 99.1ms, proc_keys: 0, tot_proc: 98.4ms, tot_wait: 85.3µs, rpc_num: 1, rpc_time: 99.1ms, copr_cache_hit_ratio: 0.00, build_task_duration: 11.3µs, max_distsql_concurrency: 1}                                                                                                                                     | data:HashAgg_5                  | 284 Bytes | N/A     |
|   └─HashAgg_5              | 1.00     | 0       | cop[tikv] |               | tikv_task:{time:99ms, loops:1}, scan_detail: {total_keys: 262145, get_snapshot_time: 44.8µs, rocksdb: {delete_skipped_count: 131072, key_skipped_count: 393216, block: {cache_hit_count: 3, read_count: 540, read_byte: 8.41 MB, read_time: 12.5ms}}}                                                                                                          | funcs:count(1)->Column#4        | N/A       | N/A     |
|     └─TableFullScan_10     | 10000.00 | 0       | cop[tikv] | table:t       | tikv_task:{time:99ms, loops:1}                                                                                                                                                                                                                                                                                                                                 | keep order:false, stats:pseudo  | N/A       | N/A     |
+----------------------------+----------+---------+-----------+---------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------------------------------+-----------+---------+
D3Hunter commented 6 months ago

we can use tikv ctl to compact manually, after that there's no delete_skipped_count

we need compact default and write cf together in this case

mysql> mysql> explain analyze select * from t where a=19; // add condition to avoid coprocessor cache
+-------------------------+----------+---------+-----------+---------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+--------------------------------+-----------+------+
| id                      | estRows  | actRows | task      | access object | execution info                                                                                                                                                                                                                                  | operator info                  | memory    | disk |
+-------------------------+----------+---------+-----------+---------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+--------------------------------+-----------+------+
| TableReader_7           | 10.00    | 0       | root      |               | time:921.5µs, loops:1, RU:0.497153, cop_task: {num: 1, max: 854.4µs, proc_keys: 0, tot_proc: 66.5µs, tot_wait: 180.2µs, rpc_num: 1, rpc_time: 818.6µs, copr_cache_hit_ratio: 0.00, build_task_duration: 10µs, max_distsql_concurrency: 1}       | data:Selection_6               | 243 Bytes | N/A  |
| └─Selection_6           | 10.00    | 0       | cop[tikv] |               | tikv_task:{time:0s, loops:1}, scan_detail: {total_keys: 1, get_snapshot_time: 142.5µs, rocksdb: {block: {cache_hit_count: 2}}}                                                                                                                  | eq(test.t.a, 19)               | N/A       | N/A  |
|   └─TableFullScan_5     | 10000.00 | 0       | cop[tikv] | table:t       | tikv_task:{time:0s, loops:1}                                                                                                                                                                                                                    | keep order:false, stats:pseudo | N/A       | N/A  |
+-------------------------+----------+---------+-----------+---------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+--------------------------------+-----------+------+
3 rows in set (0.00 sec)
lance6716 commented 4 days ago

maybe related to https://github.com/tikv/tikv/issues/17269