pingcap / tidb

TiDB is an open-source, cloud-native, distributed, MySQL-Compatible database for elastic scale and real-time analytics. Try AI-powered Chat2Query free at : https://www.pingcap.com/tidb-serverless/
https://pingcap.com
Apache License 2.0
36.9k stars 5.81k forks source link

large overestimation when where conditions contain OR and matches several indexes with different selectivity #54323

Open time-and-fate opened 2 months ago

time-and-fate commented 2 months ago

Reproduce

create table t(a int, b int, c int, d int, index iabc(a,b,c), index ib(b));

-- run this many times
insert into t value (rand()*100000, rand()*10, rand()*1000, rand()*1000),(rand()*100000, rand()*10, rand()*1000, rand()*1000),(rand()*100000, rand()*10, rand()*1000, rand()*1000),(rand()*100000, rand()*10, rand()*1000, rand()*1000),(rand()*100000, rand()*10, rand()*1000, rand()*1000),(rand()*100000, rand()*10, rand()*1000, rand()*1000),(rand()*100000, rand()*10, rand()*1000, rand()*1000),(rand()*100000, rand()*10, rand()*1000, rand()*1000),(rand()*100000, rand()*10, rand()*1000, rand()*1000),(rand()*100000, rand()*10, rand()*1000, rand()*1000),(rand()*100000, rand()*10, rand()*1000, rand()*1000),(rand()*100000, rand()*10, rand()*1000, rand()*1000),(rand()*100000, rand()*10, rand()*1000, rand()*1000),(rand()*100000, rand()*10, rand()*1000, rand()*1000),(rand()*100000, rand()*10, rand()*1000, rand()*1000),(rand()*100000, rand()*10, rand()*1000, rand()*1000),(rand()*100000, rand()*10, rand()*1000, rand()*1000),(rand()*100000, rand()*10, rand()*1000, rand()*1000),(rand()*100000, rand()*10, rand()*1000, rand()*1000),(rand()*100000, rand()*10, rand()*1000, rand()*1000);

-- run this many times
insert into t select * from t;

analyze table t;
explain analyze select * from t where (a = 1 and b = 1 and c = 1) or (a = 2 and b = 2 and c = 2) or (a = 3 and b = 3 and c = 3);
explain analyze select * from t use index (iabc) where (a = 1 and b = 1 and c = 1) or (a = 2 and b = 2 and c = 2) or (a = 3 and b = 3 and c = 3);

There is a big overestimation

> explain analyze select * from t where (a = 1 and b = 1 and c = 1) or (a = 2 and b = 2 and c = 2) or (a = 3 and b = 3 and c = 3);
+-------------------------+----------+---------+-----------+---------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-----------+------+
| id                      | estRows  | actRows | task      | access object | execution info                                                                                                                                                                                                                                                                                                                                                                                  | operator info                                                                                                                                                                                 | memory    | disk |
+-------------------------+----------+---------+-----------+---------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-----------+------+
| TableReader_7           | 7014.40  | 0       | root      |               | time:36.3ms, loops:1, RU:36.071812, cop_task: {num: 1, max: 36.2ms, proc_keys: 29440, tot_proc: 35.2ms, tot_wait: 70.6µs, copr_cache_hit_ratio: 0.00, build_task_duration: 10.9µs, max_distsql_concurrency: 1}, rpc_info:{Cop:{num_rpc:1, total_time:36.1ms}}                                                                                                                                   | data:Selection_6                                                                                                                                                                              | 289 Bytes | N/A  |
| └─Selection_6           | 7014.40  | 0       | cop[tikv] |               | tikv_task:{time:37ms, loops:33}, scan_detail: {total_process_keys: 29440, total_process_keys_size: 1564192, total_keys: 29441, get_snapshot_time: 27.9µs, rocksdb: {delete_skipped_count: 28520, key_skipped_count: 57960, block: {}}}, time_detail: {total_process_time: 35.2ms, total_suspend_time: 202µs, total_wait_time: 70.6µs, total_kv_read_wall_time: 27ms, tikv_wall_time: 35.7ms}    | or(and(eq(test.t.a, 1), and(eq(test.t.b, 1), eq(test.t.c, 1))), or(and(eq(test.t.a, 2), and(eq(test.t.b, 2), eq(test.t.c, 2))), and(eq(test.t.a, 3), and(eq(test.t.b, 3), eq(test.t.c, 3))))) | N/A       | N/A  |
|   └─TableFullScan_5     | 29440.00 | 29440   | cop[tikv] | table:t       | tikv_task:{time:27ms, loops:33}                                                                                                                                                                                                                                                                                                                                                                 | keep order:false                                                                                                                                                                              | N/A       | N/A  |
+-------------------------+----------+---------+-----------+---------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-----------+------+
3 rows in set (0.04 sec)

> explain analyze select * from t use index (iabc) where (a = 1 and b = 1 and c = 1) or (a = 2 and b = 2 and c = 2) or (a = 3 and b = 3 and c = 3);
+-------------------------------+---------+---------+-----------+------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------------------------------------------------------------------+-----------+------+
| id                            | estRows | actRows | task      | access object                | execution info                                                                                                                                                                                                                                                                                                                                                                                                                                                 | operator info                                                       | memory    | disk |
+-------------------------------+---------+---------+-----------+------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------------------------------------------------------------------+-----------+------+
| IndexLookUp_7                 | 8768.00 | 0       | root      |                              | time:762.9µs, loops:1, RU:0.496309                                                                                                                                                                                                                                                                                                                                                                                                                             |                                                                     | 257 Bytes | N/A  |
| ├─IndexRangeScan_5(Build)     | 8768.00 | 0       | cop[tikv] | table:t, index:iabc(a, b, c) | time:607.9µs, loops:1, cop_task: {num: 1, max: 532.8µs, proc_keys: 0, tot_proc: 63.9µs, tot_wait: 47.2µs, copr_cache_hit_ratio: 0.00, build_task_duration: 28.2µs, max_distsql_concurrency: 1}, rpc_info:{Cop:{num_rpc:1, total_time:509.9µs}}, tikv_task:{time:0s, loops:1}, scan_detail: {total_keys: 3, get_snapshot_time: 24µs, rocksdb: {block: {}}}, time_detail: {total_process_time: 63.9µs, total_wait_time: 47.2µs, tikv_wall_time: 206µs}           | range:[1 1 1,1 1 1], [2 2 2,2 2 2], [3 3 3,3 3 3], keep order:false | N/A       | N/A  |
| └─TableRowIDScan_6(Probe)     | 8768.00 | 0       | cop[tikv] | table:t                      |                                                                                                                                                                                                                                                                                                                                                                                                                                                                | keep order:false                                                    | N/A       | N/A  |
+-------------------------------+---------+---------+-----------+------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------------------------------------------------------------------+-----------+------+
time-and-fate commented 2 months ago

This corresponds to two places that need to be improved:

  1. The root cause In this case, Selectivity() can't correctly match the stats to the filters. It should prefer stats on (iabc) for estimation, but finally chooses stats on (ib).
  2. The root cause results in the overestimation of DataSource.stats.RowCount, and then the logic here brings the overestimation into the index range scan of index (iabc), then causes choosing the bad plan. https://github.com/pingcap/tidb/blob/854a4e3303003e3f8c1151da27e539337dcf8277/pkg/planner/core/logical_plans.go#L1599-L1601
Rustin170506 commented 1 month ago

A simple rust script to reproduce it:

#!/usr/bin/env -S cargo +nightly -Zscript
---cargo
[dependencies]
sqlx = { version = "0.7", features = ["mysql", "runtime-tokio-native-tls"] }
tokio = { version = "1", features = ["full"] }
rand = "0.8"
---
use sqlx::mysql::MySqlPool;
use rand::Rng;
use std::error::Error;

#[tokio::main]
async fn main() -> Result<(), Box<dyn Error>> {
    // Replace with your MySQL connection string
    let pool = MySqlPool::connect("mysql://root@localhost:4000/test").await?;

    // Create table
    sqlx::query(
        "CREATE TABLE IF NOT EXISTS t(
            a INT,
            b INT,
            c INT,
            d INT,
            INDEX iabc(a,b,c),
            INDEX ib(b)
        )"
    )
    .execute(&pool)
    .await?;

    // Function to generate random data
    fn generate_data() -> (i32, i32, i32, i32) {
        let mut rng = rand::thread_rng();
        (
            rng.gen_range(0..100000),
            rng.gen_range(0..10),
            rng.gen_range(0..1000),
            rng.gen_range(0..1000),
        )
    }

    // Insert initial data
    for _ in 0..200 {
        let data: Vec<_> = (0..20).map(|_| generate_data()).collect();

        for (a, b, c, d) in data {
            sqlx::query("INSERT INTO t (a, b, c, d) VALUES (?, ?, ?, ?)")
                .bind(a)
                .bind(b)
                .bind(c)
                .bind(d)
                .execute(&pool)
                .await?;
        }
    }

    // Double the data multiple times
    for _ in 0..5 {  // Adjust this number based on how much data you want
        sqlx::query("INSERT INTO t SELECT * FROM t")
            .execute(&pool)
            .await?;
    }

    // Analyze table
    sqlx::query("ANALYZE TABLE t")
        .execute(&pool)
        .await?;

    println!("Script completed successfully.");
    Ok(())
}
Rustin170506 commented 1 month ago
  1. In this case, Selectivity() can't correctly match the stats to the filters. It should prefer stats on (iabc) for estimation, but finally chooses stats on (ib).

GetUsableSetsByGreedy

// We greedy select the stats info based on:
// (1): The stats type, always prefer the primary key or index.
// (2): The number of expression that it covers, the more the better.
// (3): The number of columns that it contains, the less the better.
// (4): The selectivity of the covered conditions, the less the better.
//      The rationale behind is that lower selectivity tends to reflect more functional dependencies
//      between columns. It's hard to decide the priority of this rule against rule 2 and 3, in order
//      to avoid massive plan changes between tidb-server versions, I adopt this conservative strategy
//      to impose this rule after rule 2 and 3.
if (bestTp == ColType && set.Tp != ColType) ||
    bestCount < bits ||
    (bestCount == bits && bestNumCols > set.numCols) ||
    (bestCount == bits && bestNumCols == set.numCols && bestSel > set.Selectivity) {
    bestID, bestCount, bestTp, bestNumCols, bestMask, bestSel = i, bits, set.Tp, set.numCols, curMask, set.Selectivity
}

This is because that the column number of the stats on (ib) is less than the stats on (iabc), so it chooses the stats on (ib) first.

Rustin170506 commented 1 month ago

The steps:

  1. GetUsableSetsByGreedy uses incorrect statistics to estimate the selectivity of the indexes.
  2. deriveIndexPathStats employs an unperfect algorithm to update the CountAfterAccess.
  3. skylinePruning prunes the ib index due to the compareCandidates.
  4. compareTaskCost chooses the table full scan due to the overestimated CountAfterAccess.