pingcap / tidb

TiDB is an open-source, cloud-native, distributed, MySQL-Compatible database for elastic scale and real-time analytics.
https://pingcap.com
Apache License 2.0

Feature: support push-down bloom filter for join #15351

Open wshwsh12 opened 4 years ago

wshwsh12 commented 4 years ago

Feature Request

Is your feature request related to a problem? Please describe:

The join executor's performance can be improved. We can use a Bloom filter to filter the data on the join's probe side, which reduces both network overhead and computation overhead.
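As a rough illustration of the idea, here is a minimal sketch in Go: a Bloom filter is built from the build side's join keys and then used to drop probe-side rows that cannot match. The names `BloomFilter`, `NewBloomFilter`, and `FilterProbe`, the filter size, and the key values are all hypothetical. This is not TiDB's, TiKV's, or TiFlash's actual code or API, only a sketch of the technique.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"hash/fnv"
)

// BloomFilter is a fixed-size bit set probed k times per key.
// Hypothetical type for illustration only.
type BloomFilter struct {
	bits []uint64
	m    uint64 // number of bits, must be > 0
	k    uint64 // number of probes per key
}

// NewBloomFilter allocates a filter with m bits and k probes per key.
func NewBloomFilter(m, k uint64) *BloomFilter {
	return &BloomFilter{bits: make([]uint64, (m+63)/64), m: m, k: k}
}

// hashPair derives two hashes from one key for double hashing.
func hashPair(key int64) (uint64, uint64) {
	var buf [8]byte
	binary.LittleEndian.PutUint64(buf[:], uint64(key))
	h := fnv.New64a()
	h.Write(buf[:])
	h1 := h.Sum64()
	h.Write(buf[:])     // feed the key again to obtain a second hash
	h2 := h.Sum64() | 1 // force an odd step
	return h1, h2
}

// Add records a build-side join key in the filter.
func (f *BloomFilter) Add(key int64) {
	h1, h2 := hashPair(key)
	for i := uint64(0); i < f.k; i++ {
		pos := (h1 + i*h2) % f.m
		f.bits[pos/64] |= 1 << (pos % 64)
	}
}

// MayContain reports false only if the key was definitely never added.
// A true result can be a false positive.
func (f *BloomFilter) MayContain(key int64) bool {
	h1, h2 := hashPair(key)
	for i := uint64(0); i < f.k; i++ {
		pos := (h1 + i*h2) % f.m
		if f.bits[pos/64]&(1<<(pos%64)) == 0 {
			return false
		}
	}
	return true
}

// FilterProbe keeps only probe-side keys that may have a build-side match.
// In a pushed-down design this step would run on the storage node.
func FilterProbe(f *BloomFilter, probeKeys []int64) []int64 {
	var kept []int64
	for _, key := range probeKeys {
		if f.MayContain(key) {
			kept = append(kept, key)
		}
	}
	return kept
}

func main() {
	buildKeys := []int64{101, 102, 103} // illustrative build-side keys
	f := NewBloomFilter(1024, 3)
	for _, key := range buildKeys {
		f.Add(key)
	}
	probeKeys := []int64{101, 7, 8, 103, 9, 10, 11}
	kept := FilterProbe(f, probeKeys)
	fmt.Printf("kept %d of %d probe rows\n", len(kept), len(probeKeys))
}
```

Because a Bloom filter has no false negatives, the join result is unchanged. False positives only mean that some non-matching rows still reach the join and are discarded there.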

Describe the feature you'd like:

Describe alternatives you've considered:

Teachability, Documentation, Adoption, Migration Strategy:

Paper: Looking Ahead Makes Query Plans Robust, http://www.vldb.org/pvldb/vol10/p889-zhu.pdf

dbsid commented 3 years ago

It seems that development of this feature is on hold. For analytical queries, this feature is critical for performance.

For example, in the plan snippet below from TPC-DS query71.sql, for the hash join between date_dim and store_sales, the majority of the elapsed time is spent in the probe-side TableReader and the HashJoin that follows it. If Bloom filter pushdown (to TiFlash or TiKV) were implemented, the number of rows TiDB needs to read from TiFlash might be reduced from 2880404 to nearly 92752, which might improve the hash join performance dramatically.

|       │ │   └─HashJoin_114                        | 0.00       | 92752   | root         |                     | time:1.568344807s, loops:94, build_hash_table:{total:75.468632ms, fetch:75.451881ms, build:16.751µs}, probe:{concurrency:5, total:7.830360633s, max:1.585215382s, probe:1.08864604s, fetch:6.741714593s}                                                                 | inner join, equal:[eq(tpcds.date_dim.d_date_sk, tpcds.store_sales.ss_sold_date_sk)]                                                                                                                                                                                                                                                                                                         | 24.703125 KB          | 0 Bytes |
|       │ │     ├─TableReader_130(Build)            | 0.00       | 31      | root         |                     | time:75.418557ms, loops:2, cop_task: {num: 1, max:75.762799ms, proc_keys: 0, rpc_num: 1, rpc_time: 75.758431ms, copr_cache_hit_ratio: 0.00}                                                                                                                              | data:Selection_129                                                                                                                                                                                                                                                                                                                                                                          | 968 Bytes             | N/A     |
|       │ │     │ └─Selection_129                   | 0.00       | 31      | cop[tiflash] |                     | time:15.038987ms, loops:1                                                                                                                                                                                                                                                | eq(tpcds.date_dim.d_moy, 12), eq(tpcds.date_dim.d_year, 2000)                                                                                                                                                                                                                                                                                                                               | N/A                   | N/A     |
|       │ │     │   └─TableFullScan_128             | 73049.00   | 73049   | cop[tiflash] | table:date_dim      | time:15.038987ms, loops:2                                                                                                                                                                                                                                                | keep order:false                                                                                                                                                                                                                                                                                                                                                                            | N/A                   | N/A     |
|       │ │     └─TableReader_120(Probe)            | 2880404.00 | 2880404 | root         |                     | time:1.335566026s, loops:2818, cop_task: {num: 6, max: 1.534957884s, min: 1.024621609s, avg: 1.29958119s, p95: 1.534957884s, rpc_num: 6, rpc_time: 7.797425596s, copr_cache_hit_ratio: 0.00}                                                                             | data:Selection_119                                                                                                                                                                                                                                                                                                                                                                          | 66.03593063354492 MB  | N/A     |
|       │ │       └─Selection_119                   | 2880404.00 | 2880404 | cop[tiflash] |                     | proc max:303.179484ms, min:104.655019ms, p80:262.17598ms, p95:303.179484ms, iters:46, tasks:6                                                                                                                                                                            | not(isnull(tpcds.store_sales.ss_item_sk)), not(isnull(tpcds.store_sales.ss_sold_date_sk)), not(isnull(tpcds.store_sales.ss_sold_time_sk))                                                                                                                                                                                                                                                   | N/A                   | N/A     |
|       │ │         └─TableFullScan_118             | 2880404.00 | 2880404 | cop[tiflash] | table:store_sales   | proc max:290.179245ms, min:62.254286ms, p80:164.174192ms, p95:290.179245ms, iters:46, tasks:6                                                                                                                                                                            | keep order:false                                                                                                                                                                                                                                                                                                                                                                            | N/A                   | N/A     |
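
To put rough numbers on the plan above, the following Go sketch estimates how many probe-side rows would still be sent to TiDB for a given Bloom filter false-positive rate. It assumes that the 92752 rows output by HashJoin_114 equal the number of store_sales rows with a matching date_dim key (reasonable if d_date_sk is unique on the build side), and the false-positive rates are assumed values, not measurements. `ExpectedProbeRows` is a hypothetical helper.

```go
package main

import "fmt"

// ExpectedProbeRows estimates the rows that pass a Bloom filter: every
// matching row passes, plus a fraction fpRate of the non-matching rows.
func ExpectedProbeRows(probeRows, matchingRows, fpRate float64) float64 {
	return matchingRows + fpRate*(probeRows-matchingRows)
}

func main() {
	const probeRows = 2880404.0  // actRows of TableReader_120 in the plan
	const matchingRows = 92752.0 // actRows of HashJoin_114 in the plan

	// Assumed false-positive rates, for illustration only.
	for _, fp := range []float64{0, 0.01, 0.05} {
		rows := ExpectedProbeRows(probeRows, matchingRows, fp)
		fmt.Printf("fpRate=%.2f rows=%.0f reduction=%.1f%%\n",
			fp, rows, 100*(1-rows/probeRows))
	}
}
```

Even with a few percent of false positives, most of the 2880404 probe-side rows would be filtered out before crossing the network, which is the saving described above.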