pingcap / tidb

TiDB is an open-source, cloud-native, distributed, MySQL-Compatible database for elastic scale and real-time analytics. Try AI-powered Chat2Query free at : https://www.pingcap.com/tidb-serverless/
https://pingcap.com
Apache License 2.0
37.07k stars 5.83k forks source link

Improve the performance of `NestedLoopApplyExec` #12939

Open zz-jason opened 4 years ago

zz-jason commented 4 years ago

Description

NestedLoopApplyExec is used to calculate the unnested subqueries. Taking this SQL as an example:

create table t1(a bigint, b bigint);
create table t2(a bigint, b bigint);
-- insert some records into table t1 and t2
select * from t1 where t1.b > (select min(b) from t2 where t1.a > t2.a);

The above select statement is converted to an execution plan which contains an Apply executor:

TiDB(root@127.0.0.1:test) > explain select * from t1 where t1.b > (select min(b) from t2 where t1.a > t2.a);
+----------------------------------+----------+-----------+-----------------------------------------------------------------------------+
| id                               | count    | task      | operator info                                                               |
+----------------------------------+----------+-----------+-----------------------------------------------------------------------------+
| Projection_15                    | 9990.00  | root      | Column#1, Column#2                                                          |
| └─Apply_17                       | 9990.00  | root      | CARTESIAN inner join, inner:Selection_21, other cond:gt(Column#2, Column#7) |
|   ├─TableReader_20               | 9990.00  | root      | data:Selection_19                                                           |
|   │ └─Selection_19               | 9990.00  | cop[tikv] | not(isnull(Column#2))                                                       |
|   │   └─TableScan_18             | 10000.00 | cop[tikv] | table:t1, range:[-inf,+inf], keep order:false, stats:pseudo                 |
|   └─Selection_21                 | 0.80     | root      | not(isnull(Column#7))                                                       |
|     └─StreamAgg_26               | 1.00     | root      | funcs:min(Column#5)                                                         |
|       └─TopN_27                  | 1.00     | root      | Column#5:asc, offset:0, count:1                                             |
|         └─TableReader_36         | 1.00     | root      | data:TopN_35                                                                |
|           └─TopN_35              | 1.00     | cop[tikv] | Column#5:asc, offset:0, count:1                                             |
|             └─Selection_34       | 7992.00  | cop[tikv] | gt(Column#1, Column#4), not(isnull(Column#5))                               |
|               └─TableScan_33     | 10000.00 | cop[tikv] | table:t2, range:[-inf,+inf], keep order:false, stats:pseudo                 |
+----------------------------------+----------+-----------+-----------------------------------------------------------------------------+
12 rows in set (0.00 sec)

The Apply_17 operator in the above execution plan is executed by the NestedLoopApplyExec physical operator. Your goal is to optimize this physical implementation to improve the overall sql execution, reduce at least 100% execution time.

Goals:

Non-Goals:

Difficulty

Score

Mentor(s)

Recommended Skills

miyuri-fox commented 4 years ago

/pick-up-challenge

sre-bot commented 4 years ago

@miyuri-fox don't have enough score, pick up failed Progress 0/400 You may pick up some easy issues first.

TennyZhuang commented 4 years ago

reduce at least 100% execution time.

You mean no-op?

ekalinin commented 4 years ago

/pick-up-challenge

sre-bot commented 4 years ago

@ekalinin pick up issue success

ekalinin commented 4 years ago

/give-up-challenge

sre-bot commented 4 years ago

@ekalinin give up issue success

qw4990 commented 4 years ago

image This SQL can be converted to a HashJoin in CRDB. Maybe we can learn something from this.