Open r33s3n6 opened 2 weeks ago
/label affects-5.4
/label affects-6.1
/label affects-6.5
/label affects-7.1
/label affects-7.5
/assign @SeaRise
Reproduced on a single node, after creating the table and inserting data, wait a while. Also, the plan shows it's unrelated to the number of TiKV nodes.
+--------------------------------------+---------+---------+-----------+---------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-----------+---------+
| id | estRows | actRows | task | access object | execution info | operator info | memory | disk |
+--------------------------------------+---------+---------+-----------+---------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-----------+---------+
| Union_14 | 512.00 | 512 | root | | time:3.69ms, loops:7, RU:2.539922 | | N/A | N/A |
| ├─Projection_17 | 4.00 | 4 | root | | time:931.8µs, loops:2, Concurrency:OFF | test.t_j.c_tk->Column#21, test.t_j.c_epy6r_00o8->Column#22, test.t_j.c_tk->Column#23, test.t_j.c_x49cz->Column#24, test.t_j.c_qp->Column#25 | 1.73 KB | N/A |
| │ └─TableReader_20 | 4.00 | 4 | root | | time:926.1µs, loops:2, cop_task: {num: 1, max: 848.6µs, proc_keys: 4, tot_proc: 120.7µs, tot_wait: 94.2µs, copr_cache_hit_ratio: 0.00, build_task_duration: 11.7µs, max_distsql_concurrency: 1}, rpc_info:{Cop:{num_rpc:1, total_time:815.6µs}} | data:TableFullScan_19 | 423 Bytes | N/A |
| │ └─TableFullScan_19 | 4.00 | 4 | cop[tikv] | table:ref_0 | tikv_task:{time:0s, loops:1}, scan_detail: {total_process_keys: 4, total_process_keys_size: 245, total_keys: 5, get_snapshot_time: 44.8µs, rocksdb: {delete_skipped_count: 4, key_skipped_count: 8, block: {cache_hit_count: 2}}}, time_detail: {total_process_time: 120.7µs, total_wait_time: 94.2µs, tikv_wall_time: 397.6µs} | keep order:false | N/A | N/A |
| └─Projection_30 | 508.00 | 508 | root | | time:3.51ms, loops:6, Concurrency:OFF | cast(Column#13, text CHARACTER SET utf8mb4 COLLATE utf8mb4_bin)->Column#21, Column#15->Column#22, test.t_ufims7.c__w->Column#23, test.t_ufims7.c_h20voo1018->Column#24, test.t_ufims7.c_a4f1v5d3q2->Column#25 | 14.2 KB | N/A |
| └─Shuffle_37 | 508.00 | 508 | root | | time:3.45ms, loops:6, ShuffleConcurrency:5 | execution info: concurrency:5, data sources:[Projection_33] | N/A | N/A |
| └─Window_31 | 508.00 | 508 | root | | time:17.2ms, loops:10 | var_pop(<nil>)->Column#15 over(partition by test.t_ufims7.c_qbi13s order by test.t_ufims7.c_qbi13s, test.t_ufims7.c_a4f1v5d3q2, test.t_ufims7.c_yhcnkidi8, test.t_ufims7.c_qm, test.t_ufims7.c_h20voo1018, test.t_ufims7.c__w, test.t_ufims7.c_xxn6xl2q range between unbounded preceding and current row) | N/A | N/A |
| └─Sort_36 | 508.00 | 508 | root | | time:16.6ms, loops:10 | test.t_ufims7.c_qbi13s, test.t_ufims7.c_qbi13s, test.t_ufims7.c_a4f1v5d3q2, test.t_ufims7.c_yhcnkidi8, test.t_ufims7.c_qm, test.t_ufims7.c_h20voo1018, test.t_ufims7.c__w, test.t_ufims7.c_xxn6xl2q | 14.5 KB | 0 Bytes |
| └─ShuffleReceiver_38 | 508.00 | 508 | root | | time:16.3ms, loops:10 | | N/A | N/A |
| └─Projection_33 | 508.00 | 508 | root | | time:3.12ms, loops:2, Concurrency:OFF | 1->Column#13, test.t_ufims7.c__w, test.t_ufims7.c_h20voo1018, test.t_ufims7.c_a4f1v5d3q2, test.t_ufims7.c_qbi13s, test.t_ufims7.c_yhcnkidi8, test.t_ufims7.c_qm, test.t_ufims7.c_xxn6xl2q | 36.4 KB | N/A |
| └─TableReader_35 | 508.00 | 508 | root | | time:3.1ms, loops:2, cop_task: {num: 2, max: 1.64ms, min: 1.38ms, avg: 1.51ms, p95: 1.64ms, max_proc_keys: 284, p95_proc_keys: 284, tot_proc: 1.39ms, tot_wait: 212.4µs, copr_cache_hit_ratio: 0.00, build_task_duration: 7.68µs, max_distsql_concurrency: 1}, rpc_info:{Cop:{num_rpc:2, total_time:2.96ms}} | data:TableFullScan_34 | 31.1 KB | N/A |
| └─TableFullScan_34 | 508.00 | 508 | cop[tikv] | table:ref_2 | tikv_task:{proc max:1ms, min:0s, avg: 500µs, p80:1ms, p95:1ms, iters:7, tasks:2}, scan_detail: {total_process_keys: 508, total_process_keys_size: 39783, total_keys: 902, get_snapshot_time: 126µs, rocksdb: {delete_skipped_count: 421, key_skipped_count: 1711, block: {cache_hit_count: 4}}}, time_detail: {total_process_time: 1.39ms, total_wait_time: 212.4µs, total_kv_read_wall_time: 1ms, tikv_wall_time: 1.93ms} | keep order:false, stats:pseudo | N/A | N/A |
+--------------------------------------+---------+---------+-----------+---------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-----------+---------+
The result is correct after setting tidb_executor_concurrency to 1, which should be a concurrency issue in TiDB.
And then setting tidb_window_concurrency
to 100, the issue occurred, likely related to window shuffle.
Replacing var_pop
with count
also resulted in an error, indicating the issue is unrelated to var_pop.
Simplified case
use test;
drop table if exists t1;
create table t1 (cc1 int,cc2 text);
insert into t1 values (1, 'aaaa'),(2, 'bbbb'),(3, 'cccc');
drop table if exists t2;
create table t2 (cc1 int,cc2 text,primary key(cc1));
insert into t2 values (2, '2');
set tidb_executor_concurrency = 1;
set tidb_window_concurrency = 100;
SELECT DISTINCT cc2, cc2, cc1 FROM t2 UNION ALL SELECT count(1) over (partition by cc1), cc2, cc1 FROM t1;
+------+------+------+
| cc2 | cc2 | cc1 |
+------+------+------+
| 2 | 2 | 2 |
| 1 | cccc | 3 |
| 1 | bbbb | 2 |
| aaaa | aaaa | 1 |
+------+------+------+
/severity major
/unlabel severity/critical
/remove-severity critical
The severity is downgraded to major due to the stringent reproduction conditions.
use test;
drop table if exists t1;
create table t1 (cc1 int,cc2 text);
insert into t1 values (1, 'aaaa'),(2, 'bbbb'),(3, 'cccc');
drop table if exists t2;
create table t2 (cc1 int,cc2 text,primary key(cc1));
insert into t2 values (2, '2');
set tidb_executor_concurrency = 1;
set tidb_projection_concurrency = 1;
set tidb_window_concurrency = 100;
## with optimize rules 'aggregation_eliminate' and 'aggregation_push_down'
delete from mysql.opt_rule_blacklist;
admin reload opt_rule_blacklist;
explain SELECT DISTINCT cc2, cc2, cc1 FROM t2 UNION ALL SELECT count(1) over (partition by cc1), cc2, cc1 FROM t1;
+------------------------------------+----------+-----------+---------------+--------------------------------------------------------------------------------------------------------------------------------+
| id | estRows | task | access object | operator info |
+------------------------------------+----------+-----------+---------------+--------------------------------------------------------------------------------------------------------------------------------+
| Union_13 | 20000.00 | root | | |
| ├─Projection_15 | 10000.00 | root | | test.t2.cc2->Column#8, test.t2.cc2->Column#9, test.t2.cc1->Column#10 |
| │ └─TableReader_17 | 10000.00 | root | | data:TableFullScan_16 |
| │ └─TableFullScan_16 | 10000.00 | cop[tikv] | table:t2 | keep order:false, stats:pseudo |
| └─Projection_18 | 10000.00 | root | | cast(Column#7, text BINARY CHARACTER SET utf8mb4 COLLATE utf8mb4_bin)->Column#8, test.t1.cc2->Column#9, test.t1.cc1->Column#10 |
| └─Shuffle_23 | 10000.00 | root | | execution info: concurrency:100, data sources:[TableReader_21] |
| └─Window_19 | 10000.00 | root | | count(1)->Column#7 over(partition by test.t1.cc1) |
| └─Sort_22 | 10000.00 | root | | test.t1.cc1 |
| └─ShuffleReceiver_24 | 10000.00 | root | | |
| └─TableReader_21 | 10000.00 | root | | data:TableFullScan_20 |
| └─TableFullScan_20 | 10000.00 | cop[tikv] | table:t1 | keep order:false, stats:pseudo |
+------------------------------------+----------+-----------+---------------+--------------------------------------------------------------------------------------------------------------------------------+
SELECT DISTINCT cc2, cc2, cc1 FROM t2 UNION ALL SELECT count(1) over (partition by cc1), cc2, cc1 FROM t1;
+------+------+------+
| cc2 | cc2 | cc1 |
+------+------+------+
| 2 | 2 | 2 |
| 1 | bbbb | 2 |
| 1 | cccc | 3 |
| aaaa | aaaa | 1 |
+------+------+------+
## without optimize rules 'aggregation_eliminate' and 'aggregation_push_down'
insert into mysql.opt_rule_blacklist values('aggregation_eliminate'), ('aggregation_push_down');
admin reload opt_rule_blacklist;
explain SELECT DISTINCT cc2, cc2, cc1 FROM t2 UNION ALL SELECT count(1) over (partition by cc1), cc2, cc1 FROM t1;
+------------------------------------+---------+-----------+---------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| id | estRows | task | access object | operator info |
+------------------------------------+---------+-----------+---------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Union_12 | 4.00 | root | | |
| ├─HashAgg_16 | 1.00 | root | | group by:test.t2.cc1, test.t2.cc2, test.t2.cc2, funcs:firstrow(test.t2.cc2)->test.t2.cc2, funcs:firstrow(test.t2.cc2)->test.t2.cc2, funcs:firstrow(test.t2.cc1)->test.t2.cc1 |
| │ └─TableReader_21 | 1.00 | root | | data:TableFullScan_20 |
| │ └─TableFullScan_20 | 1.00 | cop[tikv] | table:t2 | keep order:false, stats:pseudo |
| └─Projection_22 | 3.00 | root | | cast(Column#7, text BINARY CHARACTER SET utf8mb4 COLLATE utf8mb4_bin)->Column#8, test.t1.cc2->Column#9, test.t1.cc1->Column#10 |
| └─Shuffle_27 | 3.00 | root | | execution info: concurrency:2, data sources:[TableReader_25] |
| └─Window_23 | 3.00 | root | | count(1)->Column#7 over(partition by test.t1.cc1) |
| └─Sort_26 | 3.00 | root | | test.t1.cc1 |
| └─ShuffleReceiver_28 | 3.00 | root | | |
| └─TableReader_25 | 3.00 | root | | data:TableFullScan_24 |
| └─TableFullScan_24 | 3.00 | cop[tikv] | table:t1 | keep order:false, stats:pseudo |
+------------------------------------+---------+-----------+---------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
SELECT DISTINCT cc2, cc2, cc1 FROM t2 UNION ALL SELECT count(1) over (partition by cc1), cc2, cc1 FROM t1;
+------+------+------+
| cc2 | cc2 | cc1 |
+------+------+------+
| 2 | 2 | 2 |
| 1 | bbbb | 2 |
| 1 | cccc | 3 |
| 1 | aaaa | 1 |
+------+------+------+
The result error seems to be caused by the optimizer's rule that eliminates distinct aggregates.
It appears that the result misalignment in the second subquery of the UNION ALL is related to the first subquery's SELECT cc2, cc2
.
/sig planner
/remove-sig execution
Manually eliminating distinct agg also yields correct results.
use test;
drop table if exists t1;
create table t1 (cc1 int,cc2 text);
insert into t1 values (1, 'aaaa'),(2, 'bbbb'),(3, 'cccc');
drop table if exists t2;
create table t2 (cc1 int,cc2 text,primary key(cc1));
insert into t2 values (2, '2');
set tidb_executor_concurrency = 1;
set tidb_projection_concurrency = 1;
set tidb_window_concurrency = 100;
explain SELECT cc2, cc2, cc1 FROM t2 UNION ALL SELECT count(1) over (partition by cc1), cc2, cc1 FROM t1;
+------------------------------------+---------+-----------+---------------+--------------------------------------------------------------------------------------------------------------------------------+
| id | estRows | task | access object | operator info |
+------------------------------------+---------+-----------+---------------+--------------------------------------------------------------------------------------------------------------------------------+
| Union_11 | 4.00 | root | | |
| ├─Projection_12 | 1.00 | root | | test.t2.cc2->Column#8, test.t2.cc2->Column#9, test.t2.cc1->Column#10 |
| │ └─TableReader_14 | 1.00 | root | | data:TableFullScan_13 |
| │ └─TableFullScan_13 | 1.00 | cop[tikv] | table:t2 | keep order:false, stats:pseudo |
| └─Projection_15 | 3.00 | root | | cast(Column#7, text BINARY CHARACTER SET utf8mb4 COLLATE utf8mb4_bin)->Column#8, test.t1.cc2->Column#9, test.t1.cc1->Column#10 |
| └─Shuffle_20 | 3.00 | root | | execution info: concurrency:2, data sources:[TableReader_18] |
| └─Window_16 | 3.00 | root | | count(1)->Column#7 over(partition by test.t1.cc1) |
| └─Sort_19 | 3.00 | root | | test.t1.cc1 |
| └─ShuffleReceiver_21 | 3.00 | root | | |
| └─TableReader_18 | 3.00 | root | | data:TableFullScan_17 |
| └─TableFullScan_17 | 3.00 | cop[tikv] | table:t1 | keep order:false, stats:pseudo |
+------------------------------------+---------+-----------+---------------+--------------------------------------------------------------------------------------------------------------------------------+
SELECT cc2, cc2, cc1 FROM t2 UNION ALL SELECT count(1) over (partition by cc1), cc2, cc1 FROM t1;
+------+------+------+
| cc2 | cc2 | cc1 |
+------+------+------+
| 2 | 2 | 2 |
| 1 | bbbb | 2 |
| 1 | cccc | 3 |
| 1 | aaaa | 1 |
+------+------+------+
1. Minimal reproduce step (Required)
Firstly, execute
init.sql
to create the table. Then executingerror.sql
yields unexpected results. Note that reproducing these results might not be entirely stable. Typically, it can be completed within three attempts. You can try executingerror.sql
multiple times or executeinit.sql
again to rebuild the table.init.sql.txt error.sql.txt
2. What did you expect to see? (Required)
The SQL statement consists of a UNION of two queries. The first query has four rows with the first column being respectively
oyz4u
,wcnmm
,z36ma
, andtl
. The first column of the second query is a constant string '1'.3. What did you see instead (Required)
In the multi-node version, some results other than '1' are outputted, while the single-node version behaves normally.
output_re_main2.log output_re_single2.log
4. What is your TiDB version? (Required)
topology:
distributed.yaml:
single.yaml
about us
We are the BASS team from the School of Cyber Science and Technology at Beihang University. Our main focus is on system software security, operating systems, and program analysis research, as well as the development of automated program testing frameworks for detecting software defects. Using our self-developed database vulnerability testing tool, we have identified the above-mentioned vulnerabilities in TiDB that may lead to database logic error.