Open r33s3n6 opened 6 months ago
The plan looks all right. var_pop is executed on the TiDB side, so the TiDB implementation of var_pop seems to be incorrect:
mysql> source error.sql.txt
+----------------------------------------+---------+--------------+---------------+------------------------------------------------------------------------------------------------------------------------------+
| id | estRows | task | access object | operator info |
+----------------------------------------+---------+--------------+---------------+------------------------------------------------------------------------------------------------------------------------------+
| HashAgg_15 | 249.60 | root | | group by:Column#17, funcs:var_pop(6.041025216704858e+17)->Column#18 |
| └─TableReader_38 | 249.60 | root | | MppVersion: 2, data:ExchangeSender_37 |
|   └─ExchangeSender_37 | 249.60 | mpp[tiflash] | | ExchangeType: PassThrough |
|     └─Projection_33 | 249.60 | mpp[tiflash] | | Column#17 |
|       └─HashAgg_34 | 249.60 | mpp[tiflash] | | group by:test.t_ufims7.c__w, funcs:sum(Column#19)->Column#17, stream_count: 4 |
|         └─ExchangeReceiver_36 | 249.60 | mpp[tiflash] | | stream_count: 4 |
|           └─ExchangeSender_35 | 249.60 | mpp[tiflash] | | ExchangeType: HashPartition, Compression: FAST, Hash Cols: [name: test.t_ufims7.c__w, collate: utf8mb4_bin], stream_count: 4 |
|             └─HashAgg_20 | 249.60 | mpp[tiflash] | | group by:Column#22, funcs:count(Column#21)->Column#19 |
|               └─Projection_42 | 312.00 | mpp[tiflash] | | cast(test.t_ufims7.c_yhcnkidi8, bigint(22) BINARY)->Column#21, test.t_ufims7.c__w->Column#22 |
|                 └─TableFullScan_32 | 312.00 | mpp[tiflash] | table:ref_4 | keep order:false, stats:pseudo |
+----------------------------------------+---------+--------------+---------------+------------------------------------------------------------------------------------------------------------------------------+
/label affects-5.4
/label affects-6.1
/label affects-6.5
/label affects-7.1
/label affects-7.5
/assign @SeaRise
Simplified case
use test;
drop table if exists test;
create table test (c int);
insert into test values(0),(0),(0),(0),(0),(0);
SELECT var_pop(604102521670485727) FROM test;
// mysql
mysql> SELECT var_pop(604102521670485727) FROM test;
+-----------------------------+
| var_pop(604102521670485727) |
+-----------------------------+
| 0 |
+-----------------------------+
// tidb-v8.2.0-alpha
mysql> SELECT var_pop(604102521670485727) FROM test;
+-----------------------------+
| var_pop(604102521670485727) |
+-----------------------------+
| 1456.3555555555556 |
+-----------------------------+
The correct result of 3020512608352429000 + 604102521670485800 is 3624615130022914800.
However, when using float64, the results are incorrect in both Golang and C++.
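The rounding in the sum above can be checked directly. The following Go sketch adds the float64 value of the constant repeatedly, the way a running sum would, and compares the result with the exact integer sum and with a single multiplication. The helper name `repeatedSum` is hypothetical and chosen only for illustration; it is not taken from TiDB.

```go
package main

import "fmt"

// repeatedSum adds v to an accumulator n times, mimicking a running sum.
// Hypothetical helper for illustration; not taken from TiDB.
func repeatedSum(v float64, n int) float64 {
	sum := 0.0
	for i := 0; i < n; i++ {
		sum += v
	}
	return sum
}

func main() {
	const c = 604102521670485727 // constant from the query
	v := float64(c)              // rounded to the nearest float64

	fmt.Printf("exact integer 6*c    : %d\n", int64(6*c))
	fmt.Printf("float64, 5 additions : %.0f\n", repeatedSum(v, 5))
	fmt.Printf("float64, 6 additions : %.0f\n", repeatedSum(v, 6))
	fmt.Printf("float64, 6*v         : %.0f\n", 6*v)
}
```

With IEEE 754 round-to-nearest-even, the six additions should give 3624615130022915072 while 6*v gives 3624615130022914560. The two differ by 512, which is one unit in the last place at this magnitude.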
/label severity/minor
@SeaRise: The label(s) severity/minor cannot be applied. These labels are supported: fuzz/sqlancer, challenge-program, compatibility-breaker, first-time-contributor, contribution, good first issue, correctness, duplicate, proposal, security, ok-to-test, needs-ok-to-test, needs-more-info, needs-cherry-pick-release-5.4, needs-cherry-pick-release-6.1, needs-cherry-pick-release-6.5, needs-cherry-pick-release-7.1, needs-cherry-pick-release-7.5, needs-cherry-pick-release-8.1, affects-5.4, affects-6.1, affects-6.5, affects-7.1, affects-7.5, affects-8.1, may-affects-5.4, may-affects-6.1, may-affects-6.5, may-affects-7.1, may-affects-7.5, may-affects-8.1.
/severity minor
The root cause is an issue with float64 addition precision. As mentioned in the previous comment, the sum of 3020512608352429000 and 604102521670485800 is incorrect in Golang, which leads to an incorrect result for var_pop.
After examining MySQL's implementation, it also uses float64 for var_pop, but with a different algorithm than TiDB. Still, it's theoretically possible to find a case that yields an incorrect result in MySQL.
The severity is downgraded to minor because the potential fix is complex, the usage in this issue is rare, and var_pop is inherently an inexact floating-point computation.
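To illustrate how the choice of algorithm matters, the sketch below runs two streaming variance computations over the six identical rows from the simplified case. Both update rules are assumptions made for illustration: `sumBasedVarPop` uses a running-sum update of the form t = n*x - sum, and `welfordVarPop` uses Welford's recurrence on the running mean. Neither is copied from the TiDB or MySQL source, so treat them as models rather than the actual implementations.

```go
package main

import "fmt"

// sumBasedVarPop keeps a running sum and updates the variance accumulator
// from t = n*x - sum. Assumed update rule, for illustration only.
func sumBasedVarPop(xs []float64) float64 {
	var sum, m2 float64
	var n int64
	for _, x := range xs {
		n++
		sum += x
		if n > 1 {
			// The explicit float64 conversion rounds the product first,
			// so the compiler cannot fuse it into a multiply-add.
			t := float64(float64(n)*x) - sum
			m2 += t * t / float64(n*(n-1))
		}
	}
	if n == 0 {
		return 0
	}
	return m2 / float64(n)
}

// welfordVarPop tracks the running mean instead of the running sum
// (Welford's recurrence). Also an assumption, used only as a contrast.
func welfordVarPop(xs []float64) float64 {
	var mean, m2 float64
	var n int64
	for _, x := range xs {
		n++
		delta := x - mean
		mean += delta / float64(n)
		m2 += delta * (x - mean)
	}
	if n == 0 {
		return 0
	}
	return m2 / float64(n)
}

func main() {
	v := float64(604102521670485727) // constant from the query
	xs := []float64{v, v, v, v, v, v} // six rows, as in the simplified case

	fmt.Println("sum-based:", sumBasedVarPop(xs))
	fmt.Println("welford  :", welfordVarPop(xs))
}
```

Under IEEE 754 round-to-nearest-even, the sum-based version should return roughly 1456.36 for this input, in line with the TiDB output above, because the running sum drifts one ULP away from n*x on the sixth row. The Welford version returns 0 here because the mean equals the input after the first row, although it can still lose precision on other inputs.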
1. Minimal reproduce step (Required)
First, execute init.sql to create the table. Then executing error.sql yields unexpected results. Note that reproducing these results might not be entirely stable. Typically, the issue can be reproduced within three attempts. You can try executing error.sql multiple times or execute init.sql again to rebuild the table.
init.sql.txt error.sql.txt
2. What did you expect to see? (Required)
The SQL statement calculates the var_pop value of the constant expression 604102521670485727, and the result should be 0.
3. What did you see instead (Required)
However, the results of the statement are unstable: in both the multi-node and single-node deployments, some runs return 0 and others return non-zero values.
output_re_main2.log output_re_single2.log
4. What is your TiDB version? (Required)
topology:
distributed.yaml:
single.yaml
about us
We are the BASS team from the School of Cyber Science and Technology at Beihang University. Our main focus is on system software security, operating systems, and program analysis research, as well as the development of automated program testing frameworks for detecting software defects. Using our self-developed database vulnerability testing tool, we have identified the above-mentioned vulnerabilities in TiDB that may lead to database logic errors.