Closed r33s3n6 closed 4 days ago
The error.sql.txt query's plan looks incorrect:
+--------------------------+----------+-----------+-----------------------------------------+--------------------------------+
| id | estRows | task | access object | operator info |
+--------------------------+----------+-----------+-----------------------------------------+--------------------------------+
| Projection_47 | 10000.00 | root | | 1->Column#35 |
| └─IndexReader_51 | 10000.00 | root | | index:IndexFullScan_50 |
| └─IndexFullScan_50 | 10000.00 | cop[tikv] | table:ref_2, index:c_uajclsr(c_uajclsr) | keep order:false, stats:pseudo |
+--------------------------+----------+-----------+-----------------------------------------+--------------------------------+
/sig planner
/remove-sig execution
The core issue here is the (select variance(c_k6tgklu10) from t_rl02)
subquery.
Since its output column type is DOUBLE
, and tidb executes SQLs with intra-query concurrency, the order of calculations that happen in the query would be unstable, therefore the execution result will be unstable due to the precision loss of the floating number.
> select variance(c_k6tgklu10) from t_rl02;
+-----------------------+
| variance(c_k6tgklu10) |
+-----------------------+
| 0.24595041322314046 |
+-----------------------+
1 row in set (0.00 sec)
> select variance(c_k6tgklu10) from t_rl02;
+-----------------------+
| variance(c_k6tgklu10) |
+-----------------------+
| 0.24595041322314048 |
+-----------------------+
1 row in set (0.00 sec)
> select variance(c_k6tgklu10) from t_rl02;
+-----------------------+
| variance(c_k6tgklu10) |
+-----------------------+
| 0.24595041322314043 |
+-----------------------+
1 row in set (0.00 sec)
That makes the two same subqueries in the SQL may have different results. Then the nullif
expression will be 1
sometimes and 0
otherwise. Finally, the query result would be unstable. Most of the time it will be all 1
s, but sometimes it would also be 0
s.
MySQL document (B.3.4.8 Problems with Floating-Point Values) also has descriptions on this issue and suggestions that resolve this issue by rewriting the SQL.
Currently, we'd like to consider this as the expected behavior and close this issue later. Feel free to comment or reopen if you have further questions.
1. Minimal reproduce step (Required)
Firstly, execute
init.sql
to create the table. Then executingerror.sql
yields unexpected results. Note that reproducing these results might not be entirely stable. Typically, it can be completed within three attempts. You can try executingerror.sql
multiple times or executeinit.sql
again to rebuild the table.init.sql.txt error.sql.txt
2. What did you expect to see? (Required)
The SQL statement applies
nullif
to two identical expressionsselect variance(c_k6tgklu10) from t_rl02
, then counts the results. Since the result is NULL, the count should be 0.3. What did you see instead (Required)
In the multi-node version, the count is 1. output_re_main2.log output_re_single2.log
4. What is your TiDB version? (Required)
topology:
distributed.yaml:
single.yaml
about us
We are the BASS team from the School of Cyber Science and Technology at Beihang University. Our main focus is on system software security, operating systems, and program analysis research, as well as the development of automated program testing frameworks for detecting software defects. Using our self-developed database vulnerability testing tool, we have identified the above-mentioned vulnerabilities in TiDB that may lead to database logic error.