pingcap / tidb

TiDB is an open-source, cloud-native, distributed, MySQL-Compatible database for elastic scale and real-time analytics. Try AI-powered Chat2Query free at : https://www.pingcap.com/tidb-serverless/
https://pingcap.com
Apache License 2.0
36.88k stars 5.8k forks source link

query gets incorrect result when using binary type #50296

Closed wjhuang2016 closed 1 month ago

wjhuang2016 commented 8 months ago

Bug Report

Please answer these questions before submitting your issue. Thanks!

1. Minimal reproduce step (Required)

CREATE TABLE `t2592d2ba` (
  `col_1` int(10) unsigned NOT NULL DEFAULT '4218867186',
  `col_2` tinyint(1) NOT NULL,
  UNIQUE KEY `idx_1` (`col_1`),
  PRIMARY KEY (`col_1`) /*T![clustered_index] NONCLUSTERED */,
  UNIQUE KEY `idx_3` (`col_1`,`col_2`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_general_ci
PARTITION BY HASH (`col_1`) PARTITIONS 6;
INSERT INTO `t2592d2ba` VALUES (2448007782,1), (527670686,1), (186019110,1), (2717803868,1), (1298903873,1), (898070011,1), (546731633,1), (1862079368,1), (1308738411,1), (2642905701,1), (2174574170,1), (3671727434,1), (3241957545,1), (2903973206,1), (4263152544,1), (482369162,1), (3287475707,1), (1079500981,1), (3924782077,1), (2172379159,1), (4282612844,1), (4219193657,1), (4085565346,1), (4176141291,0), (365718721,0), (4118127292,0), (3949012407,0), (848225649,0), (548783475,0), (3582322467,0), (2640690466,1), (1277880725,0), (2066827082,0), (795537339,1), (2658290493,1), (2467967102,1), (1042078725,0), (2184825916,1), (3052066128,0), (144842590,1), (2194698418,1), (3851266868,1), (2663359401,1), (2893111043,0), (839415446,0), (4041645171,0), (1182226140,0), (1681827205,1), (2857994129,0), (402899433,1), (2577971043,0), (4119144277,1), (546569302,1), (4290939649,0), (2,1), (3849318389,0), (1307165944,0), (3878890017,1), (184805261,1), (2856342507,1), (2147483648,1), (3014668728,1), (311357370,1), (3286555982,1), (4138590712,1), (1578146154,1), (947522718,0), (1151828680,0), (3000208780,1), (3177749430,1), (1074938063,1), (4165240487,1), (696995924,1), (3798067517,1), (3513513411,0), (1488231769,1), (0,1);

SELECT  WEIGHT_STRING(`t2592d2ba`.`col_2`) is null as col_12051 FROM `t2592d2ba` WHERE `t2592d2ba`.`col_2` = 0;

2. What did you expect to see? (Required)

`col_12051 is null` should be zero.

3. What did you see instead (Required)

mysql> SELECT  WEIGHT_STRING(`t2592d2ba`.`col_2`) is null as col_12051 FROM `t2592d2ba` WHERE `t2592d2ba`.`col_2` = 0;
+-----------+
| col_12051 |
+-----------+
|         1 |
|         1 |
|         1 |
|         1 |
|         1 |
|         1 |
|         1 |
|         1 |
|         1 |
|         1 |
|         1 |
|         1 |
|         1 |
|         1 |
|         1 |
|         1 |
|         1 |
|         1 |
|         1 |
|         1 |
|         1 |
|         1 |
|         1 |
+-----------+
23 rows in set, 1 warning (0.00 sec)

mysql> SELECT  SUM(WEIGHT_STRING(`t2592d2ba`.`col_2`) is null) as col_12051 FROM `t2592d2ba` WHERE `t2592d2ba`.`col_2` = 0;
+-----------+
| col_12051 |
+-----------+
|         0 |
+-----------+
1 row in set, 1 warning (0.01 sec)

4. What is your TiDB version? (Required)

56c619f1f5ea8e3f9970fe664729074b5123683a

XuHuaiyu commented 8 months ago
tidb:4000 [test]> SELECT  WEIGHT_STRING(`t2592d2ba`.`col_2`) is null as col_12051 FROM `t2592d2ba` where col_1 = 2448007782;
+-----------+
| col_12051 |
+-----------+
|         1 |
+-----------+
1 row in set, 1 warning (0.00 sec)

tidb:4000 [test]> desc SELECT  WEIGHT_STRING(`t2592d2ba`.`col_2`) is null as col_12051 FROM `t2592d2ba` where col_1 = 2448007782;
+-------------------+---------+------+-------------------------------+-----------------------------------------------------------------------------+
| id                | estRows | task | access object                 | operator info                                                               |
+-------------------+---------+------+-------------------------------+-----------------------------------------------------------------------------+
| Projection_6      | 1.00    | root |                               | isnull(weight_string(cast(test.t2592d2ba.col_2, var_string(20))))->Column#3 |
| └─Point_Get_7     | 1.00    | root | table:t2592d2ba, partition:p0 | handle:2448007782                                                           |
+-------------------+---------+------+-------------------------------+-----------------------------------------------------------------------------+
2 rows in set, 1 warning (0.00 sec)
tidb:4000 [test]> SELECT  WEIGHT_STRING(cast(`t2592d2ba`.`col_2` as char(20))) is null as col_12051 FROM `t2592d2ba` where col_1 = 2448007782;
+-----------+
| col_12051 |
+-----------+
|         0 |
+-----------+
1 row in set, 1 warning (0.00 sec)

tidb:4000 [test]> desc SELECT  WEIGHT_STRING(cast(`t2592d2ba`.`col_2` as char(20))) is null as col_12051 FROM `t2592d2ba` where col_1 = 2448007782;
+-------------------+---------+------+-------------------------------+-----------------------------------------------------------------------------+
| id                | estRows | task | access object                 | operator info                                                               |
+-------------------+---------+------+-------------------------------+-----------------------------------------------------------------------------+
| Projection_6      | 1.00    | root |                               | isnull(weight_string(cast(test.t2592d2ba.col_2, var_string(20))))->Column#3 |
| └─Point_Get_7     | 1.00    | root | table:t2592d2ba, partition:p0 | handle:2448007782                                                           |
+-------------------+---------+------+-------------------------------+-----------------------------------------------------------------------------+
2 rows in set, 1 warning (0.00 sec)
XuHuaiyu commented 8 months ago
tidb:4000 [test]> SELECT  WEIGHT_STRING(`t2592d2ba`.`col_2`) FROM `t2592d2ba` where col_1 = 2448007782;
+------------------------------------------------------------------------+
| WEIGHT_STRING(`t2592d2ba`.`col_2`)                                     |
+------------------------------------------------------------------------+
| NULL                                                                   |
+------------------------------------------------------------------------+
1 row in set, 1 warning (0.00 sec)

tidb:4000 [test]> SELECT  WEIGHT_STRING(cast(`t2592d2ba`.`col_2` as char(20))) FROM `t2592d2ba` where col_1 = 2448007782;
+------------------------------------------------------------------------------------------------------------+
| WEIGHT_STRING(cast(`t2592d2ba`.`col_2` as char(20)))                                                       |
+------------------------------------------------------------------------------------------------------------+
| 0x1C3E                                                                                                     |
+------------------------------------------------------------------------------------------------------------+
1 row in set, 1 warning (0.00 sec)
XuHuaiyu commented 8 months ago

WEIGHT_STRING() is a debugging function intended for internal use. Its behavior can change without notice between MySQL versions. It can be used for testing and debugging of collations, especially if you are adding a new collation.

The severity of this issue can be marked as major or moderate

XuHuaiyu commented 8 months ago

*expression.builtinWeightStringNullSig is built when executing "SELECT WEIGHT_STRING(t2592d2ba.col_2)"

*expression.builtinWeightStringSig is built when executing "SELECT WEIGHT_STRING(cast(t2592d2ba.col_2 as char(20))) "

kennedy8312 commented 5 months ago

Seeing the same issue on v5.4.0 as well.

solotzg commented 1 month ago

According to the implementation https://github.com/pingcap/tidb/blob/568cc224f7083925686d2212faf7043775542ce0/expression/builtin_string.go#L3688-L3690, weightString always returns null for numberic type. It is a designed behavior rather than a bug.

According to MySQL manual, If the input str is numeric, the result of WEIGHT_STRING() in MySQL is undefined.

cc @XuHuaiyu @bb7133

XuHuaiyu commented 1 month ago

MySQL 5.7

mysql> SELECT  WEIGHT_STRING(`t2592d2ba`.`col_2`) is null as col_12051 FROM `t2592d2ba` WHERE `t2592d2ba`.`col_2` = 0;
+-----------+
| col_12051 |
+-----------+
|         1 |
|         1 |
|         1 |
|         1 |
|         1 |
|         1 |
|         1 |
|         1 |
|         1 |
|         1 |
|         1 |
|         1 |
|         1 |
|         1 |
|         1 |
|         1 |
|         1 |
|         1 |
|         1 |
|         1 |
|         1 |
|         1 |
|         1 |
+-----------+
23 rows in set (0.00 sec)

mysql> select weight_string(0);
+------------------+
| weight_string(0) |
+------------------+
| NULL             |
+------------------+
1 row in set (0.00 sec)

mysql> select version();
+-----------+
| version() |
+-----------+
| 5.7.23    |
+-----------+
1 row in set (0.00 sec)
XuHuaiyu commented 1 month ago

We do not plan to fix this issue, as the MySQL documentation does not clearly specify the return value when the parameter type is numeric. TiDB's current implementation is consistent with MySQL 5.7. Additionally, this is a debug function, not a user function that would be used in a production environment, so we will not be making any changes.