pingcap / tidb

TiDB - the open-source, cloud-native, distributed SQL database designed for modern applications.
https://pingcap.com
Apache License 2.0
37.2k stars 5.84k forks source link

Different div precision between tidb and tiflash #55294

Open pcqz opened 2 months ago

pcqz commented 2 months ago

Bug Report

Please answer these questions before submitting your issue. Thanks!

1. Minimal reproduce step (Required)

create table t(id int, num decimal);
insert into t values(1,100);
alter table t set tiflash replica 1;
select /*+ read_from_storage(tikv[t]) */  length(sum(num)/10) from t group by id;
select /*+ read_from_storage(tiflash[t]) */  length(sum(num)/10) from t group by id;

2. What did you expect to see? (Required)

The div result in tidb has the same length with tiflash.

3. What did you see instead (Required)

mysql> select /*+ read_from_storage(tikv[t]) */  length(sum(num)/10) from t group by id;
+---------------------+
| length(sum(num)/10) |
+---------------------+
|                  12 |
+---------------------+
1 row in set (0.00 sec)

mysql> select /*+ read_from_storage(tiflash[t]) */  length(sum(num)/10) from t group by id;
+---------------------+
| length(sum(num)/10) |
+---------------------+
|                   7 |
+---------------------+
1 row in set (0.04 sec)

mysql> select /*+ read_from_storage(tikv[t]) */ sum(num)/10 from t group by id union select /*+ read_from_storage(tiflash[t]) */ sum(num)/10 from t group by id;
+-------------+
| sum(num)/10 |
+-------------+
|     10.0000 |
|     10.0000 |
+-------------+
2 rows in set (0.02 sec)

4. What is your TiDB version? (Required)

v7.1.5, master

windtalker commented 2 months ago

The root cause of this issue is that in TiDB/MySQL, the schema of decimal div is different in runtime and compile time. the schema of sum(num)/10 is Decimal(37,4) at compile time, but during runtime the schema is something like Decimal(42,9), while in TiFlash the schema is exactaly Decimal(37,4), so in the union, the input is 10.000000000(from TiDB) and 10.0000(from TiFlash), then the output is 10.000000000 and 10.0000. At the end of the query, the decimal 10.000000000 will be convert to compile time schema, which is Decimal(37,4), so the output is two 10.0000. In fact, in TiDB you can easily proceduce semilar cases even without TiFlash, like

mysql> select 12346/123456789 union select 12345/123456789;
+-----------------+
| 12346/123456789 |
+-----------------+
|          0.0001 |
|          0.0001 |
+-----------------+
2 rows in set (0.00 sec)

It is not easy to fix since it is kind of a feature in MySQL/TiDB. As a workaround, you can use cast to align the runtime schema with the compile time schema explicitly, then the union will only return 1 rows:

mysql> select /*+ read_from_storage(tikv[t]) */ cast(sum(num)/10 as decimal(37,4)) from t group by id union select /*+ read_from_storage(tiflash[t]) */ sum(num)/10 from t group by id;
+------------------------------------+
| cast(sum(num)/10 as decimal(37,4)) |
+------------------------------------+
|                            10.0000 |
+------------------------------------+
1 row in set (0.05 sec)