StarRocks / starrocks

StarRocks, a Linux Foundation project, is a next-generation sub-second MPP OLAP database for full analytics scenarios, including multi-dimensional analytics, real-time analytics, and ad-hoc queries.
https://starrocks.io
Apache License 2.0
8.65k stars 1.75k forks source link

The query return wrong result. #5201

Closed yongbingwang closed 2 years ago

yongbingwang commented 2 years ago

Steps to reproduce the behavior (Required)

  1. CREATE TABLE '...'
    CREATE TABLE `duplicate_table_with_null` ( `k1`  date, `k2`  datetime, `k3`  char(20), `k4`  varchar(20), `k5`  boolean, `k6`  tinyint, `k7`  smallint, `k8`  int, `k9`  bigint, `k10` largeint, `k11` float, `k12` double, `k13` decimal(27,9) ) ENGINE=OLAP DUPLICATE KEY(`k1`, `k2`, `k3`, `k4`, `k5`) COMMENT "OLAP" DISTRIBUTED BY HASH(`k1`, `k2`, `k3`) BUCKETS 3 PROPERTIES ( "replication_num" = "1", "storage_format" = "v2" );
  2. INSERT INTO '....'
    curl --location-trusted -u root: -T /home/disk1/jenkins/workspace/branch_release/StarRocksTest/lib/../common/data/basic_types_data -XPUT -H label:stream_load_89df374e_bee1_11ec_9ffc_00163e0e489a -H column_separator:  http://xxx:8234/api/test_aggregation_89ddba54_bee1_11ec_9ffc_00163e0e489a/duplicate_table_with_null/_stream_load
  3. SELECT '....'
    
    mysql> select avg(k9) from duplicate_table_with_null;
    +------------------------+
    | avg(k9)                |
    +------------------------+
    | -2.304717109306825e+18 |
    +------------------------+
    1 row in set (0.02 sec)

mysql> select sum(k9)/count(k9) from duplicate_table_with_null; +-------------------------+ | (sum(k9)) / (count(k9)) | +-------------------------+ | 32767.5 | +-------------------------+ 1 row in set (0.03 sec)


### Expected behavior (Required)
32767.5
### Real behavior (Required)
2.304717109306825e+18
### StarRocks version (Required)
 - You can get the StarRocks version by executing SQL `select current_version()`
 - 2f51014
ZiheLiu commented 2 years ago

The result of select sum(k9)/count(k9) is overflow, because select sum(k9) returns int64, but the real result of sum(k9) is less than INT64_MIN.

-- Overflow.
mysql> select sum(k9) from duplicate_table_with_null;
+------------+
| sum(k9)    |
+------------+
| 2147450880 |
+------------+

- - Real result.
mysql> select sum(cast(k9 as largeint)) from duplicate_table_with_null;
+---------------------------+
| sum(CAST(k9 AS LARGEINT)) |
+---------------------------+
| -604462909807312439902208 |
+---------------------------+

mysql>  select sum(cast(k9 as largeint))/count(k9) from duplicate_table_with_null;
+-------------------------------------------+
| (sum(CAST(k9 AS LARGEINT))) / (count(k9)) |
+-------------------------------------------+
|                    -9.223372036854743e+18 |
+-------------------------------------------+

However, there is still a bug for avg(k9), because avg(k9) returns double and shouldn't be overflow. But it returns 2.304717109306825e+18 instead of -9.223372036854743e+18.

ZiheLiu commented 2 years ago

Fixed by #5325.