StarRocks / starrocks

The world's fastest open query engine for sub-second analytics both on and off the data lakehouse. With the flexibility to support nearly any scenario, StarRocks provides best-in-class performance for multi-dimensional analytics, real-time analytics, and ad-hoc queries. A Linux Foundation project.
https://starrocks.io
Apache License 2.0
9.23k stars 1.83k forks source link

Incorrect Query Result With Substr Partition #48335

Open ShaoxunLi opened 4 months ago

ShaoxunLi commented 4 months ago

Steps to reproduce the behavior (Required)

  1. CREATE TABLE bill_detail ( bill_code varchar(200) NOT NULL DEFAULT "" COMMENT "" ) ENGINE=OLAP PRIMARY KEY(bill_code) PARTITION BY RANGE(cast(substr(bill_code, 3, 13) as bigint)) (PARTITION p1 VALUES [('0'), ('5000000')), PARTITION p2 VALUES [('5000000'), ('10000000')), PARTITION p3 VALUES [('10000000'), ('15000000')), PARTITION p4 VALUES [('15000000'), ('20000000')), PARTITION p999 VALUES[('2921712368983'), ('2921712368985')) ) DISTRIBUTED BY HASH(bill_code) BUCKETS 10;

  2. INSERT INTO bill_detail VALUES(5000000), (9000000), (6000000), (15000001);

  3. SELECT * FROM bill_detail WHERE bill_code > 1;

Expected behavior (Required)

show all data;

Real behavior (Required)

The query result is as follows:

image

StarRocks version (Required)

3.1.10

ShaoxunLi commented 4 months ago

The data is improperly filtered through partitions.

ShaoxunLi commented 4 months ago

The PR is used as a reference:https://github.com/StarRocks/starrocks/pull/23059 Is this the expected result? Thanks! @Astralidea

chenminghua8 commented 4 months ago

This feature is not used in this way. It is mainly used to perform range partitioning on string columns with prefixes. The data you insert should have a prefix (such as INSERT INTO bill_detail VALUES('FX5000000'), ('FX9000000'), ('FX6000000'), ('FX15000001');)

ShaoxunLi commented 4 months ago

This feature is not used in this way. It is mainly used to perform range partitioning on string columns with prefixes. The data you insert should have a prefix (such as INSERT INTO bill_detail VALUES('FX5000000'), ('FX9000000'), ('FX6000000'), ('FX15000001');)

image