StarRocks / starrocks

The world's fastest open query engine for sub-second analytics both on and off the data lakehouse. With the flexibility to support nearly any scenario, StarRocks provides best-in-class performance for multi-dimensional analytics, real-time analytics, and ad-hoc queries. A Linux Foundation project.
https://starrocks.io
Apache License 2.0
8.93k stars 1.79k forks source link

mv partition union rewrite failed #51668

Open leiyang0324 opened 1 month ago

leiyang0324 commented 1 month ago

Desc

When the time partition in the WHERE clause of a query is a proper subset of the materialized view (MV) partitions, the materialized view partition UNION rewrite will fail. The materialized view partition UNION rewrite will only succeed when the WHERE clause of the query includes all the partitions of the materialized view.

Steps to reproduce the behavior (Required)

  1. CREATE TABLE '...' CREATE database mv_test; use mv_test;

    CREATE TABLE area_dim_table ( area_code bigint(20) NOT NULL COMMENT "", area_name varchar(256) NOT NULL COMMENT "" ) ENGINE=OLAP DUPLICATE KEY(area_code) DISTRIBUTED BY HASH(area_code) PROPERTIES ( "replication_num" = "1", "in_memory" = "false", "storage_format" = "DEFAULT", "enable_persistent_index" = "false", "unique_constraints" = "area_code", "compression" = "LZ4" );

CREATE TABLE shop_dim_table ( shop_code bigint(20) NOT NULL COMMENT "", shop_name varchar(256) NOT NULL COMMENT "" ) ENGINE=OLAP DUPLICATE KEY(shop_code) DISTRIBUTED BY HASH(shop_code) PROPERTIES ( "replication_num" = "1", "in_memory" = "false", "storage_format" = "DEFAULT", "enable_persistent_index" = "false", "unique_constraints" = "shop_code", "compression" = "LZ4" );

CREATE TABLE fact_table ( shop_code bigint(20) NOT NULL COMMENT "", dim_public_date_info_date date NULL COMMENT "", area_code bigint(20) NOT NULL COMMENT "", measure_num bigint(20) NOT NULL COMMENT "" ) ENGINE=OLAP DUPLICATE KEY(shop_code) PARTITION BY RANGE(dim_public_date_info_date) (PARTITION p20240913 VALUES [("2024-09-13"), ("2024-09-14")), PARTITION p20240914 VALUES [("2024-09-14"), ("2024-09-15")), PARTITION p20240915 VALUES [("2024-09-15"), ("2024-09-16")), PARTITION p20240916 VALUES [("2024-09-16"), ("2024-09-17"))) DISTRIBUTED BY HASH(shop_code) PROPERTIES ( "replication_num" = "1", "dynamic_partition.enable" = "true", "dynamic_partition.time_unit" = "DAY", "dynamic_partition.time_zone" = "Asia/Shanghai", "dynamic_partition.start" = "-3", "dynamic_partition.end" = "3", "dynamic_partition.prefix" = "p", "dynamic_partition.history_partition_num" = "0", "in_memory" = "false", "storage_format" = "DEFAULT", "enable_persistent_index" = "false", "foreign_key_constraints" = "(shop_code) REFERENCES default_catalog.mv_test.shop_dim_table(shop_code);(area_code) REFERENCES default_catalog.mv_test.area_dim_table(area_code)", "compression" = "LZ4" );

  1. INSERT INTO '....' insert into area_dim_table values(1,'区域1'); insert into area_dim_table values(2,'区域2');

insert into shop_dim_table values(1,'门店1'); insert into shop_dim_table values(2,'门店2'); insert into shop_dim_table values(3,'门店3');

insert into fact_table values(1,'20241009',1,1); insert into fact_table values(2,'20241009',2,1); insert into fact_table values(3,'20241009',2,2); insert into fact_table values(1,'20241010',1,1); insert into fact_table values(2,'20241010',2,1); insert into fact_table values(3,'20241010',2,2);

  1. CREATE and refresh MV CREATE MATERIALIZED VIEW test_fact_all_join_mv DISTRIBUTED BY HASH(shop_code) PARTITION BY(dim_public_date_info_date) REFRESH DEFERRED MANUAL AS SELECT t0.dim_public_date_info_date,t0.shop_code,t0.area_code,t1.area_name,t2.shop_name,sum(measure_num) from fact_table AS t0 left join area_dim_table AS t1 on t0.area_code=t1.area_code left join shop_dim_table AS t2 on t0.shop_code=t2.shop_code group by t0.dim_public_date_info_date,t0.shop_code,t0.area_code,t1.area_name,t2.shop_name; REFRESH MATERIALIZED VIEW test_fact_all_join_mv;
  2. insert into insert into fact_table values(1,'20241011',1,1); insert into fact_table values(2,'20241011',2,1); insert into fact_table values(3,'20241011',2,2); 5.SELECT -- rewrite失败 TRACE REWRITE SELECT t0.dim_public_date_info_date,t0.shop_code,t0.area_code,t1.area_name,t2.shop_name,sum(measure_num) from fact_table AS t0 left join area_dim_table AS t1 on t0.area_code=t1.area_code left join shop_dim_table AS t2 on t0.shop_code=t2.shop_code where t0.dim_public_date_info_date in ('20241009','20241011') group by t0.dim_public_date_info_date,t0.shop_code,t0.area_code,t1.area_name,t2.shop_name;

explain SELECT t0.dim_public_date_info_date,t0.shop_code,t0.area_code,t1.area_name,t2.shop_name,sum(measure_num) from fact_table AS t0 left join area_dim_table AS t1 on t0.area_code=t1.area_code left join shop_dim_table AS t2 on t0.shop_code=t2.shop_code where t0.dim_public_date_info_date in ('20241009','20241011') group by t0.dim_public_date_info_date,t0.shop_code,t0.area_code,t1.area_name,t2.shop_name; --rewrite成功 TRACE REWRITE SELECT t0.dim_public_date_info_date,t0.shop_code,t0.area_code,t1.area_name,t2.shop_name,sum(measure_num) from fact_table AS t0 left join area_dim_table AS t1 on t0.area_code=t1.area_code left join shop_dim_table AS t2 on t0.shop_code=t2.shop_code where t0.dim_public_date_info_date in ('20241009','20241010','20241010') group by t0.dim_public_date_info_date,t0.shop_code,t0.area_code,t1.area_name,t2.shop_name;

explain SELECT t0.dim_public_date_info_date,t0.shop_code,t0.area_code,t1.area_name,t2.shop_name,sum(measure_num) from fact_table AS t0 left join area_dim_table AS t1 on t0.area_code=t1.area_code left join shop_dim_table AS t2 on t0.shop_code=t2.shop_code where t0.dim_public_date_info_date in ('20241009','20241010','20241010') group by t0.dim_public_date_info_date,t0.shop_code,t0.area_code,t1.area_name,t2.shop_name;

Expected behavior (Required)

Query has already been successfully rewritten by: test_fact_all_join_mv.

Real behavior (Required)

Query cannot be rewritten, please check the trace logs to find more information.

StarRocks version (Required)

LiShuMing commented 3 weeks ago

Starrocks v3.3 has optimized union rewrite this a lot, can you try the newest version of StarRocks v3.3?