StarRocks / starrocks

StarRocks, a Linux Foundation project, is a next-generation sub-second MPP OLAP database for full analytics scenarios, including multi-dimensional analytics, real-time analytics, and ad-hoc queries.
https://starrocks.io
Apache License 2.0
8.52k stars 1.73k forks source link

When the Modify the bucketing method and number of buckets is executed, Historical data of non-partitioned tables is preserved #48022

Open yesengh opened 1 month ago

yesengh commented 1 month ago

Steps to reproduce the behavior (Required)

1. partitioned tables -- 1.1 create table

CREATE TABLE IF NOT EXISTS p_orders
(
    create_time DATE   NOT NULL COMMENT "create time of an order",
    order_id    BIGINT NOT NULL COMMENT "id of an order",
    order_state string COMMENT "state of an order",
    total_price BIGINT COMMENT "price of an order"
) UNIQUE KEY(create_time, order_id)
PARTITION BY date_trunc('day', create_time)
DISTRIBUTED BY HASH(order_id) BUCKETS 2
PROPERTIES (
    "datacache.enable" = "true",
    "datacache.partition_duration" = "1 MONTH",
    "enable_async_write_back" = "false"
);

image

-- 1.2 put data

insert into p_orders values (date'2024-04-08', 123456, 'orders', 100);
insert into p_orders values (date'2024-04-08', 1234567, 'orders', 100);
insert into p_orders values (date'2024-04-08', 12345679, 'orders', 100);
insert into p_orders values (date'2024-04-08', 12345678, 'orders', 100);
insert into p_orders values (date'2024-04-07', 123456, 'pay', 98);
insert into p_orders values (date'2024-04-07', 1234567, 'pay', 1)

image

-- 1.3 check data select * from p_orders; image

show partitions from orders; image

select TABLE_NAME, TABLE_ID from information_schema.tables_config
where TABLE_SCHEMA = 'cdm_dev';

image

The data block corresponding to the file system: image image

-- 1.4 Modify the bucketing method and number of buckets ALTER TABLE orders DISTRIBUTED BY HASH(order_id) BUCKETS 1; image

-- 1.5 Check the file system data image image image

-- 1.6 conclusion Only the latest version of the data is retained

2. non-partitioned tables -- 2.1 create table

CREATE TABLE IF NOT EXISTS orders
(
    create_time DATE   NOT NULL COMMENT "create time of an order",
    order_id    BIGINT NOT NULL COMMENT "id of an order",
    order_state string COMMENT "state of an order",
    total_price BIGINT COMMENT "price of an order"
) UNIQUE KEY(create_time, order_id)
DISTRIBUTED BY HASH(order_id) BUCKETS 2
PROPERTIES (
    "datacache.enable" = "true",
    "datacache.partition_duration" = "1 MONTH",
    "enable_async_write_back" = "false"
);

image

-- 2.2 put data

insert into orders values (date'2024-04-08', 123456, 'orders', 100);
insert into orders values (date'2024-04-08', 1234567, 'orders', 100);
insert into orders values (date'2024-04-08', 12345679, 'orders', 100);
insert into orders values (date'2024-04-08', 12345678, 'orders', 100);
insert into orders values (date'2024-04-07', 123456, 'pay', 98);
insert into orders values (date'2024-04-07', 1234567, 'pay', 1);

image

-- 2.3 check data select * from orders; image

show partitions from orders; image

select TABLE_NAME, TABLE_ID from information_schema.tables_config
where TABLE_SCHEMA = 'cdm_dev';

image

The data block corresponding to the file system: image

-- 2.4 Modify the bucketing method and number of buckets ALTER TABLE p_orders DISTRIBUTED BY HASH(order_id) BUCKETS 1; image

-- 2.5 Check the file system data image image image

-- 2.6 conclusion image

Expected behavior (Required)

Type Expected
partitioned tables Delete historical data
non-partitioned Delete historical data

Real behavior (Required)

Type Expected As expected
partitioned tables Delete historical data Y
non-partitioned Delete historical data N

StarRocks version (Required)

Use HDFS for shared-data 3.3.0-19a3f66

yesengh commented 1 month ago

When the Modify the bucketing method and number of buckets is executed, partitioned tables be log receive a request to drop the table, but can't see the request for a table that is non-partitioned tables image

xiangguangyxg commented 1 month ago

Is there any log like "Unknown partition type:" in fe.log ?