StarRocks / starrocks

StarRocks, a Linux Foundation project, is a next-generation sub-second MPP OLAP database for full analytics scenarios, including multi-dimensional analytics, real-time analytics, and ad-hoc queries.
https://starrocks.io
Apache License 2.0
8.65k stars 1.75k forks source link

Features not compatible: Automatic Distribution (Bucketing), Colocation, and Automatic Partitioning #29192

Open dustinlineweber opened 1 year ago

dustinlineweber commented 1 year ago

Steps to reproduce the behavior (Required)

  1. Create two colocated tables that: A. Use automatic partitioning B. Use automatic Distribution Bucketing (do NOT specify a BUCKET <number> in the distribution key)

    CREATE TABLE IF NOT EXISTS `colocated_table_1`
    (
    `created_at`      datetime     NOT NULL,
    `a_number`        SMALLINT     NOT NULL
    )
    ENGINE=OLAP
    PRIMARY KEY(`created_at`,`a_number`)
    PARTITION BY date_trunc("day",created_at)
    DISTRIBUTED BY HASH(`a_number`)
    PROPERTIES(
       "colocate_with" = "cg_123456",
       "replication_num" = "1",
       "partition_live_number" = "1095"
    );
    
    CREATE TABLE IF NOT EXISTS `colocated_table_2`
    (
    `created_at`      datetime     NOT NULL,
    `a_number`        SMALLINT     NOT NULL
    )
    ENGINE=OLAP
    PRIMARY KEY(`created_at`,`a_number`)
    PARTITION BY date_trunc("day",created_at)
    DISTRIBUTED BY HASH(`a_number`)
    PROPERTIES(
       "colocate_with" = "cg_123456",
       "replication_num" = "1",
       "partition_live_number" = "1095"
    );
  2. Attempt to insert a row: insert into colocated_table_1 values ("2023-01-01",1);

Expected behavior (Required)

1 row(s) affected {'label':'...', 'status':'VISIBLE', 'txnId':'...'}

An alternative expected behavior: -Documentation mentions this restriction and -Creating a table with this type of definition throws an exception that prevents creation, instead of at time of use

Real behavior (Required)

Error Code: 1064. Insert has filtered data in strict mode, txn_id = 38442 tracking sql = select tracking_log from information_schema.load_tracking_logs where job_id=48957

The result of select tracking_log from information_schema.load_tracking_logs where job_id=48956; is: Error: The row create partition failed since Runtime error: automatic create partition failed. error:Colocate tables must have same bucket num: 0, with group 10007.48944.. Row: [2023-01-01 00:00:00, 1]

StarRocks version (Required)

Notes

It appears to work if you do ANY of the following:

  1. Change the tables to have DISTRIBUTED BY HASH(`a_number`) BUCKETS 16
  2. Change the tables to not use colocation
  3. Change the tables to not use automatic partitioning
Astralidea commented 1 year ago

Thank you for your feedback, we will think about how to solve it better.

github-actions[bot] commented 7 months ago

We have marked this issue as stale because it has been inactive for 6 months. If this issue is still relevant, removing the stale label or adding a comment will keep it active. Otherwise, we'll close it in 10 days to keep the issue queue tidy. Thank you for your contribution to StarRocks!

dustinlineweber commented 7 months ago

It does fail differently now, and in a better place (when creating the second table in my example), but the message makes me worry that when the automatic bucket count DID happen to match, it maybe wouldn't throw this error. An error more specific in this case to make it more clear would be better, and failing when creating the FIRST table in a colocation group that is inherently invalid would be ideal. Maybe something failing when the first table is created like Colocate table would not be valid in this configuration?

[2024-02-12 09:15:30] [42000][1064] Unexpected exception: Colocate tables must have same bucket num: 8, with group 10181.17183943, current info type: HASH; distribution columns: [a_number,];
jaogoy commented 3 months ago

Currently, you could remove the colocate_with parameter.

dustinlineweber commented 3 months ago

Currently, you could remove the colocate_with parameter.

Yep. In my original writeup I included 3 ways to solve it.

It appears to work if you do ANY of the following:

Change the tables to have DISTRIBUTED BY HASH(a_number) BUCKETS 16 Change the tables to not use colocation Change the tables to not use automatic partitioning

But since then it has been revealed this was not something that was supposed to be a valid combination. And an error was even implemented at creation time (awesome), but the error message isn't yet great for describing WHY what you are trying isn't working: https://github.com/StarRocks/starrocks/issues/29192#issuecomment-1938878920

And also it appears there MIGHT be situations it still might get past the check (when the bucket count DOES match).