StarRocks / starrocks

StarRocks, a Linux Foundation project, is a next-generation sub-second MPP OLAP database for full analytics scenarios, including multi-dimensional analytics, real-time analytics, and ad-hoc queries.
https://starrocks.io
Apache License 2.0

[inverted index] Altering a GIN-indexed column's data type from VARCHAR to BIGINT succeeds #44935

Closed chengqianli-git closed 1 month ago

chengqianli-git commented 2 months ago

Steps to reproduce the behavior (Required)

CREATE TABLE duplicate_table_demo_datatype_not_replicated_all_varchar (
  AAA DATETIME NOT NULL COMMENT "",
  BBB VARCHAR(200) NOT NULL COMMENT "",
  CCC VARCHAR(200) NOT NULL COMMENT "",
  DDD VARCHAR(20000) COMMENT "",
  EEE LARGEINT NULL COMMENT "",
  FFF DECIMAL(20,10) NULL COMMENT "",
  GGG VARCHAR(200) NULL COMMENT "",
  HHH FLOAT NULL COMMENT "",
  III BOOLEAN NULL COMMENT "",
  KKK CHAR(20) NULL COMMENT "",
  LLL STRING NULL COMMENT "",
  MMM VARCHAR(20) NULL COMMENT "",
  NNN BINARY NULL COMMENT "",
  OOO TINYINT NULL COMMENT "",
  PPP DATETIME NULL COMMENT "",
  QQQ ARRAY<INT> NULL COMMENT "",
  RRR JSON NULL COMMENT "",
  SSS MAP<INT,INT> NULL COMMENT "",
  TTT STRUCT<a INT, b INT> NULL COMMENT "",
  INDEX init_bitmap_index (KKK) USING BITMAP
)
DUPLICATE KEY(AAA, BBB, CCC)
PARTITION BY RANGE (`AAA`) (
  START ("1970-01-01") END ("2030-01-01") EVERY (INTERVAL 30 YEAR)
)
DISTRIBUTED BY HASH(`AAA`, `BBB`) BUCKETS 3
ORDER BY(`AAA`,`BBB`,`CCC`,`DDD`)
PROPERTIES (
  "replicated_storage" = "false",
  "replication_num" = "3",
  "storage_format" = "v2",
  "enable_persistent_index" = "true",
  "bloom_filter_columns" = "MMM",
  "unique_constraints" = "GGG"
);

CREATE INDEX idx ON duplicate_table_demo_datatype_not_replicated_all_varchar(LLL) USING GIN('parser' = 'english');
SHOW ALTER TABLE COLUMN WHERE TableName = 'duplicate_table_demo_datatype_not_replicated_all_varchar' ORDER BY JobId DESC LIMIT 1;

ALTER TABLE duplicate_table_demo_datatype_not_replicated_all_varchar MODIFY COLUMN LLL bigint;
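
To check the outcome, the schema-change job and the resulting table definition can be inspected once the ALTER has been submitted. This is only a sketch of the verification step: it reuses the same SHOW ALTER TABLE COLUMN statement as above and assumes SHOW CREATE TABLE is what produced the definition quoted under "Real behavior".

```sql
-- Check the state of the most recent schema-change job for the table
-- (same statement as used after CREATE INDEX above).
SHOW ALTER TABLE COLUMN
WHERE TableName = 'duplicate_table_demo_datatype_not_replicated_all_varchar'
ORDER BY JobId DESC LIMIT 1;

-- Once the job has finished, print the table definition and look at the
-- type of column LLL and at the GIN index entries.
SHOW CREATE TABLE duplicate_table_demo_datatype_not_replicated_all_varchar;
```

If the type change had been rejected as expected, the ALTER statement itself would return an error and `LLL` would keep its string type in the printed definition.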

Expected behavior (Required)

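The ALTER TABLE ... MODIFY COLUMN statement should fail, because `LLL` has a GIN index and is being changed from a string type to BIGINT.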

Real behavior (Required)

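The statement succeeds. Afterwards the table definition shows `LLL` as bigint(20) while a GIN index is still defined on it: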

| duplicate_table_demo_datatype_not_replicated_all_varchar | CREATE TABLE `duplicate_table_demo_datatype_not_replicated_all_varchar` (
  `AAA` datetime NOT NULL COMMENT "",
  `BBB` varchar(200) NOT NULL COMMENT "",
  `CCC` varchar(200) NOT NULL COMMENT "",
  `DDD` varchar(20000) NULL COMMENT "",
  `EEE` largeint(40) NULL COMMENT "",
  `FFF` decimal(20, 10) NULL COMMENT "",
  `GGG` varchar(200) NULL COMMENT "",
  `HHH` float NULL COMMENT "",
  `III` boolean NULL COMMENT "",
  `KKK` char(20) NULL COMMENT "",
  `LLL` bigint(20) NULL COMMENT "",
  `MMM` varchar(20) NULL COMMENT "",
  `NNN` varbinary NULL COMMENT "",
  `OOO` tinyint(4) NULL COMMENT "",
  `PPP` datetime NULL COMMENT "",
  `QQQ` array<int(11)> NULL COMMENT "",
  `RRR` json NULL COMMENT "",
  `SSS` map<int(11),int(11)> NULL COMMENT "",
  `TTT` struct<a int(11), b int(11)> NULL COMMENT "",
  INDEX init_bitmap_index (`KKK`) USING BITMAP COMMENT '',
  INDEX idx (`DDD`) USING GIN("parser" = "english") COMMENT '',
  INDEX idx2 (`LLL`) USING GIN("parser" = "english") COMMENT ''
) ENGINE=OLAP
DUPLICATE KEY(`AAA`, `BBB`, `CCC`)
PARTITION BY RANGE(`AAA`)
(PARTITION p1970 VALUES [("1970-01-01 00:00:00"), ("2000-01-01 00:00:00")),
PARTITION p2000 VALUES [("2000-01-01 00:00:00"), ("2030-01-01 00:00:00")))
DISTRIBUTED BY HASH(`AAA`, `BBB`) BUCKETS 3
ORDER BY(`AAA`, `BBB`, `CCC`, `DDD`)
PROPERTIES (
"bloom_filter_columns" = "MMM",
"compression" = "LZ4",
"fast_schema_evolution" = "true",
"replicated_storage" = "false",
"replication_num" = "3",
"unique_constraints" = "default_catalog.debug2.duplicate_table_demo_datatype_not_replicated_all_varchar.GGG"
); |

StarRocks version (Required)

srlch commented 2 months ago

https://github.com/StarRocks/starrocks/pull/44970