StarRocks / starrocks

The world's fastest open query engine for sub-second analytics both on and off the data lakehouse. With the flexibility to support nearly any scenario, StarRocks provides best-in-class performance for multi-dimensional analytics, real-time analytics, and ad-hoc queries. A Linux Foundation project.
https://starrocks.io
Apache License 2.0

ConcurrentModificationException when querying during drop partition #53312

Open monkeyboy123 opened 5 days ago

monkeyboy123 commented 5 days ago

Steps to reproduce the behavior (Required)


1. CREATE TABLE `test_reproduce` (
  `aa` varchar(64) NOT NULL COMMENT "",
  `create_date` datetime NOT NULL COMMENT "creation time",
  `update_date` datetime NOT NULL COMMENT "update time"
) ENGINE = OLAP PRIMARY KEY(`aa`, `create_date`)
PARTITION BY date_trunc('month', create_date) DISTRIBUTED BY HASH(`aa`)
PROPERTIES (
    "replication_num" = "3",
    "in_memory" = "false",
    "enable_persistent_index" = "true",
    "replicated_storage" = "true",
    "partition_live_number" = "1",
    "compression" = "LZ4"
);
2. Insert some data into test_reproduce.
3. Run select * from test_reproduce while a partition is being dropped.

Expected behavior (Required)

The query returns the inserted rows.

Real behavior (Required)

java.util.ConcurrentModificationException: null
    at java.util.LinkedHashMap$LinkedHashIterator.nextNode(LinkedHashMap.java:719) ~[?:?]
    at java.util.LinkedHashMap$LinkedValueIterator.next(LinkedHashMap.java:746) ~[?:?]
    at com.starrocks.sql.optimizer.statistics.StatisticsCalcUtils.deltaRows(StatisticsCalcUtils.java:176) ~[starrocks-fe.jar:?]
    at com.starrocks.sql.optimizer.statistics.StatisticsCalcUtils.getTableRowCount(StatisticsCalcUtils.java:114) ~[starrocks-fe.jar:?]
    at com.starrocks.sql.optimizer.statistics.StatisticsCalculator.computeOlapScanNode(StatisticsCalculator.java:257) ~[starrocks-fe.jar:?]
    at com.starrocks.sql.optimizer.statistics.StatisticsCalculator.visitLogicalOlapScan(StatisticsCalculator.java:225) ~[starrocks-fe.jar:?]
    at com.starrocks.sql.optimizer.statistics.StatisticsCalculator.visitLogicalOlapScan(StatisticsCalculator.java:161) ~[starrocks-fe.jar:?]
    at com.starrocks.sql.optimizer.operator.logical.LogicalOlapScanOperator.accept(LogicalOlapScanOperator.java:149) ~[starrocks-fe.jar:?]
    at com.starrocks.sql.optimizer.statistics.StatisticsCalculator.estimatorStats(StatisticsCalculator.java:177) ~[starrocks-fe.jar:?]
    at com.starrocks.sql.optimizer.task.DeriveStatsTask.execute(DeriveStatsTask.java:57) ~[starrocks-fe.jar:?]
    at com.starrocks.sql.optimizer.task.SeriallyTaskScheduler.executeTasks(SeriallyTaskScheduler.java:69) ~[starrocks-fe.jar:?]
    at com.starrocks.sql.optimizer.Optimizer.memoOptimize(Optimizer.java:595) ~[starrocks-fe.jar:?]
    at com.starrocks.sql.optimizer.Optimizer.optimizeByCost(Optimizer.java:201) ~[starrocks-fe.jar:?]
    at com.starrocks.sql.optimizer.Optimizer.optimize(Optimizer.java:134) ~[starrocks-fe.jar:?]
    at com.starrocks.sql.StatementPlanner.createQueryPlan(StatementPlanner.java:146) ~[starrocks-fe.jar:?]
    at com.starrocks.sql.StatementPlanner.planQuery(StatementPlanner.java:121) ~[starrocks-fe.jar:?]
    at com.starrocks.sql.StatementPlanner.plan(StatementPlanner.java:92) ~[starrocks-fe.jar:?]
    at com.starrocks.sql.StatementPlanner.plan(StatementPlanner.java:61) ~[starrocks-fe.jar:?]
    at com.starrocks.qe.StmtExecutor.execute(StmtExecutor.java:456) ~[starrocks-fe.jar:?]
    at com.starrocks.qe.ConnectProcessor.handleQuery(ConnectProcessor.java:392) ~[starrocks-fe.jar:?]
    at com.starrocks.qe.ConnectProcessor.dispatch(ConnectProcessor.java:506) ~[starrocks-fe.jar:?]
    at com.starrocks.qe.ConnectProcessor.processOnce(ConnectProcessor.java:782) ~[starrocks-fe.jar:?]
    at com.starrocks.mysql.nio.ReadListener.lambda$handleEvent$0(ReadListener.java:69) ~[starrocks-fe.jar:?]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) ~[?:?]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) ~[?:?]
    at java.lang.Thread.run(Thread.java:829) ~[?:?]
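
For readers unfamiliar with this failure mode, the following is a minimal single-threaded sketch of how iterating over LinkedHashMap.values() fails fast when the map is structurally modified mid-iteration. The class CmeDemo and its partition map are hypothetical stand-ins, not StarRocks code; in the report above the modification would come from another thread dropping a partition.

```java
import java.util.ConcurrentModificationException;
import java.util.LinkedHashMap;
import java.util.Map;

public class CmeDemo {
    // Hypothetical stand-in for a table's partition map (partition id -> row count).
    static Map<Long, Long> newPartitions() {
        Map<Long, Long> partitions = new LinkedHashMap<>();
        partitions.put(1L, 100L);
        partitions.put(2L, 200L);
        partitions.put(3L, 300L);
        return partitions;
    }

    // Iterates over the values while removing an entry, as a concurrent
    // partition drop might. Returns true if the iterator fails fast.
    static boolean sumWhileDropping(Map<Long, Long> partitions) {
        long total = 0;
        try {
            for (Long rows : partitions.values()) {
                total += rows;
                partitions.remove(3L); // structural modification during iteration
            }
        } catch (ConcurrentModificationException e) {
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println("CME thrown: " + sumWhileDropping(newPartitions()));
    }
}
```

The exception surfaces on the next call to the iterator's next(), which matches the LinkedHashMap$LinkedHashIterator.nextNode frame at the top of the trace.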

StarRocks version (Required)

3.1.7 (as reported in the comments below)

murphyatwork commented 5 days ago

What's the exact version of StarRocks you're using?

monkeyboy123 commented 5 days ago

> What's the exact version of StarRocks you're using?

3.1.7, but I think other 3.x versions also hit this.

murphyatwork commented 1 day ago

> > What's the exact version of StarRocks you're using?
>
> 3.1.7, but I think other 3.x versions also hit this.

This is not expected: for a query, we use a kind of optimistic concurrency control that copies the Table structure to avoid concurrent updates. We may need to dig into this exception to find the root cause.
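
The copy-before-read idea mentioned above can be sketched as follows. This is only an illustration under stated assumptions: SnapshotDemo and its nested Table class are hypothetical and are not the StarRocks implementation, and the sketch shows only the copy step, not any version check or retry that an optimistic scheme might add.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class SnapshotDemo {
    // Hypothetical table holder; not a StarRocks class.
    static class Table {
        private final Map<Long, Long> partitionRows = new LinkedHashMap<>();

        synchronized void addPartition(long id, long rows) {
            partitionRows.put(id, rows);
        }

        synchronized void dropPartition(long id) {
            partitionRows.remove(id);
        }

        // Copy the values under the lock so the reader gets a private snapshot.
        synchronized List<Long> snapshotRowCounts() {
            return new ArrayList<>(partitionRows.values());
        }
    }

    // Sums row counts from the snapshot; a drop during the scan does not
    // touch the copied list, so no ConcurrentModificationException is thrown.
    static long sumWithDropDuringScan(Table table) {
        List<Long> snapshot = table.snapshotRowCounts();
        long total = 0;
        for (Long rows : snapshot) {
            total += rows;
            table.dropPartition(3L);
        }
        return total;
    }

    public static void main(String[] args) {
        Table table = new Table();
        table.addPartition(1L, 100L);
        table.addPartition(2L, 200L);
        table.addPartition(3L, 300L);
        System.out.println(sumWithDropDuringScan(table));
    }
}
```

If any code path iterates the live map instead of the copy, the protection is lost, which is one plausible way an exception like the one reported could still occur.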

murphyatwork commented 1 day ago

We fixed several related bugs after 3.1.7. Could you upgrade to the latest version and test again?

monkeyboy123 commented 1 day ago

> We fixed several related bugs after 3.1.7. Could you upgrade to the latest version and test again?

Got it, but can I fix it by submitting another PR for version 3.1.7?

murphyatwork commented 1 day ago

@monkeyboy123 I would suggest running git blame on the modifications around com.starrocks.sql.analyzer.AnalyzerUtils.OlapTableCollector; it is used to copy the Table data structure before the optimizer runs.

monkeyboy123 commented 1 day ago

> @monkeyboy123 I would suggest running git blame on the modifications around com.starrocks.sql.analyzer.AnalyzerUtils.OlapTableCollector; it is used to copy the Table data structure before the optimizer runs.

@murphyatwork Thanks. Here is another PR, pr-53441; if you have time, please take a look.