Stream load data into StarRocks with flink-doris-connector.
Some details about the setup:
- Number of Flink tasks: 7
- Total QPS across all tasks: 20000
- Number of BE nodes: 7
- Memory per BE: 60 GB
### Expected behavior (Required)
- Disk I/O is evenly distributed across the BE nodes.
- The tasks run normally.
### Real behavior (Required)
- Only two of the BE machines are working.
- Tasks fail frequently and restart with the following error:
Caused by: org.apache.flink.util.SerializedThrowable: Writing records to Doris failed.
at org.apache.flink.connector.doris.sink.DorisSinkFunction.checkFlushException(DorisSinkFunction.java:307) ~[flink-connector-doris_2.11-momo-1.14.3.jar:momo-1.14.3]
at org.apache.flink.connector.doris.sink.DorisSinkFunction.flush(DorisSinkFunction.java:172) ~[flink-connector-doris_2.11-momo-1.14.3.jar:momo-1.14.3]
at org.apache.flink.connector.doris.sink.DorisSinkFunction.lambda$open$0(DorisSinkFunction.java:105) ~[flink-connector-doris_2.11-momo-1.14.3.jar:momo-1.14.3]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[?:1.8.0_121]
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) ~[?:1.8.0_121]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) ~[?:1.8.0_121]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) ~[?:1.8.0_121]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) ~[?:1.8.0_121]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) ~[?:1.8.0_121]
... 1 more
Caused by: org.apache.flink.util.SerializedThrowable: java.io.IOException: Doris Stream load failed, load result={
"TxnId": 22245628,
"Label": "81aef16f-fe94-44ce-9167-2fa477abda17_1664262491507",
"Status": "Publish Timeout",
"Message": "Publish timeout. The data will be visible after a while",
"NumberTotalRows": 31,
"NumberLoadedRows": 31,
"NumberFilteredRows": 0,
"NumberUnselectedRows": 0,
"LoadBytes": 14156,
"LoadTimeMs": 7714,
"BeginTxnTimeMs": 121,
"StreamLoadPutTimeMs": 35,
"ReadDataTimeMs": 0,
"WriteDataTimeMs": 56,
"CommitAndPublishTimeMs": 0
}
at org.apache.flink.connector.doris.sink.DorisSinkFunction.flush(DorisSinkFunction.java:187) ~[flink-connector-doris_2.11-momo-1.14.3.jar:momo-1.14.3]
at org.apache.flink.connector.doris.sink.DorisSinkFunction.lambda$open$0(DorisSinkFunction.java:105) ~[flink-connector-doris_2.11-momo-1.14.3.jar:momo-1.14.3]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[?:1.8.0_121]
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) ~[?:1.8.0_121]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) ~[?:1.8.0_121]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) ~[?:1.8.0_121]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) ~[?:1.8.0_121]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) ~[?:1.8.0_121]
... 1 more
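The load result above reports `"Status": "Publish Timeout"` with all 31 rows loaded and none filtered, and its message says the data will become visible later. Going by that message, the status reads like a committed transaction whose publish step is lagging, not like lost data. The sketch below shows one way a sink could classify such a response as non-fatal instead of throwing. It is an illustration only, not the connector's actual code: the class `StreamLoadStatusSketch`, the `Outcome` enum, and the regex-based status extraction are all hypothetical, and treating `Publish Timeout` as committed is an assumption that should be checked against the Stream Load documentation for the StarRocks version in use.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: classifies a Stream Load JSON response by its "Status" field.
public class StreamLoadStatusSketch {

    enum Outcome { COMMITTED, COMMITTED_NOT_YET_VISIBLE, FAILED }

    // Pulls the value of the "Status" field out of the response text.
    // A real implementation would use a JSON parser; a regex keeps the sketch dependency-free.
    private static final Pattern STATUS =
            Pattern.compile("\"Status\"\\s*:\\s*\"([^\"]*)\"");

    static String extractStatus(String response) {
        Matcher m = STATUS.matcher(response);
        return m.find() ? m.group(1) : "";
    }

    static Outcome classify(String response) {
        switch (extractStatus(response)) {
            case "Success":
                return Outcome.COMMITTED;
            case "Publish Timeout":
                // Assumption: the transaction is committed and only visibility is delayed,
                // so retrying the batch is unnecessary.
                return Outcome.COMMITTED_NOT_YET_VISIBLE;
            default:
                // Any other or missing status is treated as a failure.
                return Outcome.FAILED;
        }
    }

    public static void main(String[] args) {
        String response = "{ \"TxnId\": 22245628, \"Status\": \"Publish Timeout\", "
                + "\"NumberTotalRows\": 31, \"NumberLoadedRows\": 31 }";
        System.out.println(classify(response));
    }
}
```

With a classification like this, the sink could log a warning for the delayed-visibility case and keep running, which would avoid the restart loop. It would not address the slow publish itself.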
### StarRocks version (Required)
- You can get the StarRocks version by executing SQL `select current_version()`
- 2.2.0
### Steps to reproduce the behavior (Required)
Create an aggregate-model table that is partitioned by stat_date and distributed by key; every aggregate key is a dimension with only a small number of candidate values.
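A table distributed by low-cardinality keys may explain why only two machines do any work: when rows are routed by a hash of the distribution key, the number of buckets that receive data cannot exceed the number of distinct key values. The simulation below illustrates that bound. It is a simplified sketch under stated assumptions: `String.hashCode()` stands in for the engine's real hash function, one bucket per BE is assumed, and the class `BucketSkewSketch` and the key values are made up for illustration.

```java
import java.util.Arrays;

// Hypothetical simulation: routes rows to buckets by hash(key) mod bucketCount.
public class BucketSkewSketch {

    // Returns the number of rows that land in each bucket.
    static long[] distribute(String[] distinctKeys, int rowsPerKey, int bucketCount) {
        long[] rowsPerBucket = new long[bucketCount];
        for (String key : distinctKeys) {
            // floorMod keeps the index non-negative even for negative hash codes.
            int bucket = Math.floorMod(key.hashCode(), bucketCount);
            rowsPerBucket[bucket] += rowsPerKey;
        }
        return rowsPerBucket;
    }

    // Counts the buckets that received at least one row.
    static long nonEmptyBuckets(long[] rowsPerBucket) {
        return Arrays.stream(rowsPerBucket).filter(rows -> rows > 0).count();
    }

    public static void main(String[] args) {
        // Two distinct key values spread over 7 buckets (assumed: one bucket per BE).
        String[] fewKeys = {"dim_a", "dim_b"};
        long[] rowsPerBucket = distribute(fewKeys, 10_000, 7);
        System.out.println("rows per bucket:   " + Arrays.toString(rowsPerBucket));
        System.out.println("non-empty buckets: " + nonEmptyBuckets(rowsPerBucket));
    }
}
```

However many buckets exist, two distinct key values can fill at most two of them, so the remaining nodes stay idle. Including a higher-cardinality column in the distribution key is the usual way to spread the writes, if the schema allows it.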