apache / doris-flink-connector

Flink Connector for Apache Doris
https://doris.apache.org/
Apache License 2.0

[Bug] DorisRuntimeException: stream load error #86

Closed: DarvenDuan closed this issue 1 year ago

DarvenDuan commented 1 year ago

Version

Doris: 1.1.2
flink-doris-connector-1.14_2.12: 1.1.0

What's Wrong?

When a BE node process fails, the Flink job cannot recover automatically. The following error occurs:

org.apache.flink.streaming.runtime.tasks.AsynchronousException: Caught exception while processing timer.
    at org.apache.flink.streaming.runtime.tasks.StreamTask$StreamTaskAsyncExceptionHandler.handleAsyncException(StreamTask.java:1583)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.handleAsyncException(StreamTask.java:1559)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.invokeProcessingTimeCallback(StreamTask.java:1704)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.lambda$null$22(StreamTask.java:1693)
    at org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$1.runThrowing(StreamTaskActionExecutor.java:50)
    at org.apache.flink.streaming.runtime.tasks.mailbox.Mail.run(Mail.java:90)
    at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.processMailsWhenDefaultActionUnavailable(MailboxProcessor.java:338)
    at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.processMail(MailboxProcessor.java:324)
    at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:201)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:812)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:764)
    at org.apache.flink.runtime.taskmanager.Task.runWithSystemExitMonitoring(Task.java:1062)
    at org.apache.flink.runtime.taskmanager.Task.restoreAndInvoke(Task.java:1041)
    at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:857)
    at org.apache.flink.runtime.taskmanager.Task.run(Task.java:643)
    at java.lang.Thread.run(Thread.java:748)
Caused by: TimerException{org.apache.doris.flink.exception.DorisRuntimeException: stream load error: null}
    ... 14 more
Caused by: org.apache.doris.flink.exception.DorisRuntimeException: stream load error: null
    at org.apache.doris.flink.sink.committer.DorisCommitter.commitTransaction(DorisCommitter.java:107)
    at org.apache.doris.flink.sink.committer.DorisCommitter.commit(DorisCommitter.java:71)
    at org.apache.flink.streaming.runtime.operators.sink.StreamingCommitterHandler.commit(StreamingCommitterHandler.java:54)
    at org.apache.flink.streaming.runtime.operators.sink.AbstractStreamingCommitterHandler.retry(AbstractStreamingCommitterHandler.java:96)
    at org.apache.flink.streaming.runtime.operators.sink.AbstractCommitterHandler.retry(AbstractCommitterHandler.java:66)
    at org.apache.flink.streaming.runtime.operators.sink.CommitRetrier.retry(CommitRetrier.java:80)
    at org.apache.flink.streaming.runtime.operators.sink.CommitRetrier.lambda$retryAt$0(CommitRetrier.java:63)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.invokeProcessingTimeCallback(StreamTask.java:1702)
    ... 13 more

What You Expected?

When one of the Doris cluster's BE nodes fails, the Flink job should recover automatically.

How to Reproduce?

No response

Anything Else?

No response

JNSimba commented 1 year ago

The job will automatically restart from the checkpoint.
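
Note for readers: the stack trace above shows the commit failing on a retry path (`CommitRetrier.retry`) while a BE node was down. As a rough illustration of the general idea of retrying a commit that fails transiently, here is a minimal, standalone sketch. It is not the connector's or Flink's code: `CommitRetrySketch`, `retry`, `maxAttempts`, and `delayMillis` are hypothetical names chosen for this example, and the simulated "stream load error" only stands in for a BE that is briefly unavailable. In a real job, recovery is handled by Flink restarting from a checkpoint, as the reply above says.

```java
import java.util.concurrent.Callable;

/** Hypothetical helper for illustration only; not part of flink-doris-connector or Flink. */
public class CommitRetrySketch {

    /**
     * Calls the action until it succeeds or maxAttempts is reached,
     * sleeping delayMillis between attempts. Rethrows the last failure.
     */
    public static <T> T retry(Callable<T> action, int maxAttempts, long delayMillis)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return action.call();
            } catch (Exception e) {
                last = e;
                // Wait before the next attempt, but not after the final one.
                if (attempt < maxAttempts) {
                    Thread.sleep(delayMillis);
                }
            }
        }
        throw last;
    }

    public static void main(String[] args) throws Exception {
        int[] calls = {0};
        // Simulated commit: fails twice (as if a BE node were down), then succeeds.
        String result = retry(() -> {
            if (++calls[0] < 3) {
                throw new IllegalStateException("stream load error");
            }
            return "committed";
        }, 5, 100L);
        System.out.println(result + " after " + calls[0] + " attempts");
        // prints "committed after 3 attempts"
    }
}
```

A bounded attempt count is used here so that a permanently failed backend surfaces as an error instead of retrying forever; whether that trade-off suits a given job is a judgment call.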