apache / seatunnel

SeaTunnel is a next-generation super high-performance, distributed, massive data integration tool.
https://seatunnel.apache.org/
Apache License 2.0

[Bug] [Zeta] TaskGroup failure message lost when a job is restored more than once #7239

Closed EricJoy2048 closed 1 month ago

EricJoy2048 commented 1 month ago

What happened

If a pipeline fails and is restored successfully, then the next time a task fails, the reported failure cause is still the one from the first failure. The cause of the latest failure is lost.
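The symptom is consistent with failure state that survives a successful restore, although the report does not state the root cause. The following is a minimal sketch of that pattern, not SeaTunnel's code: `FailureTracker`, `recordFailure`, `onRestoreSucceeded`, and `reportedCause` are hypothetical names made up for illustration.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class FailureCauseSketch {

    /** Hypothetical per-task failure tracker; an assumption, not a SeaTunnel class. */
    static final class FailureTracker {
        private final Map<String, String> causeByTask = new ConcurrentHashMap<>();
        private final boolean clearOnRestore;

        FailureTracker(boolean clearOnRestore) {
            this.clearOnRestore = clearOnRestore;
        }

        void recordFailure(String taskId, String cause) {
            // Keeps only the first cause recorded since the entry was last cleared.
            causeByTask.putIfAbsent(taskId, cause);
        }

        void onRestoreSucceeded(String taskId) {
            // Without this cleanup the stale cause from the earlier run remains.
            if (clearOnRestore) {
                causeByTask.remove(taskId);
            }
        }

        String reportedCause(String taskId) {
            return causeByTask.get(taskId);
        }
    }

    /** Fails a task, restores it, fails it again, and returns the reported cause. */
    static String simulate(boolean clearOnRestore) {
        FailureTracker tracker = new FailureTracker(clearOnRestore);
        tracker.recordFailure("task-1", "first failure");
        tracker.onRestoreSucceeded("task-1");
        tracker.recordFailure("task-1", "second failure");
        return tracker.reportedCause("task-1");
    }

    public static void main(String[] args) {
        System.out.println(simulate(false)); // prints "first failure" (stale cause)
        System.out.println(simulate(true));  // prints "second failure"
    }
}
```

In this sketch, clearing the stored cause when a restore succeeds is enough for the second failure to be reported. Whether the actual fix takes the same shape is not stated in the report.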

SeaTunnel Version

2.3.5 dev

SeaTunnel Config

*

Running Command

*

Error Exception

2024-07-17 10:27:43,088 ERROR [o.a.s.e.s.d.p.PhysicalVertex  ] [hz.main.generic-operation.thread-2] - Job 实时.ods_hzy_pos_cxj (841136001937047570), Pipeline: [(1/1)], task: [pipeline-1 [Source[0]-MySQL-CDC]-SourceTask (1/2)] end with state FAILED and Exception: java.lang.RuntimeException: java.lang.RuntimeException: java.util.concurrent.ExecutionException: org.apache.seatunnel.connectors.doris.exception.DorisConnectorException: ErrorCode:[Doris-01], ErrorDescription:[stream load error] - stream load error: [INTERNAL_ERROR]cancelled: [INTERNAL_ERROR]cancelled: Process has no memory available, cancel top memory used load: load memory tracker <Load#Id=e341448d2b6ec8b9-a059d67d96356bb8> consumption 31.19 MB, backend 10.3.215.114 process memory used 228.96 GB exceed limit 222.98 GB or sys available memory 15.46 GB less than low water mark 1.60 GB. Execute again after enough memory, details see be.INFO.

Zeta or Flink or Spark Version

No response

Java or Scala Version

No response

Screenshots

No response
