apache / seatunnel

SeaTunnel is a next-generation super high-performance, distributed, massive data integration tool.
https://seatunnel.apache.org/
Apache License 2.0

When importing large data from OceanBase to Doris through the seatunnel, the task stops #5898

Open jeana-zxy opened 11 months ago

jeana-zxy commented 11 months ago


What happened

When importing a large data set (14,943,611 rows) from OceanBase to Doris through SeaTunnel, the task stops. When the data set is smaller than 1,000 rows, the write succeeds.

Doris version: doris-2.0.1-rc04-30d35c4
OceanBase version: 4.2.1

SeaTunnel Version

seatunnel-2.3.2

SeaTunnel Config

SeaTunnel config:
env {
  execution.parallelism = 3
  job.mode = "BATCH"
  checkpoint.interval = 1000
}

source {
    Jdbc {
        url = "jdbc:mysql://xxxxxxxx:2883/saas_data?serverTimezone=Asia/Shanghai"
        driver = "com.mysql.cj.jdbc.Driver"
        connection_check_timeout_sec = 300
        user = "root@xxx#xxx"
        password = "xxx"
        parallelism = 1
        fetch_size = 10000
        result_table_name = "xxx"
        query = "select * from test where pt = '20231119'"
    }
}

sink {
    Doris {
        fenodes = "172.16.0.42:8030"
        username = root
        password = "xxx"
        table.identifier = "saas_data.test"
        sink.enable-2pc = "false"
        sink.label-prefix = "test_json"
        doris.config = {
            format="json"
            read_json_by_line="true"
        }
    }

}

Running Command

seatunnel.sh -c v2.batch.config.template

Error Exception

I have not found any error log.

Zeta or Flink or Spark Version

No response

Java or Scala Version

No response

Screenshots

[Screenshot: 1700642016864]


XiaoGerGer commented 10 months ago

I ran into the same situation with the Doris sink: the progress bar in the console slowly dropped to 0 and then stayed stuck at 0. With DEBUG logging enabled, I found that the StreamLoad result statistics in the output do not match the number of rows actually imported, and the number of threads in the container keeps growing.

[Screenshots: 0566b30e589673e8109a9258e62aaa2, cc73ef35f832ead9c2c4df7c8b4c768]
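One way to narrow down a mismatch like this is to check each Stream Load response body against the number of rows sent in that batch. The following is a minimal sketch, not SeaTunnel code: the function name `check_stream_load_result` and the sample response are made up for illustration, and it assumes the response is the JSON object that Doris returns for a Stream Load request (with fields such as `Status`, `Message`, `NumberLoadedRows`, and `NumberFilteredRows`).

```python
import json


def check_stream_load_result(response_text, expected_rows):
    """Return a list of problems found in one Stream Load response body."""
    result = json.loads(response_text)
    problems = []

    # Doris reports "Success" when the load was committed and is visible.
    status = result.get("Status")
    if status != "Success":
        problems.append(f"status is {status!r}: {result.get('Message', '')}")

    # Compare the rows Doris says it loaded with the rows sent in the batch.
    loaded = result.get("NumberLoadedRows", 0)
    if loaded != expected_rows:
        problems.append(f"loaded {loaded} rows, expected {expected_rows}")

    # Rows rejected for data-quality reasons are counted separately.
    filtered = result.get("NumberFilteredRows", 0)
    if filtered:
        problems.append(f"{filtered} rows were filtered out")

    return problems


# Illustrative response only; it is not taken from the logs in this issue.
sample = json.dumps({
    "Status": "Success",
    "Message": "OK",
    "NumberTotalRows": 10000,
    "NumberLoadedRows": 9990,
    "NumberFilteredRows": 10,
})

for problem in check_stream_load_result(sample, expected_rows=10000):
    print(problem)
```

An empty list means the response agrees with the batch size; anything else points at the batch where the counts start to diverge.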

XiaoGerGer commented 10 months ago

The same thing happens when the Doris table does not exist: SeaTunnel reports no error with the Doris sink. The progress bar just stays stuck at 0, and the number of background threads keeps growing.
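To put numbers on the thread growth, the thread count of the SeaTunnel JVM can be sampled at intervals. The sketch below assumes a Linux host or container, where `/proc/<pid>/status` contains a `Threads:` line; the helper names are hypothetical, and the process ID of the SeaTunnel JVM has to be supplied by the reader.

```python
import time


def parse_thread_count(status_text):
    """Extract the value of the 'Threads:' line from /proc/<pid>/status text."""
    for line in status_text.splitlines():
        if line.startswith("Threads:"):
            return int(line.split(":", 1)[1].strip())
    raise ValueError("no 'Threads:' line found")


def sample_thread_counts(pid, samples=5, interval_sec=10.0):
    """Read the thread count of process `pid` several times (Linux only)."""
    counts = []
    for i in range(samples):
        with open(f"/proc/{pid}/status") as status_file:
            counts.append(parse_thread_count(status_file.read()))
        if i < samples - 1:
            time.sleep(interval_sec)
    return counts


def is_steadily_growing(counts):
    """True if every sample is strictly larger than the one before it."""
    return len(counts) > 1 and all(b > a for a, b in zip(counts, counts[1:]))


# Illustrative input only; real values come from sample_thread_counts(pid).
example_status = "Name:\tjava\nState:\tS (sleeping)\nThreads:\t42\n"
print(parse_thread_count(example_status))
print(is_steadily_growing([120, 180, 260]))
```

A series that grows on every sample while the progress bar stays at 0 would support the suspicion that threads are being created and never released; a thread dump taken with `jstack` at the same time would show which threads those are.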