apache / seatunnel

SeaTunnel is a next-generation super high-performance, distributed, massive data integration tool.
https://seatunnel.apache.org/
Apache License 2.0

[Bug] [seatunnel-connector-flink-druid] Exporting data from Druid to Doris #2893

Closed enterwhat closed 2 years ago

enterwhat commented 2 years ago

What happened

I made the following changes on top of the 2.1.3 tag.

feat:

Exporting data from Druid to Doris ran into the following problems, which I have resolved:

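1. Fixed the problem that values passed with -i could not replace placeholders in the source section of the conf file. Placeholder substitution now works there the same way it does in the sql section.

2. Fixed the problem that the space in Druid's start_date and end_date, when passed with -i on the command line, is treated as a shell argument separator, so a date in yyyy-MM-dd HH:mm:ss format is not recognized as a single value and the job fails. A specific placeholder is used by default, and it can be customized.

3. Made the Druid output separator configurable so that the output can be imported into Doris.

4. Replaced the Doris default separator (the config below sets doris.column_separator to "\\x01") to fix the problem that a tab (\t) inside a column value breaks column recognition.

5. Added user and password authentication parameters to Druid.

6. Marked the dependencies that the Druid connector can take from the SeaTunnel packages as provided.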
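Items 1 and 2 can be sketched roughly as follows. This is a minimal illustration in Python, not the code of the actual change: the `@` token standing in for a space, and the helper names `parse_vars` and `substitute`, are hypothetical. It also assumes each placeholder appears as a bare value (as in the source block), not inside an already quoted string.

```python
import re

# Hypothetical stand-in for a space; the token used by the real patch is not stated in this issue.
SPACE_TOKEN = "@"

def parse_vars(pairs, space_token=SPACE_TOKEN):
    """Turn the key=value strings given to -i into a dict, restoring spaces."""
    variables = {}
    for pair in pairs:
        key, _, value = pair.partition("=")
        variables[key] = value.replace(space_token, " ")
    return variables

def substitute(config_text, variables):
    """Replace every ${name} in the whole config text; unknown names are left untouched."""
    def repl(match):
        name = match.group(1)
        if name not in variables:
            return match.group(0)
        # Quote the value so a date containing a space and colons stays one config string.
        return '"' + variables[name] + '"'
    return re.sub(r"\$\{(\w+)\}", repl, config_text)

config = "start_date = ${start_date}\nend_date = ${end_date}\n"
variables = parse_vars(["start_date=2022-04-10@13:00:00", "end_date=2022-04-10@14:00:00"])
rendered = substitute(config, variables)
print(rendered)
```

Because the substitution runs over the whole config text, the source, transform, and sink sections are treated alike.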

SeaTunnel Version

tag 2.1.3

SeaTunnel Config

env {
  execution.parallelism = 1
  execution.planner = blink
  execution.checkpoint.interval = 5000
  execution.checkpoint.data-uri = "hdfs://master/flink/checkpoints/xxxx"
}

source {
  DruidSource {
      jdbc_url = "jdbc:avatica:remote:url=http://xxx.xxx.xx.xxx:xxxx/druid/v2/sql/avatica/"
      result_table_name = "xxx"
      user = "xxx"
      password = "xxx"
      datasource = "xxx"
      start_date = ${start_date}
      end_date = ${end_date}  
      columns = ["__time"]
  }
}

transform {
  Sql {
      sql = "select __time as `time` from xxx"
  }
}

sink {
  DorisSink {
      fenodes = "xxx:xxx"
      database = "xxx"
      table = "xxx"
      user = "xxx"
      password = "xxx"
      batch_size = 100000
      doris.column_separator = "\\x01"
      doris.escape_delimiters = "true"
      doris.columns = "time"
  }
}

Running Command

./bin/start-seatunnel-flink.sh -m yarn-cluster --config /home/mgr/apache-seatunnel-incubating-2.1.3/config/flink.batch.heartbeat-druid.conf -i start_date=2022-04-10 13:00:00 -i end_date=2022-04-10 14:00:00
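To see why the date values in this command arrive broken, the shell's word splitting can be approximated with Python's `shlex.split`, which follows POSIX-style quoting rules. This is only an illustration of the splitting. Whether start-seatunnel-flink.sh preserves a quoted argument when it forwards it is not confirmed here.

```python
import shlex

# The -i argument as written in the command above, and a quoted variant.
unquoted = "-i start_date=2022-04-10 13:00:00"
quoted = "-i 'start_date=2022-04-10 13:00:00'"

unquoted_args = shlex.split(unquoted)
quoted_args = shlex.split(quoted)

print(unquoted_args)  # ['-i', 'start_date=2022-04-10', '13:00:00']
print(quoted_args)    # ['-i', 'start_date=2022-04-10 13:00:00']
```

In the unquoted form the time part becomes a separate argument, so start_date carries only the date.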

Error Exception

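1. Values passed with -i cannot replace placeholders in the source section of the conf file. Placeholder substitution should work there the same way it does in the sql section.

2. The space in Druid's start_date and end_date, when passed with -i on the command line, is treated as a shell argument separator, so a date in yyyy-MM-dd HH:mm:ss format is not recognized as a single value and the job fails. The fix uses a specific placeholder by default, and it can be customized.

3. With the Doris default separator, a tab (\t) inside a column value breaks column recognition.

4. Druid does not support user and password authentication parameters.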
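The separator problem in item 3 can be reproduced in a few lines. This is a minimal sketch with made-up row data and a hypothetical `roundtrip` helper, not connector code. It joins a row with a separator and splits it back, once with a tab and once with the \x01 control character that the config above sets for doris.column_separator.

```python
# Made-up row: the second column value contains a tab character.
row = ["2022-04-10 13:00:00", "note with a\ttab"]

def roundtrip(values, separator):
    """Join the values into one line, then split the line back into fields."""
    line = separator.join(values)
    return line.split(separator)

tab_fields = roundtrip(row, "\t")
ctrl_fields = roundtrip(row, "\x01")

print(len(tab_fields))   # 3: the embedded tab creates an extra column
print(len(ctrl_fields))  # 2: the original columns are recovered
```

A control character such as \x01 is far less likely than a tab to appear inside real column values, which is presumably why it was chosen.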

Flink or Spark Version

1.13.6

Java or Scala Version

2.11

Hisoka-X commented 2 years ago

Wonderful! Looking forward to your PR.

CalvinKirs commented 2 years ago

Hi, thanks for your contribution, could you use English?

github-actions[bot] commented 2 years ago

This issue has been automatically marked as stale because it has not had recent activity for 30 days. It will be closed in the next 7 days if no further activity occurs.

github-actions[bot] commented 2 years ago

This issue has been closed because it has not received a response for too long. You can reopen it if you encounter similar problems in the future.