apache / seatunnel

SeaTunnel is a next-generation super high-performance, distributed, massive data integration tool.
https://seatunnel.apache.org/
Apache License 2.0

[Bug] [connector-clickhouse] String segmentation problem when parsing the local table name user_info from "Distributed('clusterw_7shard','default','user_info')" #4482

Closed QiMingChina closed 1 year ago

QiMingChina commented 1 year ago

Search before asking

What happened

I use connector-clickhouse to write data to ClickHouse and hit the error shown in the image below. I checked the source code and found a string segmentation problem.

(screenshot: error message from the ClickHouse sink)

The replace method in the source code does not remove the closing bracket from the table name, which breaks the SQL statements spliced together afterwards.

(screenshot: the string-replacement code that leaves the bracket in the table name)
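To illustrate the failure mode, here is a minimal sketch of how the `engine_full` string of a ClickHouse Distributed table could be parsed robustly. The class and method names are hypothetical, not SeaTunnel's actual API; the point is that a naive quote-stripping `String.replace` leaves the trailing `)` attached to the last argument, yielding a local table name like `user_info')` that can never match a real table.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical parser for a ClickHouse Distributed engine definition such as:
//   Distributed('clusterw_7shard', 'default', 'user_info')
// Instead of stripping quotes with String.replace (which keeps the ')'),
// a regex captures only the quoted arguments, excluding quotes and brackets.
public class DistributedEngineParser {

    private static final Pattern ENGINE_PATTERN = Pattern.compile(
            "Distributed\\(\\s*'([^']*)'\\s*,\\s*'([^']*)'\\s*,\\s*'([^']*)'");

    /** Returns {cluster, database, localTable} from an engine_full string. */
    public static String[] parse(String engineFull) {
        Matcher m = ENGINE_PATTERN.matcher(engineFull);
        if (!m.find()) {
            throw new IllegalArgumentException(
                    "Not a Distributed engine definition: " + engineFull);
        }
        return new String[] {m.group(1), m.group(2), m.group(3)};
    }

    public static void main(String[] args) {
        String[] parts =
                parse("Distributed('clusterw_7shard','default','user_info')");
        // Local table name comes out clean, with no trailing bracket.
        System.out.println(parts[0] + " " + parts[1] + " " + parts[2]);
    }
}
```

With a replace-based approach the third argument would come out as `user_info')`, which is then embedded in the lookup SQL and produces the "Table not existed" error reported below.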

SeaTunnel Version

SeaTunnel 2.3.1

SeaTunnel Config

env {
  execution.parallelism = 4
  job.mode = "STREAMING"
  job.name = "test.kafka.2.clickhouse"
}

source {
  Kafka {
    result_table_name = "user_info_kafka"
    schema = {
      fields {
        name = "string"
        age = "int"
        salary = "double"
      }
    }
    format = json
    topic = "kafka-test"
    bootstrap.servers = "XXXX:9092"
    consumer.group = "kafka-clickhouse-test01"
    commit_on_checkpoint = false
    kafka.config = {
      auto.offset.reset = "latest"
      enable.auto.commit = "true"
    }
  }
}

transform {
  Sql {
    source_table_name = "user_info_kafka"
    result_table_name = "clus_user_info"
    query = "select name, age, salary from user_info_kafka"
  }
}

sink {
  Clickhouse {
    host = "XXXX:8123"
    database = "default"
    table = "clus_user_info"
    username = "default"
    password = "XXX"
    split_mode = true
    sharding_key = "name"
  }
}

Running Command

./bin/seatunnel.sh --config ./config/kafka-to-clickhouse-test01.conf -e local

Error Exception

Exception in thread "main" org.apache.seatunnel.core.starter.exception.CommandExecuteException: SeaTunnel job executed failed
        at org.apache.seatunnel.core.starter.seatunnel.command.ClientExecuteCommand.execute(ClientExecuteCommand.java:181)
        at org.apache.seatunnel.core.starter.SeaTunnel.run(SeaTunnel.java:40)
        at org.apache.seatunnel.core.starter.seatunnel.SeaTunnelClient.main(SeaTunnelClient.java:34)
Caused by: org.apache.seatunnel.connectors.seatunnel.clickhouse.exception.ClickhouseConnectorException: ErrorCode:[API-05], ErrorDescription:[Table not existed] - Cannot get table from clickhouse, resultSet is empty
        at org.apache.seatunnel.connectors.seatunnel.clickhouse.sink.client.ClickhouseProxy.getClickhouseDistributedTable(ClickhouseProxy.java:102)
        at org.apache.seatunnel.connectors.seatunnel.clickhouse.sink.client.ClickhouseProxy.getClickhouseTable(ClickhouseProxy.java:230)
        at org.apache.seatunnel.connectors.seatunnel.clickhouse.sink.client.ClickhouseSink.prepare(ClickhouseSink.java:145)
        at org.apache.seatunnel.engine.core.parse.ConnectorInstanceLoader.loadSinkInstance(ConnectorInstanceLoader.java:90)
        at org.apache.seatunnel.engine.core.parse.JobConfigParser.sampleAnalyze(JobConfigParser.java:414)
        at org.apache.seatunnel.engine.core.parse.JobConfigParser.parse(JobConfigParser.java:132)
        at org.apache.seatunnel.engine.core.parse.MultipleTableJobConfigParser.parse(MultipleTableJobConfigParser.java:112)
        at org.apache.seatunnel.engine.client.job.JobExecutionEnvironment.getLogicalDag(JobExecutionEnvironment.java:155)
        at org.apache.seatunnel.engine.client.job.JobExecutionEnvironment.execute(JobExecutionEnvironment.java:147)
        at org.apache.seatunnel.core.starter.seatunnel.command.ClientExecuteCommand.execute(ClientExecuteCommand.java:140)

Flink or Spark Version

No response

Java or Scala Version

No response

Screenshots

No response

Are you willing to submit PR?

Code of Conduct

MonsterChenzhuo commented 1 year ago

Please assign this to me, thanks.

github-actions[bot] commented 1 year ago

This issue has been automatically marked as stale because it has not had recent activity for 30 days. It will be closed in next 7 days if no further activity occurs.