apache / seatunnel

SeaTunnel is a next-generation super high-performance, distributed, massive data integration tool.
https://seatunnel.apache.org/
Apache License 2.0
7.89k stars 1.79k forks source link

[Bug] [Transform v2] java.lang.UnsupportedOperationException: The Factory has not been implemented and the deprecated Plugin will be used. #6465

Open Aiden-Rose opened 7 months ago

Aiden-Rose commented 7 months ago

Search before asking

What happened

I need to use the spark engine to run the function with the replace function, but 2.3.4 reports an error, and the 2.3.3 version is OK.

SeaTunnel Version

2.3.4

SeaTunnel Config

env {
  #parallelism = 10
  job.mode = "BATCH"
}

source {
    Jdbc {
        url = "jdbc:mysql://xxx.xx.xx.xx:3306/test_ysp?rewriteBatchStatements=true"
        driver = "com.mysql.cj.jdbc.Driver"
        connection_check_timeout_sec = 100
        user = "root"
        password = "root"
        partition_column = "id"
        partition_num = 10
        query = "select * from mock_data_10w_2"
        parallelism = 10
        fetch_size = 1000
        result_table_name = "fake"
    }
}

transform {

   Replace {
     source_table_name = "fake"
     result_table_name = "fake1"
     replace_field = "name"
     pattern = "Yamada"
     replacement = "hahaha"
    #is_regex = true
  }

  Replace {
    source_table_name = "fake1"
    result_table_name = "fake2"
    replace_field = "sex"
    pattern = "男"
    replacement = "女"
    #is_regex = true
  }

}

sink {
     Hive {
        source_table_name = "fake2"
        table_name = "cmn_src.mock_data_10w_2"
        metastore_uri = "thrift://xxxxxx:9083"
        #metastore_uri = "thrift://host-172-24-9-94:9083"
        hdfs_site_path = "/usr/hdp/3.1.4.0-315/hadoop/conf/hdfs-site.xml"
        #hdfs_site_path = "/opt/seatunnel/seatunnel-client/config/2.3.x/hdfs-site.xml"
    }
}

Running Command

sh /opt/bdos/seatunnel/apache-seatunnel-2.3.4/bin/start-seatunnel-spark-2-connector-v2.sh --master local[4] --deploy-mode client --config /opt/bdos/seatunnel/apache-seatunnel-2.3.4/config/test/mysql_data_to_hive4.conf

Error Exception

24/03/07 15:35:23 ERROR SeaTunnel: 
===============================================================================

Exception in thread "main" org.apache.seatunnel.core.starter.exception.CommandExecuteException: Run SeaTunnel on spark failed
        at org.apache.seatunnel.core.starter.spark.command.SparkTaskExecuteCommand.execute(SparkTaskExecuteCommand.java:62)
        at org.apache.seatunnel.core.starter.SeaTunnel.run(SeaTunnel.java:40)
        at org.apache.seatunnel.core.starter.spark.SeaTunnelSpark.main(SeaTunnelSpark.java:35)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
        at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:845)
        at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:161)
        at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:184)
        at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:86)
        at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:920)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:929)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: org.apache.seatunnel.core.starter.exception.TaskExecuteException: SeaTunnel transform task: Replace execute error
        at org.apache.seatunnel.core.starter.spark.execution.TransformExecuteProcessor.execute(TransformExecuteProcessor.java:121)
        at org.apache.seatunnel.core.starter.spark.execution.SparkExecution.execute(SparkExecution.java:70)
        at org.apache.seatunnel.core.starter.spark.command.SparkTaskExecuteCommand.execute(SparkTaskExecuteCommand.java:60)
        ... 14 more
Caused by: java.lang.UnsupportedOperationException: The Factory has not been implemented and the deprecated Plugin will be used.
        at org.apache.seatunnel.api.table.factory.TableTransformFactory.createTransform(TableTransformFactory.java:36)
        at org.apache.seatunnel.core.starter.spark.execution.TransformExecuteProcessor.execute(TransformExecuteProcessor.java:108)

Zeta or Flink or Spark Version

spark 2.4.4

Java or Scala Version

jdk1.8

Screenshots

No response

Are you willing to submit PR?

Code of Conduct

liunaijie commented 7 months ago

@Aiden-Rose you can try Sql transform with case when syntax or replace function.

Aiden-Rose commented 7 months ago

Thanks, I solved this problem.

github-actions[bot] commented 6 months ago

This issue has been automatically marked as stale because it has not had recent activity for 30 days. It will be closed in next 7 days if no further activity occurs.

H-TWINKLE commented 5 months ago

how to fixed it in 2.3.4