apache / seatunnel

SeaTunnel is a next-generation super high-performance, distributed, massive data integration tool.
https://seatunnel.apache.org/

[Bug] [Hive Source] Invalid method name: 'get_table_req' #7330

Open matianhe3 opened 1 month ago

matianhe3 commented 1 month ago

What happened

After updating to 2.3.6, the Hive source fails with an error. The same job works on 2.3.5.

SeaTunnel Version

2.3.6

SeaTunnel Config

source {
  Hive {
    table_name = ""
    metastore_uri = ""
    hdfs_site_path = "/opt/apache-seatunnel/config/hdfs-site.xml"
    hive_site_path = "/opt/apache-seatunnel/config/hive-site.xml"
    result_table_name = source
    read_partitions = ["dt="${dt}]
    delimiter = ","
  }
}

Running Command

seatunnel.sh -c test.conf -i dt=2024-08-06

Error Exception

2024-08-07 11:10:59,057 ERROR [o.a.s.c.s.SeaTunnel           ] [main] - Exception StackTrace:org.apache.seatunnel.core.starter.exception.CommandExecuteException: SeaTunnel job executed failed
    at org.apache.seatunnel.core.starter.seatunnel.command.ClientExecuteCommand.execute(ClientExecuteCommand.java:211)
    at org.apache.seatunnel.core.starter.SeaTunnel.run(SeaTunnel.java:40)
    at org.apache.seatunnel.core.starter.seatunnel.SeaTunnelClient.main(SeaTunnelClient.java:34)
Caused by: org.apache.seatunnel.api.table.factory.FactoryException: ErrorCode:[API-06], ErrorDescription:[Factory initialize failed] - Unable to create a source for identifier 'Hive'.
    at org.apache.seatunnel.api.table.factory.FactoryUtil.createAndPrepareSource(FactoryUtil.java:101)
    at org.apache.seatunnel.engine.core.parse.MultipleTableJobConfigParser.parseSource(MultipleTableJobConfigParser.java:361)
    at org.apache.seatunnel.engine.core.parse.MultipleTableJobConfigParser.parse(MultipleTableJobConfigParser.java:209)
    at org.apache.seatunnel.engine.client.job.ClientJobExecutionEnvironment.getLogicalDag(ClientJobExecutionEnvironment.java:114)
    at org.apache.seatunnel.engine.client.job.ClientJobExecutionEnvironment.execute(ClientJobExecutionEnvironment.java:182)
    at org.apache.seatunnel.core.starter.seatunnel.command.ClientExecuteCommand.execute(ClientExecuteCommand.java:158)
    ... 2 more
Caused by: java.lang.NoSuchMethodError: 'void org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(org.apache.hadoop.conf.Configuration)'
    at org.apache.seatunnel.connectors.seatunnel.hive.utils.HiveMetaStoreProxy.<init>(HiveMetaStoreProxy.java:110)
    at org.apache.seatunnel.connectors.seatunnel.hive.utils.HiveMetaStoreProxy.getInstance(HiveMetaStoreProxy.java:139)
    at org.apache.seatunnel.connectors.seatunnel.hive.utils.HiveTableUtils.getTableInfo(HiveTableUtils.java:41)
    at org.apache.seatunnel.connectors.seatunnel.hive.source.config.HiveSourceConfig.<init>(HiveSourceConfig.java:84)
    at org.apache.seatunnel.connectors.seatunnel.hive.source.config.MultipleTableHiveSourceConfig.parseFromLocalFileSourceConfig(MultipleTableHiveSourceConfig.java:52)
    at org.apache.seatunnel.connectors.seatunnel.hive.source.config.MultipleTableHiveSourceConfig.<init>(MultipleTableHiveSourceConfig.java:39)
    at org.apache.seatunnel.connectors.seatunnel.hive.source.HiveSource.<init>(HiveSource.java:43)
    at org.apache.seatunnel.connectors.seatunnel.hive.source.HiveSourceFactory.lambda$createSource$0(HiveSourceFactory.java:46)
    at org.apache.seatunnel.api.table.factory.FactoryUtil.createAndPrepareSource(FactoryUtil.java:113)
    at org.apache.seatunnel.api.table.factory.FactoryUtil.createAndPrepareSource(FactoryUtil.java:74)
    ... 7 more

Zeta or Flink or Spark Version

Zeta 2.3.6

Java or Scala Version

java 17

2416210017 commented 1 month ago

I am seeing the same error. Have you found the cause?

24/08/09 11:57:35 INFO utils.HiveMetaStoreProxy: hive client conf:Configuration: core-default.xml, core-site.xml, mapred-default.xml, mapred-site.xml, yarn-default.xml, yarn-site.xml, hdfs-default.xml, hdfs-site.xml, org.apache.hadoop.hive.conf.LoopingByteArrayInputStream@3e1897d, file:/etc/hive/conf.cloudera.hive/hive-site.xml, file:/etc/hive/conf/hive-site.xml
Exception in thread "main" java.lang.NoSuchMethodError: org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(Lorg/apache/hadoop/conf/Configuration;)V
        at org.apache.seatunnel.connectors.seatunnel.hive.utils.HiveMetaStoreProxy.<init>(HiveMetaStoreProxy.java:110)
        at org.apache.seatunnel.connectors.seatunnel.hive.utils.HiveMetaStoreProxy.getInstance(HiveMetaStoreProxy.java:139)
        at org.apache.seatunnel.connectors.seatunnel.hive.utils.HiveTableUtils.getTableInfo(HiveTableUtils.java:41)
        at org.apache.seatunnel.connectors.seatunnel.hive.sink.HiveSink.getTableInformation(HiveSink.java:235)
        at org.apache.seatunnel.connectors.seatunnel.hive.sink.HiveSink.<init>(HiveSink.java:85)
        at org.apache.seatunnel.connectors.seatunnel.hive.sink.HiveSinkFactory.lambda$createSink$0(HiveSinkFactory.java:61)
        at org.apache.seatunnel.core.starter.spark.execution.SinkExecuteProcessor.execute(SinkExecuteProcessor.java:140)
        at org.apache.seatunnel.core.starter.spark.execution.SparkExecution.execute(SparkExecution.java:71)
        at org.apache.seatunnel.core.starter.spark.command.SparkTaskExecuteCommand.execute(SparkTaskExecuteCommand.java:60)
        at org.apache.seatunnel.core.starter.SeaTunnel.run(SeaTunnel.java:40)
        at org.apache.seatunnel.core.starter.spark.SeaTunnelSpark.main(SeaTunnelSpark.java:35)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
        at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:955)
        at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
        at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
        at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
        at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1043)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1052)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

liunaijie commented 1 month ago

This is a Hive version conflict.

SeaTunnel 2.3.6 upgraded Hive from 2.3.9 to 3.1.3.
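One way to check for this kind of conflict is to list the Hive-related jars actually sitting next to SeaTunnel. The following is a minimal shell sketch, not part of SeaTunnel: the function names `list_hive_jars` and `has_single_hive_exec` are made up for illustration, and it assumes the jars live directly in one directory (for example the $SEATUNNEL_HOME/lib directory quoted later in this thread) and follow the usual hive-*.jar and libfb303-*.jar naming.

```shell
#!/usr/bin/env bash

# Print the names of Hive-related jars found directly in the given directory,
# one per line, sorted. The name patterns are an assumption about jar naming.
list_hive_jars() {
  local dir="$1"
  local jar
  for jar in "$dir"/hive-*.jar "$dir"/libfb303-*.jar; do
    [ -e "$jar" ] || continue   # the glob matched nothing
    basename "$jar"
  done | LC_ALL=C sort
}

# Succeed (exit status 0) only if exactly one hive-exec jar is present.
# More than one usually means two Hive versions share the classpath.
has_single_hive_exec() {
  local count
  count=$(list_hive_jars "$1" | grep -c '^hive-exec-' || true)
  [ "$count" -eq 1 ]
}
```

For example, `list_hive_jars "$SEATUNNEL_HOME/lib"` prints the matching jar names. Seeing both hive-exec-2.3.9.jar and hive-exec-3.1.3.jar in the output would suggest a mixed classpath.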

matianhe3 commented 1 month ago

Do I have to upgrade my Hive installation, or is it enough to replace hive-exec.jar?

liunaijie commented 1 month ago

Just replace the Hive-related jars.
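Replacing the jar can be done reversibly by moving the current hive-exec jar aside before copying in the other one. This is a minimal sketch under stated assumptions: `swap_hive_exec` is a hypothetical helper, all three directories are supplied by the caller, and only files named hive-exec-*.jar are touched. Whether the swapped-in jar is compatible with the connector is a separate question that this sketch does not answer.

```shell
#!/usr/bin/env bash

# Move the hive-exec jars currently in new_lib into backup_dir, then copy the
# hive-exec jars from old_lib into new_lib. Other files are left untouched.
# Arguments: old_lib new_lib backup_dir (all hypothetical, caller-supplied paths).
swap_hive_exec() {
  local old_lib="$1" new_lib="$2" backup_dir="$3"
  local jar
  mkdir -p "$backup_dir"
  # Set aside whatever hive-exec jars the target directory has now.
  for jar in "$new_lib"/hive-exec-*.jar; do
    [ -e "$jar" ] || continue   # the glob matched nothing
    mv "$jar" "$backup_dir/"
  done
  # Bring over the hive-exec jars from the other installation.
  for jar in "$old_lib"/hive-exec-*.jar; do
    [ -e "$jar" ] || continue
    cp "$jar" "$new_lib/"
  done
}
```

The backup directory is kept outside the lib directory on purpose, so the set-aside jar cannot be picked up again by accident. Restoring is the same call with the directories swapped.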

2416210017 commented 1 month ago

May I ask which directory the Hive jar needs to be replaced in?

matianhe3 commented 1 month ago

The documentation says: "If you use SeaTunnel Engine, you need to put seatunnel-hadoop3-3.1.4-uber.jar, hive-exec-3.1.3.jar, and libfb303-0.9.3.jar in the $SEATUNNEL_HOME/lib/ directory."

I followed the doc and put hive-exec-3.1.3.jar and libfb303-0.9.3.jar in /opt/seatunnel/lib, but it still does not work:


Exception in thread "main" org.apache.seatunnel.core.starter.exception.CommandExecuteException: SeaTunnel job executed failed
    at org.apache.seatunnel.core.starter.seatunnel.command.ClientExecuteCommand.execute(ClientExecuteCommand.java:211)
    at org.apache.seatunnel.core.starter.SeaTunnel.run(SeaTunnel.java:40)
    at org.apache.seatunnel.core.starter.seatunnel.SeaTunnelClient.main(SeaTunnelClient.java:34)
Caused by: org.apache.seatunnel.api.table.factory.FactoryException: ErrorCode:[API-06], ErrorDescription:[Factory initialize failed] - Unable to create a source for identifier 'Hive'.
    at org.apache.seatunnel.api.table.factory.FactoryUtil.createAndPrepareSource(FactoryUtil.java:101)
    at org.apache.seatunnel.engine.core.parse.MultipleTableJobConfigParser.parseSource(MultipleTableJobConfigParser.java:361)
    at org.apache.seatunnel.engine.core.parse.MultipleTableJobConfigParser.parse(MultipleTableJobConfigParser.java:209)
    at org.apache.seatunnel.engine.client.job.ClientJobExecutionEnvironment.getLogicalDag(ClientJobExecutionEnvironment.java:114)
    at org.apache.seatunnel.engine.client.job.ClientJobExecutionEnvironment.execute(ClientJobExecutionEnvironment.java:182)
    at org.apache.seatunnel.core.starter.seatunnel.command.ClientExecuteCommand.execute(ClientExecuteCommand.java:158)
    ... 2 more
Caused by: org.apache.seatunnel.connectors.seatunnel.hive.exception.HiveConnectorException: ErrorCode:[HIVE-03], ErrorDescription:[Get hive table information from hive metastore service failed] - Get table [bdm.b_tianrun_ibs] information failed
    at org.apache.seatunnel.connectors.seatunnel.hive.utils.HiveMetaStoreProxy.getTable(HiveMetaStoreProxy.java:152)
    at org.apache.seatunnel.connectors.seatunnel.hive.utils.HiveTableUtils.getTableInfo(HiveTableUtils.java:43)
    at org.apache.seatunnel.connectors.seatunnel.hive.source.config.HiveSourceConfig.<init>(HiveSourceConfig.java:84)
    at org.apache.seatunnel.connectors.seatunnel.hive.source.config.MultipleTableHiveSourceConfig.parseFromLocalFileSourceConfig(MultipleTableHiveSourceConfig.java:52)
    at org.apache.seatunnel.connectors.seatunnel.hive.source.config.MultipleTableHiveSourceConfig.<init>(MultipleTableHiveSourceConfig.java:39)
    at org.apache.seatunnel.connectors.seatunnel.hive.source.HiveSource.<init>(HiveSource.java:43)
    at org.apache.seatunnel.connectors.seatunnel.hive.source.HiveSourceFactory.lambda$createSource$0(HiveSourceFactory.java:46)
    at org.apache.seatunnel.api.table.factory.FactoryUtil.createAndPrepareSource(FactoryUtil.java:113)
    at org.apache.seatunnel.api.table.factory.FactoryUtil.createAndPrepareSource(FactoryUtil.java:74)
    ... 7 more
Caused by: org.apache.thrift.TApplicationException: Invalid method name: 'get_table_req'
    at org.apache.thrift.TApplicationException.read(TApplicationException.java:111)
    at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:79)
    at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_get_table_req(ThriftHiveMetastore.java:2079)
    at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.get_table_req(ThriftHiveMetastore.java:2066)
    at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getTable(HiveMetaStoreClient.java:1578)
    at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getTable(HiveMetaStoreClient.java:1570)
    at org.apache.seatunnel.connectors.seatunnel.hive.utils.HiveMetaStoreProxy.getTable(HiveMetaStoreProxy.java:148)

matianhe3 commented 1 month ago

Hive 2.1.1-cdh6.2.1 Subversion file:///container.redhat7/build/cdh/hive/2.1.1-cdh6.2.1/rpm/BUILD/hive-2.1.1-cdh6.2.1 -r fc0582dc4d2e2342d1eef8cb702bad68e1e2cdd2

liunaijie commented 4 weeks ago

org.apache.thrift.TApplicationException: Invalid method name: 'get_table_req'

This error means that the Hive metastore client you are using is a version that is incompatible with the Hive metastore server it connects to.

Based on your description, SeaTunnel v2.3.5 works for you, so you can copy the Hive-related libraries from 2.3.5 into 2.3.6 and test again.
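A quick way to spot a client/server mismatch like the one described above is to compare the version in the hive-exec jar name with the version the Hive installation reports. The sketch below is illustrative only: `extract_version`, `major_of`, and `same_major` are hypothetical helper names, and treating "same major version" as a sign of compatibility is a rough heuristic assumed here, not a rule stated by Hive or SeaTunnel.

```shell
#!/usr/bin/env bash

# Print the first x.y.z version number found in the given string.
extract_version() {
  printf '%s\n' "$1" | grep -oE '[0-9]+\.[0-9]+\.[0-9]+' | head -n 1
}

# Print only the major component of that version.
major_of() {
  local version
  version=$(extract_version "$1")
  printf '%s\n' "${version%%.*}"
}

# Succeed (exit status 0) if both strings carry the same major version.
same_major() {
  [ "$(major_of "$1")" = "$(major_of "$2")" ]
}
```

With the values from this thread, `same_major 'hive-exec-3.1.3.jar' 'Hive 2.1.1-cdh6.2.1'` fails (3 versus 2), while `same_major 'hive-exec-2.3.9.jar' 'Hive 2.1.1-cdh6.2.1'` succeeds.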

matianhe3 commented 4 weeks ago

The Hive lib in SeaTunnel v2.3.5 is hive-exec-2.3.9.jar. When I use it with 2.3.6, I get the first error again: Unable to create a source for identifier 'Hive'.

tiansww commented 3 weeks ago

Maybe you can copy the Hive connector from 2.3.5 to 2.3.6.

2416210017 commented 3 weeks ago

May I ask which directory to copy to?

matianhe3 commented 3 weeks ago

I copied connectors/connector-hive-2.3.5.jar to connectors/connector-hive-2.3.6.jar and now get:

Exception in thread "main" java.lang.NoSuchMethodError: 'void org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(org.apache.hadoop.hive.conf.HiveConf)'
    at org.apache.seatunnel.connectors.seatunnel.hive.utils.HiveMetaStoreProxy.<init>(HiveMetaStoreProxy.java:84)
    at org.apache.seatunnel.connectors.seatunnel.hive.utils.HiveMetaStoreProxy.getInstance(HiveMetaStoreProxy.java:113)
    at org.apache.seatunnel.connectors.seatunnel.hive.config.HiveConfig.getTableInfo(HiveConfig.java:73)
    at org.apache.seatunnel.connectors.seatunnel.hive.source.HiveSource.prepare(HiveSource.java:124)
    at org.apache.seatunnel.engine.core.parse.JobConfigParser.parseSource(JobConfigParser.java:83)
    at org.apache.seatunnel.engine.core.parse.MultipleTableJobConfigParser.parseSource(MultipleTableJobConfigParser.java:356)
    at org.apache.seatunnel.engine.core.parse.MultipleTableJobConfigParser.parse(MultipleTableJobConfigParser.java:209)
    at org.apache.seatunnel.engine.client.job.ClientJobExecutionEnvironment.getLogicalDag(ClientJobExecutionEnvironment.java:114)
    at org.apache.seatunnel.engine.client.job.ClientJobExecutionEnvironment.execute(ClientJobExecutionEnvironment.java:182)
    at org.apache.seatunnel.core.starter.seatunnel.command.ClientExecuteCommand.execute(ClientExecuteCommand.java:158)
    at org.apache.seatunnel.core.starter.SeaTunnel.run(SeaTunnel.java:40)
    at org.apache.seatunnel.core.starter.seatunnel.SeaTunnelClient.main(SeaTunnelClient.java:34)

uniding commented 1 week ago

I also encountered the same error. My Hive version is 2.3.7.