apache / seatunnel

SeaTunnel is a next-generation super high-performance, distributed, massive data integration tool.
https://seatunnel.apache.org/
Apache License 2.0

Reading a Hive managed (internal) table fails; external tables work fine #5141

Closed 1498658503 closed 1 year ago

1498658503 commented 1 year ago

Search before asking

What happened

When reading a Hive table, the job fails with an error. Hive is deployed through Ambari.

SeaTunnel Version

2.3.0

SeaTunnel Config

env {
    execution.planner = "blink"
    job.name = "hive test"
}
source {
    Hive {
      metastore_uri = "thrift://ambari01:9083,thrift://ambari02:9083"
      table_name = "bigdata.test_hive_source"
      result_table_name = "_seatunnel_table_hive_81"
    }
}
transform {
}
sink {
    Console {
      source_table_name = "_seatunnel_table_hive_81"
    }
}

Running Command

./bin/start-seatunnel-flink.sh --config config/source/hive.conf

Error Exception

[root@ambari01 apache-seatunnel-incubating-2.3.0-SNAPSHOT]# ./bin/start-seatunnel-flink.sh --config config/source/hive.conf 
Execute SeaTunnel Flink Job: ${FLINK_HOME}/bin/flink run -c org.apache.seatunnel.core.starter.flink.SeatunnelFlink /home/seatunnel/apache-seatunnel-incubating-2.3.0-SNAPSHOT/starter/seatunnel-flink-starter.jar --config config/source/hive.conf -Dpipeline.name=SeaTunnel
Setting HBASE_CONF_DIR=/etc/hbase/conf because no HBASE_CONF_DIR was set.
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/flink/flink-1.13.6/lib/log4j-slf4j-impl-2.17.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/hdp/3.1.4.0-315/hadoop/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
2023-07-24 12:11:10,070 WARN  org.apache.hadoop.hdfs.shortcircuit.DomainSocketFactory      [] - The short-circuit local reads feature cannot be used because libhadoop cannot be loaded.

------------------------------------------------------------
 The program finished with the following exception:

org.apache.flink.client.program.ProgramInvocationException: The main method caused an error: ErrorCode:[COMMON-09], ErrorDescription:[Get table schema from upstream data failed] - Get table schema from file [hdfs://xjky-ambari/warehouse/tablespace/managed/hive/bigdata.db/test_hive_source/test_par1=b/test_par2=c/delta_0000002_0000002_0000/_orc_acid_version] failed
    at org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:372)
    at org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:222)
    at org.apache.flink.client.ClientUtils.executeProgram(ClientUtils.java:114)
    at org.apache.flink.client.cli.CliFrontend.executeProgram(CliFrontend.java:812)
    at org.apache.flink.client.cli.CliFrontend.run(CliFrontend.java:246)
    at org.apache.flink.client.cli.CliFrontend.parseAndRun(CliFrontend.java:1054)
    at org.apache.flink.client.cli.CliFrontend.lambda$main$10(CliFrontend.java:1132)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1876)
    at org.apache.flink.runtime.security.contexts.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)
    at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:1132)
Caused by: org.apache.seatunnel.connectors.seatunnel.file.exception.FileConnectorException: ErrorCode:[COMMON-09], ErrorDescription:[Get table schema from upstream data failed] - Get table schema from file [hdfs://xjky-ambari/warehouse/tablespace/managed/hive/bigdata.db/test_hive_source/test_par1=b/test_par2=c/delta_0000002_0000002_0000/_orc_acid_version] failed
    at org.apache.seatunnel.connectors.seatunnel.file.hdfs.source.BaseHdfsFileSource.prepare(BaseHdfsFileSource.java:93)
    at org.apache.seatunnel.connectors.seatunnel.hive.source.HiveSource.prepare(HiveSource.java:95)
    at org.apache.seatunnel.core.starter.flink.execution.SourceExecuteProcessor.initializePlugins(SourceExecuteProcessor.java:115)
    at org.apache.seatunnel.core.starter.flink.execution.AbstractPluginExecuteProcessor.<init>(AbstractPluginExecuteProcessor.java:53)
    at org.apache.seatunnel.core.starter.flink.execution.SourceExecuteProcessor.<init>(SourceExecuteProcessor.java:56)
    at org.apache.seatunnel.core.starter.flink.execution.FlinkExecution.<init>(FlinkExecution.java:77)
    at org.apache.seatunnel.core.starter.flink.command.FlinkApiTaskExecuteCommand.execute(FlinkApiTaskExecuteCommand.java:53)
    at org.apache.seatunnel.core.starter.Seatunnel.run(Seatunnel.java:39)
    at org.apache.seatunnel.core.starter.flink.SeatunnelFlink.main(SeatunnelFlink.java:34)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:355)
    ... 11 more
Caused by: org.apache.seatunnel.connectors.seatunnel.file.exception.FileConnectorException: ErrorCode:[COMMON-12], ErrorDescription:[Source reader operation failed, such as (open, close) etc...] - Create orc reader for this file [hdfs://xjky-ambari/warehouse/tablespace/managed/hive/bigdata.db/test_hive_source/test_par1=b/test_par2=c/delta_0000002_0000002_0000/_orc_acid_version] failed
    at org.apache.seatunnel.connectors.seatunnel.file.source.reader.OrcReadStrategy.getSeaTunnelRowTypeInfo(OrcReadStrategy.java:136)
    at org.apache.seatunnel.connectors.seatunnel.file.hdfs.source.BaseHdfsFileSource.prepare(BaseHdfsFileSource.java:90)
    ... 24 more
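The failing path ends in `_orc_acid_version`, a marker file that Hive writes into the base/delta directories of transactional (ACID) managed tables. It is not an ORC data file, which is why `OrcReadStrategy.getSeaTunnelRowTypeInfo` cannot parse it, and why external (non-transactional) tables read fine. A minimal sketch of the general fix direction, using a hypothetical helper (not SeaTunnel's actual code), is to skip hidden and metadata files when scanning a partition directory for ORC data, following the common Hadoop convention that names starting with `_` or `.` are not data:

```java
// Hypothetical sketch: filter out Hive ACID sidecar/metadata files
// (e.g. "_orc_acid_version", "_SUCCESS", ".hidden") before handing
// file paths to an ORC reader.
public class AcidFileFilter {

    // Returns true only for names that can plausibly be ORC data files.
    // Hadoop convention: names starting with "_" or "." are non-data files.
    public static boolean isOrcDataFile(String fileName) {
        return !fileName.startsWith("_") && !fileName.startsWith(".");
    }

    public static void main(String[] args) {
        System.out.println(isOrcDataFile("_orc_acid_version")); // false
        System.out.println(isOrcDataFile("bucket_00000"));      // true
    }
}
```

Note that this only avoids the schema-read crash; fully reading ACID tables also requires merging base and delta files, which plain ORC file readers do not do. As a workaround, making the table non-transactional (or using an external table, as the reporter observed) sidesteps the issue.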

Flink or Spark Version

Flink 1.13.6

Java or Scala Version

jdk 1.8

Screenshots

No response

Are you willing to submit PR?

Code of Conduct

github-actions[bot] commented 1 year ago

This issue has been automatically marked as stale because it has not had recent activity for 30 days. It will be closed in the next 7 days if no further activity occurs.

github-actions[bot] commented 1 year ago

This issue has been closed because it has not received a response for a long time. You can reopen it if you encounter similar problems in the future.

zhilinli123 commented 1 year ago

Hi, have you tried 2.3.3? Does this problem still occur there?