hortonworks-spark / spark-llap

Apache License 2.0

GSS initiate failed #280

Open hwkim94 opened 4 years ago

hwkim94 commented 4 years ago

Hello, when I connect PySpark to Hive using HWC, I get the error below (my cluster is Kerberized):

Py4JJavaError: An error occurred while calling o164.executeQuery.
: java.lang.RuntimeException: java.io.IOException: java.sql.SQLException: Could not open client transport for any of the Server URI's in ZooKeeper: GSS initiate failed
    at com.hortonworks.spark.sql.hive.llap.HiveWarehouseDataSourceReader.readSchema(HiveWarehouseDataSourceReader.java:130)
    at org.apache.spark.sql.execution.datasources.v2.DataSourceV2Relation$.apply(DataSourceV2Relation.scala:56)
    at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:224)
    at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:164)
    at com.hortonworks.spark.sql.hive.llap.HiveWarehouseSessionImpl.executeQuery(HiveWarehouseSessionImpl.java:62)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
    at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
    at py4j.Gateway.invoke(Gateway.java:282)
    at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
    at py4j.commands.CallCommand.execute(CallCommand.java:79)
    at py4j.GatewayConnection.run(GatewayConnection.java:238)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: java.sql.SQLException: Could not open client transport for any of the Server URI's in ZooKeeper: GSS initiate failed
    at org.apache.hadoop.hive.llap.LlapBaseInputFormat.getSplits(LlapBaseInputFormat.java:298)
    at com.hortonworks.spark.sql.hive.llap.HiveWarehouseDataSourceReader.getTableSchema(HiveWarehouseDataSourceReader.java:110)
    at com.hortonworks.spark.sql.hive.llap.HiveWarehouseDataSourceReader.readSchema(HiveWarehouseDataSourceReader.java:124)
    ... 15 more
Caused by: java.sql.SQLException: Could not open client transport for any of the Server URI's in ZooKeeper: GSS initiate failed
    at shadehive.org.apache.hive.jdbc.HiveConnection.<init>(HiveConnection.java:333)
    at shadehive.org.apache.hive.jdbc.HiveDriver.connect(HiveDriver.java:107)
    at java.sql.DriverManager.getConnection(DriverManager.java:664)
    at java.sql.DriverManager.getConnection(DriverManager.java:247)
    at org.apache.hadoop.hive.llap.LlapBaseInputFormat.getSplits(LlapBaseInputFormat.java:272)
    ... 17 more
Caused by: org.apache.thrift.transport.TTransportException: GSS initiate failed
    at org.apache.thrift.transport.TSaslTransport.sendAndThrowMessage(TSaslTransport.java:232)
    at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:316)
    at org.apache.thrift.transport.TSaslClientTransport.open(TSaslClientTransport.java:37)
    at shadehive.org.apache.hadoop.hive.metastore.security.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:51)
    at shadehive.org.apache.hadoop.hive.metastore.security.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:48)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
    at shadehive.org.apache.hadoop.hive.metastore.security.TUGIAssumingTransport.open(TUGIAssumingTransport.java:48)
    at shadehive.org.apache.hive.jdbc.HiveConnection.openTransport(HiveConnection.java:420)
    at shadehive.org.apache.hive.jdbc.HiveConnection.<init>(HiveConnection.java:301)
    ... 21 more

(<class 'py4j.protocol.Py4JJavaError'>, Py4JJavaError(u'An error occurred while calling o164.executeQuery.\n', JavaObject id=o165), <traceback object at 0x7f808cc67b48>)

But I have already applied all the settings from the Zeppelin configuration guide ( https://docs.cloudera.com/HDPDocuments/HDP3/HDP-3.1.0/integrating-hive/content/hive_zeppelin_configuration_hivewarehouseconnector.html ), together with spark.yarn.security.credentials.hiveserver2.enabled=true. Hive queries work; only the PySpark code that uses HWC fails.

%spark2.pyspark

sc.addPyFile('/usr/hdp/current/hive_warehouse_connector/pyspark_hwc-1.0.0.3.1.0.0-78.zip')
from pyspark_llap import HiveWarehouseSession

hive = HiveWarehouseSession.session(spark).build()
hive.executeQuery("select * from db123.table123 limit 10").show(10)
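Since the report depends on several interpreter properties being set correctly, a quick sanity check of the effective configuration may help narrow things down. The sketch below is only an illustration: the property names in `REQUIRED_KEYS` are an assumption based on the HDP 3.1 HWC setup guide, `check_hwc_conf` is a hypothetical helper (not part of HWC), and the example values are placeholders.

```python
# Hypothetical helper: flags missing HWC properties and mistyped boolean values.
# ASSUMPTION: this key list follows the HDP 3.1 HWC setup guide; adjust it to your docs.
REQUIRED_KEYS = [
    "spark.sql.hive.hiveserver2.jdbc.url",
    "spark.sql.hive.hiveserver2.jdbc.url.principal",
    "spark.datasource.hive.warehouse.metastoreUri",
    "spark.datasource.hive.warehouse.load.staging.dir",
    "spark.hadoop.hive.llap.daemon.service.hosts",
    "spark.hadoop.hive.zookeeper.quorum",
]


def check_hwc_conf(conf):
    """Return a list of human-readable problems found in a {key: value} dict."""
    problems = []
    # Every required key must be present and non-empty.
    for key in REQUIRED_KEYS:
        if not conf.get(key, "").strip():
            problems.append("missing: " + key)
    # Any "*.enabled" flag must be spelled exactly true or false.
    for key, value in conf.items():
        if key.endswith(".enabled") and value.strip().lower() not in ("true", "false"):
            problems.append("not a boolean: %s=%s" % (key, value))
    return problems


# Placeholder configuration with one mistyped flag, for illustration only.
example = {
    "spark.sql.hive.hiveserver2.jdbc.url": "jdbc:hive2://zk-host:2181/",
    "spark.yarn.security.credentials.hiveserver2.enabled": "ture",
}

for problem in check_hwc_conf(example):
    print(problem)
```

In a Zeppelin paragraph, the dict could be built from the live session with `dict(spark.sparkContext.getConf().getAll())`, so the check runs against what the interpreter actually received rather than what was typed into the settings page.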

my env is

How can I connect Hive and Spark? Please suggest a solution. Thanks in advance.

hwkim94 commented 4 years ago

If I remove the 'user impersonate' option from the Spark interpreter, this error no longer occurs, but a different error appears:

Py4JJavaError: An error occurred while calling o116.showString.
: java.lang.RuntimeException: java.lang.RuntimeException: java.io.IOException: shadehive.org.apache.hive.service.cli.HiveSQLException: java.io.IOException: org.apache.hadoop.hive.ql.metadata.HiveException: java.io.IOException: java.lang.NullPointerException
    at com.hortonworks.spark.sql.hive.llap.HiveWarehouseDataSourceReader.createBatchDataReaderFactories(HiveWarehouseDataSourceReader.java:166)
    at org.apache.spark.sql.execution.datasources.v2.DataSourceV2ScanExec.inputRDD$lzycompute(DataSourceV2ScanExec.scala:64)
    at org.apache.spark.sql.execution.datasources.v2.DataSourceV2ScanExec.inputRDD(DataSourceV2ScanExec.scala:60)
    at org.apache.spark.sql.execution.datasources.v2.DataSourceV2ScanExec.inputRDDs(DataSourceV2ScanExec.scala:79)
    at org.apache.spark.sql.execution.WholeStageCodegenExec.doExecute(WholeStageCodegenExec.scala:605)
    at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:131)
    at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:127)
    at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:155)
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
    at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:152)
    at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:127)
    at org.apache.spark.sql.execution.SparkPlan.getByteArrayRdd(SparkPlan.scala:247)
    at org.apache.spark.sql.execution.SparkPlan.executeTake(SparkPlan.scala:337)
    at org.apache.spark.sql.execution.CollectLimitExec.executeCollect(limit.scala:38)
    at org.apache.spark.sql.Dataset.org$apache$spark$sql$Dataset$$collectFromPlan(Dataset.scala:3278)
    at org.apache.spark.sql.Dataset$$anonfun$head$1.apply(Dataset.scala:2489)
    at org.apache.spark.sql.Dataset$$anonfun$head$1.apply(Dataset.scala:2489)
    at org.apache.spark.sql.Dataset$$anonfun$52.apply(Dataset.scala:3259)
    at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:77)
    at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3258)
    at org.apache.spark.sql.Dataset.head(Dataset.scala:2489)
    at org.apache.spark.sql.Dataset.take(Dataset.scala:2703)
    at org.apache.spark.sql.Dataset.showString(Dataset.scala:254)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
    at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
    at py4j.Gateway.invoke(Gateway.java:282)
    at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
    at py4j.commands.CallCommand.execute(CallCommand.java:79)
    at py4j.GatewayConnection.run(GatewayConnection.java:238)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.RuntimeException: java.io.IOException: shadehive.org.apache.hive.service.cli.HiveSQLException: java.io.IOException: org.apache.hadoop.hive.ql.metadata.HiveException: java.io.IOException: java.lang.NullPointerException
    at com.hortonworks.spark.sql.hive.llap.HiveWarehouseDataSourceReader.getSplitsFactories(HiveWarehouseDataSourceReader.java:182)
    at com.hortonworks.spark.sql.hive.llap.HiveWarehouseDataSourceReader.createBatchDataReaderFactories(HiveWarehouseDataSourceReader.java:162)
    ... 33 more
Caused by: java.io.IOException: shadehive.org.apache.hive.service.cli.HiveSQLException: java.io.IOException: org.apache.hadoop.hive.ql.metadata.HiveException: java.io.IOException: java.lang.NullPointerException
    at org.apache.hadoop.hive.llap.LlapBaseInputFormat.getSplits(LlapBaseInputFormat.java:298)
    at com.hortonworks.spark.sql.hive.llap.HiveWarehouseDataSourceReader.getSplitsFactories(HiveWarehouseDataSourceReader.java:176)
    ... 34 more
Caused by: shadehive.org.apache.hive.service.cli.HiveSQLException: java.io.IOException: org.apache.hadoop.hive.ql.metadata.HiveException: java.io.IOException: java.lang.NullPointerException
    at shadehive.org.apache.hive.jdbc.Utils.verifySuccess(Utils.java:300)
    at shadehive.org.apache.hive.jdbc.Utils.verifySuccessWithInfo(Utils.java:286)
    at shadehive.org.apache.hive.jdbc.HiveQueryResultSet.next(HiveQueryResultSet.java:379)
    at org.apache.hadoop.hive.llap.LlapBaseInputFormat.getSplits(LlapBaseInputFormat.java:280)
    ... 35 more
Caused by: org.apache.hive.service.cli.HiveSQLException: java.io.IOException: org.apache.hadoop.hive.ql.metadata.HiveException: java.io.IOException: java.lang.NullPointerException
    at org.apache.hive.service.cli.operation.SQLOperation.getNextRowSet(SQLOperation.java:478)
    at org.apache.hive.service.cli.operation.OperationManager.getOperationNextRowSet(OperationManager.java:328)
    at org.apache.hive.service.cli.session.HiveSessionImpl.fetchResults(HiveSessionImpl.java:952)
    at sun.reflect.GeneratedMethodAccessor64.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:78)
    at org.apache.hive.service.cli.session.HiveSessionProxy.access$000(HiveSessionProxy.java:36)
    at org.apache.hive.service.cli.session.HiveSessionProxy$1.run(HiveSessionProxy.java:63)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1729)
    at org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:59)
    at com.sun.proxy.$Proxy75.fetchResults(Unknown Source)
    at org.apache.hive.service.cli.CLIService.fetchResults(CLIService.java:564)
    at org.apache.hive.service.cli.thrift.ThriftCLIService.FetchResults(ThriftCLIService.java:792)
    at org.apache.hive.service.rpc.thrift.TCLIService$Processor$FetchResults.getResult(TCLIService.java:1837)
    at org.apache.hive.service.rpc.thrift.TCLIService$Processor$FetchResults.getResult(TCLIService.java:1822)
    at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
    at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
    at org.apache.hadoop.hive.metastore.security.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge.java:647)
    at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    ... 1 more
Caused by: java.io.IOException: org.apache.hadoop.hive.ql.metadata.HiveException: java.io.IOException: java.lang.NullPointerException
    at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:162)
    at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:2738)
    at org.apache.hadoop.hive.ql.reexec.ReExecDriver.getResults(ReExecDriver.java:229)
    at org.apache.hive.service.cli.operation.SQLOperation.getNextRowSet(SQLOperation.java:473)
    ... 24 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.io.IOException: java.lang.NullPointerException
    at org.apache.hadoop.hive.ql.udf.generic.GenericUDTFGetSplits.process(GenericUDTFGetSplits.java:225)
    at org.apache.hadoop.hive.ql.exec.UDTFOperator.process(UDTFOperator.java:116)
    at org.apache.hadoop.hive.ql.exec.Operator.baseForward(Operator.java:994)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:940)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:927)
    at org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:95)
    at org.apache.hadoop.hive.ql.exec.Operator.baseForward(Operator.java:994)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:940)
    at org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:125)
    at org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:519)
    at org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:511)
    at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:146)
    ... 27 more
Caused by: java.io.IOException: java.lang.NullPointerException
    at org.apache.hadoop.hive.ql.udf.generic.GenericUDTFGetSplits.getSplits(GenericUDTFGetSplits.java:498)
    at org.apache.hadoop.hive.ql.udf.generic.GenericUDTFGetSplits.process(GenericUDTFGetSplits.java:210)
    ... 38 more
Caused by: java.lang.NullPointerException: null
    at org.apache.hadoop.hive.llap.LlapUtil.generateClusterName(LlapUtil.java:117)
    at org.apache.hadoop.hive.llap.coordinator.LlapCoordinator.getLlapSigner(LlapCoordinator.java:103)
    at org.apache.hadoop.hive.ql.udf.generic.GenericUDTFGetSplits.getSplits(GenericUDTFGetSplits.java:441)
    ... 39 more

(<class 'py4j.protocol.Py4JJavaError'>, Py4JJavaError(u'An error occurred while calling o116.showString.\n', JavaObject id=o117), <traceback object at 0x7f563ecceb48>)

These are my Zeppelin settings (two screenshots attached).
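Both traces in this thread are long "Caused by" chains, and the useful part is usually the innermost entry. As a reading aid, here is a minimal sketch that pulls out the innermost cause and the first frame below it. `root_cause` is a hypothetical helper written for this illustration, and the sample is a shortened excerpt of the second trace.

```python
import re

# Matches "Caused by: <exception class>[: <message>]" lines of a Java stack trace.
CAUSED_BY = re.compile(r"^Caused by: ([\w.$]+)(?::\s*(.*))?$")


def root_cause(trace):
    """Return (exception_class, message, first_frame) for the innermost
    'Caused by' entry, or None if the trace contains no such line."""
    cause = None
    lines = [line.strip() for line in trace.splitlines()]
    for i, line in enumerate(lines):
        match = CAUSED_BY.match(line)
        if match:
            # The frame directly under a "Caused by" line is where it was thrown.
            nxt = lines[i + 1] if i + 1 < len(lines) else ""
            frame = nxt[3:] if nxt.startswith("at ") else ""
            # Later matches overwrite earlier ones, so the last one wins.
            cause = (match.group(1), match.group(2) or "", frame)
    return cause


sample = """\
Caused by: java.io.IOException: java.lang.NullPointerException
    at org.apache.hadoop.hive.ql.udf.generic.GenericUDTFGetSplits.getSplits(GenericUDTFGetSplits.java:498)
    ... 38 more
Caused by: java.lang.NullPointerException: null
    at org.apache.hadoop.hive.llap.LlapUtil.generateClusterName(LlapUtil.java:117)
    ... 39 more
"""

print(root_cause(sample))
```

For the excerpt above, this points at the `NullPointerException` thrown in `LlapUtil.generateClusterName`, which is the frame to look at first when searching for the failing setting.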