apache / iceberg

Apache Iceberg
https://iceberg.apache.org/
Apache License 2.0

iceberg HiveCatalog insert exception of GSS initiate failed #3127

Closed: Neo966 closed this issue 1 month ago

Neo966 commented 3 years ago

The operations are as follows:

use test;
add jar iceberg-hive-runtime-0.12.0.jar;
add jar datanucleus-core-4.1.16.jar;

CREATE TABLE iceberg_hive_test (
  id bigint, name string
) PARTITIONED BY (
  dept string
) STORED BY 'org.apache.iceberg.mr.hive.HiveIcebergStorageHandler';

insert into iceberg_hive_test values(1, 'test1', 'test2');

The exception is as follows:

2021-09-16 11:43:14,346 INFO [Thread-73] org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator: Setting job diagnostics to Job commit failed: org.apache.iceberg.hive.RuntimeMetaException: Failed to connect to Hive Metastore at org.apache.iceberg.hive.HiveClientPool.newClient(HiveClientPool.java:62) at org.apache.iceberg.hive.HiveClientPool.newClient(HiveClientPool.java:32) at org.apache.iceberg.ClientPoolImpl.get(ClientPoolImpl.java:118) at org.apache.iceberg.ClientPoolImpl.run(ClientPoolImpl.java:49) at org.apache.iceberg.hive.CachedClientPool.run(CachedClientPool.java:76) at org.apache.iceberg.hive.HiveTableOperations.doRefresh(HiveTableOperations.java:181) at org.apache.iceberg.BaseMetastoreTableOperations.refresh(BaseMetastoreTableOperations.java:94) at org.apache.iceberg.BaseMetastoreTableOperations.current(BaseMetastoreTableOperations.java:77) at org.apache.iceberg.BaseMetastoreCatalog.loadTable(BaseMetastoreCatalog.java:93) at org.apache.iceberg.mr.Catalogs.loadTable(Catalogs.java:115) at org.apache.iceberg.mr.Catalogs.loadTable(Catalogs.java:105) at org.apache.iceberg.mr.hive.HiveIcebergOutputCommitter.commitTable(HiveIcebergOutputCommitter.java:280) at org.apache.iceberg.mr.hive.HiveIcebergOutputCommitter.lambda$commitJob$2(HiveIcebergOutputCommitter.java:193) at org.apache.iceberg.util.Tasks$Builder.runTaskWithRetry(Tasks.java:405) at org.apache.iceberg.util.Tasks$Builder.runSingleThreaded(Tasks.java:214) at org.apache.iceberg.util.Tasks$Builder.run(Tasks.java:198) at org.apache.iceberg.util.Tasks$Builder.run(Tasks.java:190) at org.apache.iceberg.mr.hive.HiveIcebergOutputCommitter.commitJob(HiveIcebergOutputCommitter.java:188) at org.apache.hadoop.mapred.OutputCommitter.commitJob(OutputCommitter.java:291) at org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler$EventProcessor.handleJobCommit(CommitterEventHandler.java:286) at org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler$EventProcessor.run(CommitterEventHandler.java:238) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) Caused by: MetaException(message:Could not connect to meta store using any of the URIs provided. 
Most recent failure: org.apache.thrift.transport.TTransportException: GSS initiate failed at org.apache.thrift.transport.TSaslTransport.sendAndThrowMessage(TSaslTransport.java:232) at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:316) at org.apache.thrift.transport.TSaslClientTransport.open(TSaslClientTransport.java:37) at org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:52) at org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:49) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1893) at org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport.open(TUGIAssumingTransport.java:49) at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.open(HiveMetaStoreClient.java:562) at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.(HiveMetaStoreClient.java:351) at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.(HiveMetaStoreClient.java:213) at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:423) at org.apache.iceberg.common.DynConstructors$Ctor.newInstanceChecked(DynConstructors.java:60) at org.apache.iceberg.common.DynConstructors$Ctor.newInstance(DynConstructors.java:73) at org.apache.iceberg.hive.HiveClientPool.newClient(HiveClientPool.java:53) at org.apache.iceberg.hive.HiveClientPool.newClient(HiveClientPool.java:32) at org.apache.iceberg.ClientPoolImpl.get(ClientPoolImpl.java:118) at org.apache.iceberg.ClientPoolImpl.run(ClientPoolImpl.java:49) at org.apache.iceberg.hive.CachedClientPool.run(CachedClientPool.java:76) at org.apache.iceberg.hive.HiveTableOperations.doRefresh(HiveTableOperations.java:181) at org.apache.iceberg.BaseMetastoreTableOperations.refresh(BaseMetastoreTableOperations.java:94) at org.apache.iceberg.BaseMetastoreTableOperations.current(BaseMetastoreTableOperations.java:77) at org.apache.iceberg.BaseMetastoreCatalog.loadTable(BaseMetastoreCatalog.java:93) at org.apache.iceberg.mr.Catalogs.loadTable(Catalogs.java:115) at org.apache.iceberg.mr.Catalogs.loadTable(Catalogs.java:105) at org.apache.iceberg.mr.hive.HiveIcebergOutputCommitter.commitTable(HiveIcebergOutputCommitter.java:280) at org.apache.iceberg.mr.hive.HiveIcebergOutputCommitter.lambda$commitJob$2(HiveIcebergOutputCommitter.java:193) at org.apache.iceberg.util.Tasks$Builder.runTaskWithRetry(Tasks.java:405) at org.apache.iceberg.util.Tasks$Builder.runSingleThreaded(Tasks.java:214) at org.apache.iceberg.util.Tasks$Builder.run(Tasks.java:198) at org.apache.iceberg.util.Tasks$Builder.run(Tasks.java:190) at org.apache.iceberg.mr.hive.HiveIcebergOutputCommitter.commitJob(HiveIcebergOutputCommitter.java:188) at org.apache.hadoop.mapred.OutputCommitter.commitJob(OutputCommitter.java:291) at org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler$EventProcessor.handleJobCommit(CommitterEventHandler.java:286) at org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler$EventProcessor.run(CommitterEventHandler.java:238) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) ) at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.open(HiveMetaStoreClient.java:610) at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.(HiveMetaStoreClient.java:351) at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.(HiveMetaStoreClient.java:213) at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:423) at org.apache.iceberg.common.DynConstructors$Ctor.newInstanceChecked(DynConstructors.java:60) at org.apache.iceberg.common.DynConstructors$Ctor.newInstance(DynConstructors.java:73) at org.apache.iceberg.hive.HiveClientPool.newClient(HiveClientPool.java:53) ... 23 more

Neo966 commented 3 years ago

Executing this via the Hive shell fails with the Kerberos error, while executing it via Hive JDBC (beeline) succeeds.

pvary commented 3 years ago

Seems like a problem with the Kerberos communication in the Hive shell. Are both the CREATE TABLE and the INSERT issued from the Hive shell?

Neo966 commented 3 years ago

Seems like a problem with the Kerberos communication in the Hive shell. Are both the CREATE TABLE and the INSERT issued from the Hive shell?

@pvary yes, CREATE TABLE works but INSERT does not work in the Hive shell.

To summarize (see also the sketch below):
INSERT works but SELECT does not work in Hive JDBC (beeline), see #3146.
SELECT works but INSERT does not work in the Hive shell.
CREATE TABLE works in both Hive JDBC and the Hive shell.
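To help narrow down why the Hive shell path fails while beeline works, here is a minimal diagnostic sketch (not from this issue) that prints the security state and delegation tokens of the current process using Hadoop's public UserGroupInformation API. The interpretation in the comments about what beeline-submitted jobs carry is an assumption.

// Minimal diagnostic sketch, assuming a plain Hadoop client classpath.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.security.UserGroupInformation;
import org.apache.hadoop.security.token.Token;

public class KerberosCredentialCheck {
  public static void main(String[] args) throws Exception {
    // Load core-site.xml etc. from the classpath so hadoop.security.authentication is honored.
    UserGroupInformation.setConfiguration(new Configuration());

    UserGroupInformation ugi = UserGroupInformation.getCurrentUser();
    System.out.println("security enabled:   " + UserGroupInformation.isSecurityEnabled());
    System.out.println("auth method:        " + ugi.getAuthenticationMethod());
    System.out.println("has Kerberos creds: " + ugi.hasKerberosCredentials());

    // Delegation tokens attached to this UGI. A job going through HiveServer2
    // typically carries an HMS delegation token here, which is what the MR
    // committer needs when it has no Kerberos TGT of its own (assumption).
    for (Token<?> token : ugi.getTokens()) {
      System.out.println("token: " + token.getKind() + " service=" + token.getService());
    }
  }
}

Running this in the failing committer environment versus a beeline-submitted job should show whether the HMS token is missing in the Hive shell case.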

pvary commented 3 years ago

Could you please try this with Hive 2.3.8? We do all of the testing with that version.

Thanks, Peter

pvary commented 3 years ago

Or alternatively 3.1.2

ywww commented 2 years ago

Hive 3.1.2 has the same problem:


hive> drop table test.hive_iceberg_test1;
OK
Time taken: 2.002 seconds
hive> CREATE TABLE test.hive_iceberg_test1 (
  id bigint, name string
) PARTITIONED BY (
  ds string
) STORED BY 'org.apache.iceberg.mr.hive.HiveIcebergStorageHandler';
OK
Time taken: 1.229 seconds
hive> insert into table test.hive_iceberg_test1 select 1,'1',1;;
Query ID = root_20211216194143_6388a8d2-9b42-47a5-8685-587d43d0eaaf
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_1639039128241_0038, Tracking URL = http://emr-header-1.cluster-267641:20888/proxy/application_1639039128241_0038/
Kill Command = /usr/local/complat/adp/hadoop/bin/mapred job -kill job_1639039128241_0038
Hadoop job information for Stage-2: number of mappers: 1; number of reducers: 0
2021-12-16 19:41:56,284 Stage-2 map = 0%, reduce = 0%
2021-12-16 19:42:03,664 Stage-2 map = 100%, reduce = 0%, Cumulative CPU 2.62 sec
MapReduce Total cumulative CPU time: 2 seconds 620 msec
Ended Job = job_1639039128241_0038 with errors
Error during job, obtaining debugging information...
Job Tracking URL: http://emr-header-1.cluster-267641:20888/proxy/application_1639039128241_0038/
Examining task ID: task_1639039128241_0038_m_000000 (and more) from job job_1639039128241_0038
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
MapReduce Jobs Launched:
Stage-Stage-2: Map: 1   Cumulative CPU: 2.62 sec   HDFS Read: 178200   HDFS Write: 3484   FAIL
Total MapReduce CPU Time Spent: 2 seconds 620 msec

2021-12-16 19:42:03,687 INFO [CommitterEvent Processor #1] org.apache.hadoop.hive.metastore.HiveMetaStoreClient: HMSC::open(): Could not find delegation token. Creating KERBEROS-based thrift connection. 2021-12-16 19:42:03,718 ERROR [CommitterEvent Processor #1] org.apache.thrift.transport.TSaslTransport: SASL negotiation failure javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)] at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:211) at org.apache.thrift.transport.TSaslClientTransport.handleSaslStartMessage(TSaslClientTransport.java:94) at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:271) at org.apache.thrift.transport.TSaslClientTransport.open(TSaslClientTransport.java:37) at org.apache.hadoop.hive.metastore.security.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:51) at org.apache.hadoop.hive.metastore.security.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:48) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1732) at org.apache.hadoop.hive.metastore.security.TUGIAssumingTransport.open(TUGIAssumingTransport.java:48) at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.open(HiveMetaStoreClient.java:516) at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.(HiveMetaStoreClient.java:224) at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.(HiveMetaStoreClient.java:137) at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:423) at org.apache.iceberg.common.DynConstructors$Ctor.newInstanceChecked(DynConstructors.java:60) at org.apache.iceberg.common.DynConstructors$Ctor.newInstance(DynConstructors.java:73) at org.apache.iceberg.hive.HiveClientPool.newClient(HiveClientPool.java:53) at org.apache.iceberg.hive.HiveClientPool.newClient(HiveClientPool.java:32) at org.apache.iceberg.ClientPoolImpl.get(ClientPoolImpl.java:118) at org.apache.iceberg.ClientPoolImpl.run(ClientPoolImpl.java:49) at org.apache.iceberg.hive.CachedClientPool.run(CachedClientPool.java:76) at org.apache.iceberg.hive.HiveTableOperations.doRefresh(HiveTableOperations.java:181) at org.apache.iceberg.BaseMetastoreTableOperations.refresh(BaseMetastoreTableOperations.java:94) at org.apache.iceberg.BaseMetastoreTableOperations.current(BaseMetastoreTableOperations.java:77) at org.apache.iceberg.BaseMetastoreCatalog.loadTable(BaseMetastoreCatalog.java:93) at org.apache.iceberg.mr.Catalogs.loadTable(Catalogs.java:115) at org.apache.iceberg.mr.Catalogs.loadTable(Catalogs.java:105) at org.apache.iceberg.mr.hive.HiveIcebergOutputCommitter.commitTable(HiveIcebergOutputCommitter.java:280) at org.apache.iceberg.mr.hive.HiveIcebergOutputCommitter.lambda$commitJob$2(HiveIcebergOutputCommitter.java:193) at org.apache.iceberg.util.Tasks$Builder.runTaskWithRetry(Tasks.java:405) at org.apache.iceberg.util.Tasks$Builder.runSingleThreaded(Tasks.java:214) at org.apache.iceberg.util.Tasks$Builder.run(Tasks.java:198) at org.apache.iceberg.util.Tasks$Builder.run(Tasks.java:190) at 
org.apache.iceberg.mr.hive.HiveIcebergOutputCommitter.commitJob(HiveIcebergOutputCommitter.java:188) at org.apache.hadoop.mapred.OutputCommitter.commitJob(OutputCommitter.java:291) at org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler$EventProcessor.handleJobCommit(CommitterEventHandler.java:286) at org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler$EventProcessor.run(CommitterEventHandler.java:238) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) Caused by: GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt) at sun.security.jgss.krb5.Krb5InitCredential.getInstance(Krb5InitCredential.java:162) at sun.security.jgss.krb5.Krb5MechFactory.getCredentialElement(Krb5MechFactory.java:122) at sun.security.jgss.krb5.Krb5MechFactory.getMechanismContext(Krb5MechFactory.java:189) at sun.security.jgss.GSSManagerImpl.getMechanismContext(GSSManagerImpl.java:224) at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:212) at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:179) at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:192)

ywww commented 2 years ago

The same problem here.

pvary commented 2 years ago

Which execution engine are you using? Are you still using Iceberg 0.12?

There is a little section on the Hive documentation page describing the supported engines.

We have unit tests for different engines, but not for the Hive CLI or Kerberos. We test in-house with Kerberos, Tez, and Hive 4, and it works fine. It would be good to have a clear understanding of the versions/engines/environment where you face this issue.

Thanks, Peter

zhongqiangczq commented 1 year ago

Using Hive CLI + MR + INSERT INTO, I also hit the same problem: "javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]". However, with Hive CLI + Tez + INSERT INTO there are no errors, but no data is inserted into the table.

lurnagao-dahua commented 1 year ago

The same problem occurs with Hive CLI (2.3.7) + MR + INSERT INTO an Iceberg table (0.13.2).

hzxiongyinke commented 11 months ago

The root cause is that the Hive CLI (CliDriver) does not write the HMS delegation token into the UGI, while HiveServer2 does.
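If that is the cause, the missing step is roughly the following: obtain an HMS delegation token over the already-Kerberos-authenticated client connection and attach it to the UGI so that the MR committer can open the metastore connection without a TGT. The sketch below is illustrative only and is not the actual Hive or Iceberg code path; HiveMetaStoreClient.getDelegationToken and the Hadoop Token/UserGroupInformation calls are real APIs, but the service alias handling and where this would have to run are assumptions.

// Illustrative sketch, not the project's fix: fetch an HMS delegation token
// and add it to the current UGI, roughly what HiveServer2 does for its jobs.
import org.apache.hadoop.hive.conf.HiveConf;
import org.apache.hadoop.hive.metastore.HiveMetaStoreClient;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.security.UserGroupInformation;
import org.apache.hadoop.security.token.Token;
import org.apache.hadoop.security.token.TokenIdentifier;

public class AttachHmsDelegationToken {
  public static void main(String[] args) throws Exception {
    HiveConf conf = new HiveConf();
    UserGroupInformation ugi = UserGroupInformation.getCurrentUser();

    // Ask the metastore for a delegation token owned by the current user.
    HiveMetaStoreClient msc = new HiveMetaStoreClient(conf);
    String tokenStr = msc.getDelegationToken(ugi.getUserName(), ugi.getUserName());
    msc.close();

    // Decode the token and add it to the UGI credentials. The service name
    // should match whatever token signature the later metastore connection
    // looks up (hive.metastore.token.signature); this is an assumption.
    Token<TokenIdentifier> token = new Token<>();
    token.decodeFromUrlString(tokenStr);
    token.setService(new Text(conf.get("hive.metastore.token.signature", "")));
    ugi.addToken(token);
  }
}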

lizu18xz commented 9 months ago

The root cause is that the Hive CLI (CliDriver) does not write the HMS delegation token into the UGI, while HiveServer2 does.

Hello, may I ask when this problem will be resolved?

ma311199 commented 9 months ago

Using CDH Hive 2.1.1 + Iceberg 0.13.1 + Kerberos. May I ask for a solution to this problem?

2023-12-11 10:38:12,167 ERROR [CommitterEvent Processor #1] org.apache.thrift.transport.TSaslTransport: SASL negotiation failure javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]

github-actions[bot] commented 1 month ago

This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in the next 14 days if no further activity occurs. To permanently prevent this issue from being considered stale, add the label 'not-stale', but commenting on the issue is preferred when possible.

github-actions[bot] commented 1 month ago

This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'.