apache / hudi

Upserts, Deletes And Incremental Processing on Big Data.
https://hudi.apache.org/
Apache License 2.0

How to use Kerberos with Hudi integrated with Flink? Data cannot be written to HDFS. #9636

Open FWLamb opened 1 year ago

FWLamb commented 1 year ago

[Four screenshots attached; the fourth shows the Kerberos settings in the Flink configuration.]

I have configured the Kerberos-related parameters in the Flink configuration file and ran the kinit command before starting the SQL client, but writing to HDFS still fails.
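For reference, this is roughly what the Kerberos-related options in `flink-conf.yaml` look like; the principal, keytab path, and contexts below are placeholders, not values taken from the screenshots:

```yaml
# Kerberos authentication via keytab (placeholder principal and path)
security.kerberos.login.use-ticket-cache: true
security.kerberos.login.keytab: /etc/security/keytabs/flink.keytab
security.kerberos.login.principal: flink@EXAMPLE.COM
# JAAS login contexts the credentials are provided to
security.kerberos.login.contexts: Client,KafkaClient
```

Note that these options configure the Flink cluster processes; whether the SQL client process itself picks them up is exactly the question discussed below.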

Environment Description

danny0405 commented 1 year ago

Did you configure the Kerberos credentials in your Flink configuration?

FWLamb commented 1 year ago

Did you configure the Kerberos credentials in your Flink configuration?

Yes, please take a look at the fourth picture above.

danny0405 commented 1 year ago

Did you configure it on all the cluster nodes?

FWLamb commented 1 year ago

Did you configure it on all the cluster nodes?

I only deployed a single node for testing.

FWLamb commented 1 year ago

When I submit a job programmatically it succeeds, but the problem arises when using the SQL client.
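For context, job code that works when submitted programmatically often performs an explicit UGI login before touching HDFS, which the SQL client does not do on its own. A hedged sketch of that pattern (principal and keytab path are placeholders, and it needs a live KDC, so it is not runnable as-is):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.security.UserGroupInformation;

public class KerberosLoginSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Tell Hadoop to use Kerberos instead of simple authentication
        conf.set("hadoop.security.authentication", "kerberos");
        UserGroupInformation.setConfiguration(conf);
        // Explicit keytab login; placeholder principal and keytab path
        UserGroupInformation.loginUserFromKeytab(
                "flink@EXAMPLE.COM", "/etc/security/keytabs/flink.keytab");
        System.out.println("Logged in as: "
                + UserGroupInformation.getLoginUser().getUserName());
    }
}
```

If the programmatic job does something like this while the SQL client relies only on the cluster configuration, that would explain the difference in behavior.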

danny0405 commented 1 year ago

The SQL client has its own config YAMLs; not sure whether we can configure Kerberos there.

FWLamb commented 1 year ago

Flink 1.14 and later versions have removed the sql-client-defaults.yaml file.

jiaojietao commented 1 year ago

Which group:user owns the path "tmp/hudi_flink/t1"? Can you take a screenshot? It looks like your ticket may need to be configured as 'hive:hive'.
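To check the ownership and the active ticket, something like the following (cluster-specific commands; the output depends entirely on your environment, so none is shown):

```shell
# Who owns the Hudi table path on HDFS?
hdfs dfs -ls /tmp/hudi_flink

# Which principal does the current Kerberos ticket belong to?
klist
```

If the ticket principal and the path owner don't line up (e.g. the ticket is for a personal user but the path is owned by hive:hive), writes will fail with permission errors.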

Toroidals commented 1 year ago


I also encountered the same problem. When integrating Flink with Hudi, I created a table in Flink SQL and synchronized it to Hive, but Hive JDBC Kerberos authentication fails during the sync to Hive. My Hive cluster has no user passwords, only Kerberos authentication.

danny0405 commented 1 year ago

Did you try the HMS sync mode instead of JDBC?
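For reference, switching from JDBC to HMS sync is a matter of table options; a hedged sketch, where the metastore URI and table path are placeholders:

```sql
CREATE TABLE t1 (
  id INT,
  name STRING,
  PRIMARY KEY (id) NOT ENFORCED
) WITH (
  'connector' = 'hudi',
  'path' = 'hdfs:///tmp/hudi_flink/t1',
  'hive_sync.enable' = 'true',
  -- talk to the Hive Metastore thrift endpoint instead of HiveServer2 JDBC
  'hive_sync.mode' = 'hms',
  'hive_sync.metastore.uris' = 'thrift://metastore-host:9083'
);
```

HMS mode sidesteps HiveServer2 JDBC authentication entirely, which is why it is worth trying when only Kerberos (no user passwords) is available.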

wangzhenwen681 commented 10 months ago

I have also encountered this problem, and I would like to ask how you solved it.

danny0405 commented 10 months ago

How did you configure the Kerberos credentials in your Hive conf?

wangzhenwen681 commented 10 months ago

JobManager log:

```
2023-12-22 16:35:07,730 WARN org.apache.hadoop.hive.metastore.HiveMetaStoreClient [] - set_ugi() not successful, Likely cause: new client talking to old server. Continuing without it.
org.apache.thrift.transport.TTransportException: null
```

Hive Metastore log:

```
ERROR org.apache.thrift.server.TThreadPoolServer - Error occurred during processing of message.
java.lang.RuntimeException: org.apache.thrift.transport.TTransportException: Invalid status -128
Caused by: org.apache.thrift.transport.TTransportException: Invalid status -128
```

Environment: hudi-0.14.0, flink-1.16.2

The Flink SQL is:

```sql
CREATE TABLE test_hudi (
  id int,
  num int,
  ts int,
  PRIMARY KEY (id) NOT ENFORCED
) PARTITIONED BY (num) WITH (
  'connector' = 'hudi',
  'path' = 'hdfs:///data/test/test_hudi',
  'table.type' = 'COPY_ON_WRITE',
  'hive_sync.enable' = 'true',
  'hive_sync.db' = 'test',
  'hive_sync.table' = 'test_hudi',
  'hive_sync.mode' = 'hms',
  'hive_sync.metastore.uris' = 'thrift://datasophon01:9083',
  'hive_sync.use_jdbc' = 'true',
  'hive_sync.use_kerberos' = 'true',
  'hive_sync.kerberos.krb5.conf' = '/etc/krb5.conf',
  'hive_sync.kerberos.principal' = 'hive/datasophon01@HADOOP.COM',
  'hive_sync.kerberos.keytab.file' = '/etc/security/keytab/hive.service.keytab',
  'hive_sync.kerberos.keytab.name' = 'hive/datasophon01@HADOOP.COM'
);
```
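For what it's worth, `TTransportException: Invalid status -128` from the metastore's thrift server is the classic symptom of a plain (non-SASL) client talking to a SASL/Kerberos-enabled endpoint, or vice versa: both sides must agree on SASL. A hedged sketch of the metastore side in `hive-site.xml`, reusing the principal and keytab from the table options above (whether these exact values apply to this cluster is an assumption):

```xml
<!-- Metastore side: require SASL (Kerberos) on the thrift endpoint -->
<property>
  <name>hive.metastore.sasl.enabled</name>
  <value>true</value>
</property>
<property>
  <name>hive.metastore.kerberos.principal</name>
  <value>hive/datasophon01@HADOOP.COM</value>
</property>
<property>
  <name>hive.metastore.kerberos.keytab.file</name>
  <value>/etc/security/keytab/hive.service.keytab</value>
</property>
```

The Flink client also needs to see a `hive-site.xml` with `hive.metastore.sasl.enabled=true` (e.g. on the classpath or via `HIVE_CONF_DIR`) so the sync client speaks SASL too; otherwise the handshake fails with exactly this "Invalid status" error.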

ad1happy2go commented 10 months ago

There was another issue that a user fixed like this: https://github.com/apache/hudi/issues/9269#issuecomment-1659637537

If it is related and works, we can implement it as a general solution. cc @danny0405