apache / kyuubi

Apache Kyuubi is a distributed and multi-tenant gateway to provide serverless SQL on data warehouses and lakehouses.
https://kyuubi.apache.org/
Apache License 2.0

[Bug][1.7] Flink engine fails to create Hive catalog on non-kerberized environment #4762

Closed. yuanoOo closed this issue 1 year ago.

yuanoOo commented 1 year ago

Describe the bug

Creating a Paimon catalog on the Flink engine fails after I upgraded Kyuubi from 1.6.0 to 1.7.0.

The same statement succeeds on Kyuubi 1.6.0, and it also succeeds in the Flink SQL client.

Creating a Paimon catalog without the Hive configuration succeeds as well, like this:

CREATE CATALOG `paimon` WITH (
    'type' = 'paimon',
    'warehouse' = 'hdfs://Cluster/tmp'
);

So this bug may be related to the Hive integration.

I got the following error:

Caused by: java.lang.RuntimeException: org.apache.flink.table.api.ValidationException: Unable to create catalog 'paimon'.

Catalog options are:
'metastore'='hive'
'type'='paimon'
'uri'='thrift://xxxxx:9083'
'warehouse'='hdfs://Cluster/user/hive/warehouse/'
        at org.apache.flink.table.factories.FactoryUtil.createCatalog(FactoryUtil.java:431) ~[?:?]
        at org.apache.flink.table.api.internal.TableEnvironmentImpl.createCatalog(TableEnvironmentImpl.java:1356) ~[?:?]
        at org.apache.flink.table.api.internal.TableEnvironmentImpl.executeInternal(TableEnvironmentImpl.java:1111) ~[?:?]
        at org.apache.flink.table.client.gateway.local.LocalExecutor.lambda$executeOperation$3(LocalExecutor.java:209) ~[?:?]
        at org.apache.flink.table.client.gateway.context.ExecutionContext.wrapClassLoader(ExecutionContext.java:88) ~[?:?]
        at org.apache.flink.table.client.gateway.local.LocalExecutor.executeOperation(LocalExecutor.java:209) ~[?:?]
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_202]
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_202]
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_202]
        at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_202]
        at org.apache.kyuubi.reflection.DynMethods$UnboundMethod.invokeChecked(DynMethods.java:59) ~[kyuubi-common_2.12-1.7.0.jar:1.7.0]
        at org.apache.kyuubi.reflection.DynMethods$UnboundMethod.invoke(DynMethods.java:75) ~[kyuubi-common_2.12-1.7.0.jar:1.7.0]
        at org.apache.kyuubi.reflection.DynMethods$BoundMethod.invoke(DynMethods.java:175) ~[kyuubi-common_2.12-1.7.0.jar:1.7.0]
        at org.apache.kyuubi.engine.flink.operation.ExecuteStatement.runOperation(ExecuteStatement.scala:162) ~[?:?]
        at org.apache.kyuubi.engine.flink.operation.ExecuteStatement.executeStatement(ExecuteStatement.scala:93) ~[?:?]
Caused by: java.lang.RuntimeException: java.lang.reflect.InvocationTargetException
        at org.apache.paimon.hive.HiveCatalog.createClient(HiveCatalog.java:559) ~[?:?]
        at org.apache.paimon.hive.HiveCatalog.<init>(HiveCatalog.java:112) ~[?:?]
        at org.apache.paimon.hive.HiveCatalogFactory.create(HiveCatalogFactory.java:76) ~[?:?]
        at org.apache.paimon.catalog.CatalogFactory.createCatalog(CatalogFactory.java:113) ~[?:?]
        at org.apache.paimon.flink.FlinkCatalogFactory.createCatalog(FlinkCatalogFactory.java:68) ~[?:?]
        at org.apache.paimon.flink.FlinkCatalogFactory.createCatalog(FlinkCatalogFactory.java:58) ~[?:?]
        at org.apache.paimon.flink.FlinkCatalogFactory.createCatalog(FlinkCatalogFactory.java:32) ~[?:?]
        at org.apache.flink.table.factories.FactoryUtil.createCatalog(FactoryUtil.java:428) ~[?:?]
Caused by: java.lang.RuntimeException: java.lang.reflect.InvocationTargetException:null
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_202]
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_202]
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_202]
        at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_202]
        at org.apache.paimon.hive.HiveCatalog.createClient(HiveCatalog.java:551) ~[?:?]
Caused by: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient
        at org.apache.hadoop.hive.metastore.utils.JavaUtils.newInstance(JavaUtils.java:86) ~[hive-standalone-metastore-3.1.3.jar:3.1.3]
        at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.<init>(RetryingMetaStoreClient.java:95) ~[hive-standalone-metastore-3.1.3.jar:3.1.3]
        at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:148) ~[hive-standalone-metastore-3.1.3.jar:3.1.3]
        at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:119) ~[hive-standalone-metastore-3.1.3.jar:3.1.3]
Caused by: java.lang.RuntimeException: java.lang.reflect.InvocationTargetException:null
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) ~[?:1.8.0_202]
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) ~[?:1.8.0_202]
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) ~[?:1.8.0_202]
        at java.lang.reflect.Constructor.newInstance(Constructor.java:423) ~[?:1.8.0_202]
        at org.apache.hadoop.hive.metastore.utils.JavaUtils.newInstance(JavaUtils.java:84) ~[hive-standalone-metastore-3.1.3.jar:3.1.3]
Caused by: java.lang.RuntimeException: org.apache.hadoop.hive.metastore.api.MetaException:java.lang.NullPointerException
        at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_delegation_token_result$get_delegation_token_resultStandardScheme.read(ThriftHiveMetastore.java) ~[hive-standalone-metastore-3.1.3.jar:3.1.3]
        at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_delegation_token_result$get_delegation_token_resultStandardScheme.read(ThriftHiveMetastore.java) ~[hive-standalone-metastore-3.1.3.jar:3.1.3]
        at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_delegation_token_result.read(ThriftHiveMetastore.java) ~[hive-standalone-metastore-3.1.3.jar:3.1.3]
        at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:86) ~[libthrift-0.9.3.jar:0.9.3]
        at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_get_delegation_token(ThriftHiveMetastore.java:4814) ~[hive-standalone-metastore-3.1.3.jar:3.1.3]
        at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.get_delegation_token(ThriftHiveMetastore.java:4800) ~[hive-standalone-metastore-3.1.3.jar:3.1.3]
        at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getDelegationToken(HiveMetaStoreClient.java:2351) ~[hive-standalone-metastore-3.1.3.jar:3.1.3]
        at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:211) ~[hive-standalone-metastore-3.1.3.jar:3.1.3]

Affects Version(s)

1.7.0

Kyuubi Server Log Output

No response

Kyuubi Engine Log Output

No response

Kyuubi Server Configurations

No response

Kyuubi Engine Configurations

kyuubi.engine.type=FLINK_SQL
flink.execution.target=yarn-session
#Yarn Session Cluster application id.
flink.yarn.application.id=application_1679188583928_0071

Additional context

flink version: 1.15.3

Are you willing to submit PR?

github-actions[bot] commented 1 year ago

Hello @yuanoOo, Thanks for finding the time to report the issue! We really appreciate the community's efforts to improve Apache Kyuubi.

pan3793 commented 1 year ago

The stack trace indicates that the HMS returned null. Is there any error in the HMS logs?

pan3793 commented 1 year ago

cc @SteNicholas

SteNicholas commented 1 year ago

@yuanoOo, could you take a look at the Hive dependencies, especially the metastore client jar?

yuanoOo commented 1 year ago

> @yuanoOo, could you take a look at the Hive dependencies, especially the metastore client jar?

Can you reproduce the bug? I don't quite understand where you want me to look...

Here are all the files in the ${FLINK_HOME}/lib directory:

-rw-r--r--. 1 hadoop hadoop    194420 Nov 10 23:17 flink-cep-1.15.3.jar
-rw-r--r--. 1 hadoop hadoop    485951 Nov 10 23:21 flink-connector-files-1.15.3.jar
-rw-r--r--. 1 hadoop hadoop     95188 Nov 10 23:29 flink-csv-1.15.3.jar
-rw-r--r--. 1 hadoop hadoop 115829784 Nov 10 23:38 flink-dist-1.15.3.jar
-rw-rw-r--. 1 hadoop hadoop    175491 Feb 20 11:56 flink-json-1.15.3.jar
-rw-r--r--. 1 hadoop hadoop  21041721 Nov 10 23:35 flink-scala_2.12-1.15.3.jar
-rw-r--r--. 1 hadoop hadoop  10737871 Jul 21  2022 flink-shaded-zookeeper-3.5.9.jar
-rw-rw-r--. 1 hadoop hadoop  50741992 Feb 22 08:54 flink-sql-connector-hive-3.1.2_2.12-1.15.3.jar
-rw-rw-r--. 1 hadoop hadoop   5179764 Feb 20 11:56 flink-sql-connector-kafka-1.15.3.jar
-rw-r--r--. 1 hadoop hadoop  15264818 Nov 10 23:35 flink-table-api-java-uber-1.15.3.jar
-rw-r--r--. 1 hadoop hadoop  36263447 Nov 10 23:35 flink-table-planner-loader-1.15.3.jar
-rw-r--r--. 1 hadoop hadoop   2996569 Nov 10 23:17 flink-table-runtime-1.15.3.jar
-rw-r--r--. 1 hadoop hadoop   1657002 Apr 24 09:43 hadoop-mapreduce-client-core-3.2.1.jar
-rw-rw-r--. 1 hadoop hadoop  52377705 Feb 20 17:18 hudi-flink1.15-bundle-0.12.2.jar
-rw-r--r--. 1 hadoop hadoop    208006 Jul 21  2022 log4j-1.2-api-2.17.1.jar
-rw-r--r--. 1 hadoop hadoop    301872 Jul 21  2022 log4j-api-2.17.1.jar
-rw-r--r--. 1 hadoop hadoop   1790452 Jul 21  2022 log4j-core-2.17.1.jar
-rw-r--r--. 1 hadoop hadoop     24279 Jul 21  2022 log4j-slf4j-impl-2.17.1.jar
-rw-rw-r--. 1 hadoop hadoop  26673814 Apr 24 09:32 paimon-flink-1.15-0.4-20230424.001927-40.jar

pan3793 commented 1 year ago

This is caused by https://github.com/apache/kyuubi/pull/3604. Since 1.7.0, Kyuubi propagates the env var HADOOP_PROXY_USER to the Flink process when launching the Flink engine. When Hive detects this env var, it calls getDelegationToken while initializing the HMS client:

    //If HADOOP_PROXY_USER is set in env or property,
    //then need to create metastore client that proxies as that user.
    String HADOOP_PROXY_USER = "HADOOP_PROXY_USER";
    String proxyUser = System.getenv(HADOOP_PROXY_USER);
    if (proxyUser == null) {
      proxyUser = System.getProperty(HADOOP_PROXY_USER);
    }
    //if HADOOP_PROXY_USER is set, create DelegationToken using real user
    if(proxyUser != null) {
      LOG.info(HADOOP_PROXY_USER + " is set. Using delegation "
          + "token for HiveMetaStore connection.");
      try {
        UserGroupInformation.getLoginUser().getRealUser().doAs(
            new PrivilegedExceptionAction<Void>() {
              @Override
              public Void run() throws Exception {
                open();
                return null;
              }
            });
        String delegationTokenPropString = "DelegationTokenForHiveMetaStoreServer";
        String delegationTokenStr = getDelegationToken(proxyUser, proxyUser);
        Utils.setTokenStr(UserGroupInformation.getCurrentUser(), delegationTokenStr,
            delegationTokenPropString);
        this.conf.setVar(ConfVars.METASTORE_TOKEN_SIGNATURE, delegationTokenPropString);
        close();
      } catch (Exception e) {
        LOG.error("Error while setting delegation token for " + proxyUser, e);
        if(e instanceof MetaException) {
          throw (MetaException)e;
        } else {
          throw new MetaException(e.getMessage());
        }
      }
    }
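The Hive snippet above consults the environment variable first and falls back to the system property; any non-null value triggers the delegation-token path, which fails against a non-kerberized HMS. A minimal, self-contained sketch of that precedence (class and method names are mine, not Hive's):

```java
public class ProxyUserLookup {

    // Mirrors the lookup order in the Hive snippet above:
    // the env var is consulted first, then the system property.
    static String resolveProxyUser() {
        String proxyUser = System.getenv("HADOOP_PROXY_USER");
        if (proxyUser == null) {
            proxyUser = System.getProperty("HADOOP_PROXY_USER");
        }
        return proxyUser;
    }

    public static void main(String[] args) {
        // A non-null result here (as happens once Kyuubi propagates the env var)
        // means HiveMetaStoreClient would take the getDelegationToken path.
        System.setProperty("HADOOP_PROXY_USER", "alice");
        System.out.println("resolved proxy user: " + resolveProxyUser());
    }
}
```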

pan3793 commented 1 year ago

@SteNicholas @link3280 @yanghua How should we set the user when submitting a Flink job in a non-kerberized environment?

pan3793 commented 1 year ago

cc @bowenliang123, I think we should fix this regression in 1.7.1.

pan3793 commented 1 year ago

Simply reverting https://github.com/apache/kyuubi/commit/26b78f5c1fb02b9e6bdce96b633bacddb43c229f should restore the previous behavior. @yuanoOo, would you like to give it a try?

But I would like to discuss the right behavior here before reverting it upstream.
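Before reverting locally, it may be worth confirming that the env var actually reaches the engine JVM. A trivial probe run in the same environment could help (a hypothetical diagnostic, not part of Kyuubi or Flink):

```java
public class EnvProbe {
    public static void main(String[] args) {
        // If the env value is non-null on a non-kerberized cluster, the HMS
        // client will attempt the delegation-token call shown earlier.
        System.out.println("HADOOP_PROXY_USER (env)  = " + System.getenv("HADOOP_PROXY_USER"));
        System.out.println("HADOOP_PROXY_USER (prop) = " + System.getProperty("HADOOP_PROXY_USER"));
    }
}
```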

pan3793 commented 1 year ago

https://github.com/apache/kyuubi/commit/26b78f5c1fb02b9e6bdce96b633bacddb43c229f was reverted to unblock the 1.7.1 release.

bowenliang123 commented 1 year ago

Is this issue still a blocker for 1.7.1, now that https://github.com/apache/kyuubi/commit/26b78f5c1fb02b9e6bdce96b633bacddb43c229f has been reverted?

pan3793 commented 1 year ago

@bowenliang123 It's not a blocker now.