awslabs / aws-glue-data-catalog-client-for-apache-hive-metastore

The AWS Glue Data Catalog is a fully managed, Apache Hive Metastore-compatible metadata repository. Customers can use the Data Catalog as a central repository to store structural and operational metadata for their data. AWS Glue provides out-of-the-box integration with Amazon EMR that enables customers to use the AWS Glue Data Catalog as an external Hive Metastore. This is an open-source implementation of the Apache Hive Metastore client on Amazon EMR clusters that uses the AWS Glue Data Catalog as an external Hive Metastore. It serves as a reference implementation for building a Hive Metastore-compatible client that connects to the AWS Glue Data Catalog. It may be ported to other Hive Metastore-compatible platforms such as other Hadoop and Apache Spark distributions.
https://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-hive-metastore-glue.html
Apache License 2.0

JDBC getColumns fails with NPE #17

Open FejfarKamil opened 5 years ago

FejfarKamil commented 5 years ago

https://github.com/awslabs/aws-glue-data-catalog-client-for-apache-hive-metastore/blob/d04285fa7952ddc01df888a8e0b55229105dfef2/aws-glue-datacatalog-hive2-client/src/main/java/com/amazonaws/glue/catalog/metastore/AWSCatalogMetastoreClient.java#L1630

I connect to an EMR cluster over JDBC and want to list a table's columns. org.apache.hive.jdbc.HiveDatabaseMetaData.getColumns(...) calls org.apache.hive.service.cli.operation.GetColumnsOperation, which fails because it expects the metastore client to return a non-null list of primary keys (see the line linked above):

List<SQLPrimaryKey> primaryKeys = metastoreClient.getPrimaryKeys(new PrimaryKeysRequest(dbName, table.getTableName()));
Set<String> pkColNames = new HashSet<>();
for(SQLPrimaryKey key : primaryKeys) { // primaryKeys is null, so NPE
    pkColNames.add(key.getColumn_name().toLowerCase());
}

Stacktrace:

org.apache.hive.service.rpc.thrift.TCLIService$Processor$GetColumns.getResult(TCLIService.java:1557)
    at org.apache.hive.service.rpc.thrift.TCLIService$Processor$GetColumns.getResult(TCLIService.java:1542)
    at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
    at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
    at org.apache.hive.service.auth.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:56)
    at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    ... 1 common frames omitted
Caused by: java.lang.NullPointerException: null
    at org.apache.hive.service.cli.operation.GetColumnsOperation.runInternal(GetColumnsOperation.java:173)
    ... 25 common frames omitted
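For illustration, the null check that the failing loop lacks could look roughly like the sketch below. This is not the project's fix and not Hive's code. PrimaryKey is a hypothetical stand-in for org.apache.hadoop.hive.metastore.api.SQLPrimaryKey so that the example compiles without Hive on the classpath, and NullSafePrimaryKeys and pkColumnNames are made-up names. Whether the guard belongs in the Glue client (returning an empty list instead of null) or in the caller is left open here.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class NullSafePrimaryKeys {

    // Hypothetical stand-in for Hive's SQLPrimaryKey, reduced to the
    // single accessor that the loop in the report uses.
    static final class PrimaryKey {
        private final String columnName;

        PrimaryKey(String columnName) {
            this.columnName = columnName;
        }

        String getColumn_name() {
            return columnName;
        }
    }

    // Treats a null key list as "no primary keys" instead of iterating over it.
    static Set<String> pkColumnNames(List<PrimaryKey> primaryKeys) {
        List<PrimaryKey> keys = (primaryKeys == null)
                ? Collections.<PrimaryKey>emptyList()
                : primaryKeys;
        Set<String> pkColNames = new HashSet<>();
        for (PrimaryKey key : keys) {
            pkColNames.add(key.getColumn_name().toLowerCase());
        }
        return pkColNames;
    }

    public static void main(String[] args) {
        // A null result from the metastore client no longer throws.
        System.out.println(pkColumnNames(null));
        // A normal result is lower-cased as in the original loop.
        System.out.println(pkColumnNames(Arrays.asList(new PrimaryKey("ID"))));
    }
}
```

With a guard like this, a table without primary key metadata would simply produce an empty set of key column names, and the rest of the column listing could proceed.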

xujiongda commented 4 years ago

Have the same issue...