Open tooptoop4 opened 2 years ago
Can you try hive.metastore-timeout
?
https://trino.io/docs/current/connector/hive.html#metastore-configuration-properties
@ebyhr after changing hive.metastore-timeout to 120s i get different error below:
io.prestosql.spi.PrestoException: Insert of object "org.apache.hadoop.hive.metastore.model.MPartitionColumnStatistics@3637c229" using statement "INSERT INTO `PART_COL_STATS` (`CS_ID`,`AVG_COL_LEN`,`BIT_VECTOR`,`CAT_NAME`,`COLUMN_NAME`,`COLUMN_TYPE`,`DB_NAME`,`BIG_DECIMAL_HIGH_VALUE`,`BIG_DECIMAL_LOW_VALUE`,`DOUBLE_HIGH_VALUE`,`DOUBLE_LOW_VALUE`,`ENGINE`,`LAST_ANALYZED`,`LONG_HIGH_VALUE`,`LONG_LOW_VALUE`,`MAX_COL_LEN`,`NUM_DISTINCTS`,`NUM_FALSES`,`NUM_NULLS`,`NUM_TRUES`,`PART_ID`,`PARTITION_NAME`,`TABLE_NAME`) VALUES (?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?)" failed : Column 'ENGINE' cannot be null
at io.prestosql.plugin.hive.metastore.thrift.ThriftHiveMetastore.setPartitionColumnStatistics(ThriftHiveMetastore.java:648)
at io.prestosql.plugin.hive.metastore.thrift.ThriftHiveMetastore.setPartitionColumnStatistics(ThriftHiveMetastore.java:625)
at io.prestosql.plugin.hive.metastore.thrift.ThriftHiveMetastore.updatePartitionStatistics(ThriftHiveMetastore.java:605)
at io.prestosql.plugin.hive.metastore.thrift.BridgingHiveMetastore.updatePartitionStatistics(BridgingHiveMetastore.java:128)
at io.prestosql.plugin.hive.metastore.cache.CachingHiveMetastore.updatePartitionStatistics(CachingHiveMetastore.java:393)
at io.prestosql.plugin.hive.HiveMetastoreClosure.updatePartitionStatistics(HiveMetastoreClosure.java:99)
at io.prestosql.plugin.hive.metastore.SemiTransactionalHiveMetastore.lambda$setPartitionStatistics$9(SemiTransactionalHiveMetastore.java:358)
at com.google.common.collect.RegularImmutableMap.forEach(RegularImmutableMap.java:185)
at io.prestosql.plugin.hive.metastore.SemiTransactionalHiveMetastore.lambda$setPartitionStatistics$10(SemiTransactionalHiveMetastore.java:357)
at io.prestosql.plugin.hive.metastore.SemiTransactionalHiveMetastore.commit(SemiTransactionalHiveMetastore.java:969)
at io.prestosql.plugin.hive.HiveMetadata.commit(HiveMetadata.java:2614)
at io.prestosql.plugin.hive.HiveConnector.commit(HiveConnector.java:212)
at io.prestosql.transaction.InMemoryTransactionManager$TransactionMetadata$ConnectorTransactionMetadata.commit(InMemoryTransactionManager.java:594)
at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:125)
at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:69)
at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:78)
at io.airlift.concurrent.BoundedExecutor.drainQueue(BoundedExecutor.java:80)
at io.prestosql.$gen.Presto_20f5dc9____20210714_120418_2.run(Unknown Source)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.base/java.lang.Thread.run(Unknown Source)
Caused by: org.apache.hadoop.hive.metastore.api.MetaException: Insert of object "org.apache.hadoop.hive.metastore.model.MPartitionColumnStatistics@3637c229" using statement "INSERT INTO `PART_COL_STATS` (`CS_ID`,`AVG_COL_LEN`,`BIT_VECTOR`,`CAT_NAME`,`COLUMN_NAME`,`COLUMN_TYPE`,`DB_NAME`,`BIG_DECIMAL_HIGH_VALUE`,`BIG_DECIMAL_LOW_VALUE`,`DOUBLE_HIGH_VALUE`,`DOUBLE_LOW_VALUE`,`ENGINE`,`LAST_ANALYZED`,`LONG_HIGH_VALUE`,`LONG_LOW_VALUE`,`MAX_COL_LEN`,`NUM_DISTINCTS`,`NUM_FALSES`,`NUM_NULLS`,`NUM_TRUES`,`PART_ID`,`PARTITION_NAME`,`TABLE_NAME`) VALUES (?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?)" failed : Column 'ENGINE' cannot be null
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$update_partition_column_statistics_result$update_partition_column_statistics_resultStandardScheme.read(ThriftHiveMetastore.java)
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$update_partition_column_statistics_result$update_partition_column_statistics_resultStandardScheme.read(ThriftHiveMetastore.java)
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$update_partition_column_statistics_result.read(ThriftHiveMetastore.java)
at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:86)
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_update_partition_column_statistics(ThriftHiveMetastore.java:3874)
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.update_partition_column_statistics(ThriftHiveMetastore.java:3861)
at io.prestosql.plugin.hive.metastore.thrift.ThriftHiveMetastoreClient.setPartitionColumnStatistics(ThriftHiveMetastoreClient.java:240)
at io.prestosql.plugin.hive.metastore.thrift.FailureAwareThriftMetastoreClient.lambda$setPartitionColumnStatistics$17(FailureAwareThriftMetastoreClient.java:199)
at io.prestosql.plugin.hive.metastore.thrift.FailureAwareThriftMetastoreClient.runWithHandle(FailureAwareThriftMetastoreClient.java:422)
at io.prestosql.plugin.hive.metastore.thrift.FailureAwareThriftMetastoreClient.setPartitionColumnStatistics(FailureAwareThriftMetastoreClient.java:199)
at io.prestosql.plugin.hive.metastore.thrift.ThriftHiveMetastore.lambda$setPartitionColumnStatistics$28(ThriftHiveMetastore.java:640)
at io.prestosql.plugin.hive.metastore.thrift.ThriftHiveMetastore.setColumnStatistics(ThriftHiveMetastore.java:695)
at io.prestosql.plugin.hive.metastore.thrift.ThriftHiveMetastore.lambda$setPartitionColumnStatistics$29(ThriftHiveMetastore.java:636)
at io.prestosql.plugin.hive.metastore.thrift.ThriftMetastoreApiStats.lambda$wrap$0(ThriftMetastoreApiStats.java:42)
at io.prestosql.plugin.hive.util.RetryDriver.run(RetryDriver.java:130)
at io.prestosql.plugin.hive.metastore.thrift.ThriftHiveMetastore.setPartitionColumnStatistics(ThriftHiveMetastore.java:635)
... 21 more
my hive version is 3.1.3000.7.1.4.35-1 this looks related to https://github.com/trinodb/trino/issues/5155
@tooptoop4 I am also facing this issue with hive 3.1.3000.7.1.7. Were you able to resolve it
failed : Column 'ENGINE' cannot be null
still an issue
@tooptoop4 thanks for posting this led me to the timeout to get more info, then i found to turn off the column statistics
Added into the catalog file for the hive setup hive.metastore-timeout=60s hive.collect-column-statistics-on-write=false
This is now working: create table hivecatalog.demo.k1 as select * from hivecatalog.demo.test4a;
When the statistics are turned on, which is the default, i get this error message:
Insert of object "org.apache.hadoop.hive.metastore.model.MTableColumnStatistics@1592aa30" using statement "INSERT INTO TAB_COL_STATS
(CS_ID
,AVG_COL_LEN
,BIT_VECTOR
,CAT_NAME
,COLUMN_NAME
,COLUMN_TYPE
,DB_NAME
,BIG_DECIMAL_HIGH_VALUE
,BIG_DECIMAL_LOW_VALUE
,DOUBLE_HIGH_VALUE
,DOUBLE_LOW_VALUE
,ENGINE
,LAST_ANALYZED
,LONG_HIGH_VALUE
,LONG_LOW_VALUE
,MAX_COL_LEN
,NUM_DISTINCTS
,NUM_FALSES
,NUM_NULLS
,NUM_TRUES
,TBL_ID
,TABLE_NAME
) VALUES (?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?)" failed : Column 'ENGINE' cannot be null
After the table is created, if i try to manually run the analyze table_name command, then I get the same error as what the create table produces
I'm not sure if this is a bug or working as expected?
@KevinAppelBofa I have found a workaround for this https://github.com/trinodb/trino-hive-apache/pull/43#issuecomment-1190060212
This should actually be possible to resolve now if Hive vendors cherry-pick https://github.com/apache/hive/pull/4984.
Thanks @zhangbutao for following up on Hive side.
I have a catalog connecting against hive/s3/parquet (kerberos protected HMS). SELECT * from any of the tables works. But ANALYZE statement (or CREATE TABLE AS ie a managed table) fails with error like the below. How to increase the timeout value? I wonder if the kerberos details aren't being supplied properly to the 'write' path of the HMS.
my hive catalog properties have:
hive.metastore.authentication.type=KERBEROS hive.metastore.thrift.impersonation.enabled=false hive.metastore.service.principal=hive/_HOST@THEDOMAINHERE hive.metastore.client.principal=myuser@THEDOMAINHERE hive.metastore.client.keytab=/etc/hive.keytab
looks related to https://github.com/trinodb/trino/issues/3624 I am using prestosql 336