prestodb / presto

The official home of the Presto distributed SQL query engine for big data
http://prestodb.io
Apache License 2.0

presto failed to connect hive query table from s3 #18927

Open zhangzhaohuazai opened 1 year ago

zhangzhaohuazai commented 1 year ago

presto:users_cdc_0301s> select count(*) from mysql_cdc_sync_hive_0301s_ro;
Query 20230112_094313_00001_q532x failed: java.lang.ClassNotFoundException: Class org.apache.hadoop.fs.s3a.S3AFileSystem not found

I can query this S3 table normally in Hive.

Full stack information:

Query 20230113_124402_00001_c9r2p failed: java.lang.ClassNotFoundException: Class org.apache.hadoop.fs.s3a.S3AFileSystem not found
java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.hadoop.fs.s3a.S3AFileSystem not found
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2197)
    at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2654)
    at org.apache.hadoop.fs.PrestoFileSystemCache.createFileSystem(PrestoFileSystemCache.java:144)
    at org.apache.hadoop.fs.PrestoFileSystemCache.getInternal(PrestoFileSystemCache.java:89)
    at org.apache.hadoop.fs.PrestoFileSystemCache.get(PrestoFileSystemCache.java:62)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:373)
    at org.apache.hadoop.fs.Path.getFileSystem(Path.java:295)
    at org.apache.hudi.common.fs.FSUtils.getFs(FSUtils.java:107)
    at org.apache.hudi.common.table.HoodieTableMetaClient.getFs(HoodieTableMetaClient.java:294)
    at org.apache.hudi.common.table.HoodieTableMetaClient.<init>(HoodieTableMetaClient.java:127)
    at org.apache.hudi.common.table.HoodieTableMetaClient.newMetaClient(HoodieTableMetaClient.java:641)
    at org.apache.hudi.common.table.HoodieTableMetaClient.access$000(HoodieTableMetaClient.java:80)
    at org.apache.hudi.common.table.HoodieTableMetaClient$Builder.build(HoodieTableMetaClient.java:710)
    at com.facebook.presto.hive.HudiDirectoryLister.<init>(HudiDirectoryLister.java:61)
    at com.facebook.presto.hive.StoragePartitionLoader.<init>(StoragePartitionLoader.java:143)
    at com.facebook.presto.hive.DelegatingPartitionLoader.<init>(DelegatingPartitionLoader.java:54)
    at com.facebook.presto.hive.BackgroundHiveSplitLoader.<init>(BackgroundHiveSplitLoader.java:90)
    at com.facebook.presto.hive.HiveSplitManager.getSplits(HiveSplitManager.java:293)
    at com.facebook.presto.spi.connector.classloader.ClassLoaderSafeConnectorSplitManager.getSplits(ClassLoaderSafeConnectorSplitManager.java:41)
    at com.facebook.presto.split.SplitManager.getSplits(SplitManager.java:89)
    at com.facebook.presto.split.CloseableSplitSourceProvider.getSplits(CloseableSplitSourceProvider.java:52)
    at com.facebook.presto.sql.planner.SplitSourceFactory$Visitor.lambda$visitTableScan$0(SplitSourceFactory.java:157)
    at com.facebook.presto.sql.planner.LazySplitSource.getDelegate(LazySplitSource.java:96)
    at com.facebook.presto.sql.planner.LazySplitSource.getConnectorId(LazySplitSource.java:48)
    at com.facebook.presto.execution.scheduler.SectionExecutionFactory.createStageScheduler(SectionExecutionFactory.java:281)
    at com.facebook.presto.execution.scheduler.SectionExecutionFactory.createStreamingLinkedStageExecutions(SectionExecutionFactory.java:243)
    at com.facebook.presto.execution.scheduler.SectionExecutionFactory.createStreamingLinkedStageExecutions(SectionExecutionFactory.java:221)
    at com.facebook.presto.execution.scheduler.SectionExecutionFactory.createSectionExecutions(SectionExecutionFactory.java:167)
    at com.facebook.presto.execution.scheduler.LegacySqlQueryScheduler.createStageExecutions(LegacySqlQueryScheduler.java:353)
    at com.facebook.presto.execution.scheduler.LegacySqlQueryScheduler.<init>(LegacySqlQueryScheduler.java:242)
    at com.facebook.presto.execution.scheduler.LegacySqlQueryScheduler.createSqlQueryScheduler(LegacySqlQueryScheduler.java:171)
    at com.facebook.presto.execution.SqlQueryExecution.planDistribution(SqlQueryExecution.java:564)
    at com.facebook.presto.execution.SqlQueryExecution.start(SqlQueryExecution.java:411)
    at com.facebook.presto.$gen.Presto_0_278_1_ec67ba1____20230113_124308_1.run(Unknown Source)
    at com.facebook.presto.execution.SqlQueryManager.createQuery(SqlQueryManager.java:286)
    at com.facebook.presto.dispatcher.LocalDispatchQuery.lambda$startExecution$8(LocalDispatchQuery.java:197)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.ClassNotFoundException: Class org.apache.hadoop.fs.s3a.S3AFileSystem not found
    at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:2101)
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2195)
    ... 38 more
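A ClassNotFoundException like the one above means the class loader that handled the call could not find org.apache.hadoop.fs.s3a.S3AFileSystem. One way to narrow this down is to check whether any JAR in the connector's plugin directory actually ships that class. The following is a minimal Python sketch of such a check; the helper name `jars_containing` is hypothetical, and the directory you scan (for example plugin/hive-hadoop2 under the Presto installation) is an assumption about your layout.

```python
import zipfile
from pathlib import Path

# The class named in the ClassNotFoundException, written as a JAR entry path.
S3A_ENTRY = "org/apache/hadoop/fs/s3a/S3AFileSystem.class"


def jars_containing(directory, entry=S3A_ENTRY):
    """Return the sorted file names of JARs under `directory` that contain `entry`.

    `jars_containing` is a hypothetical helper for this thread, not a Presto tool.
    """
    hits = []
    for jar in sorted(Path(directory).rglob("*.jar")):
        try:
            with zipfile.ZipFile(jar) as archive:
                if entry in archive.namelist():
                    hits.append(jar.name)
        except zipfile.BadZipFile:
            # Skip files that carry a .jar suffix but are not valid archives.
            continue
    return hits
```

Calling `jars_containing("/path/to/presto/plugin/hive-hadoop2")` returns an empty list if no JAR in that directory bundles the class. An empty result would only show that the class is absent from that directory; it does not by itself say which component should have supplied it.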

zhangzhaohuazai commented 1 year ago

This is my hive.properties under */presto/etc/catalog: [image]

zhangzhaohuazai commented 1 year ago

Presto version: 0.278.1
Hive version: 3.1.3
Hadoop version: 3.2.4

agrawalreetika commented 1 year ago

This is already reported in https://github.com/prestodb/presto/issues/18911

imjalpreet commented 1 year ago

@zhangzhaohuazai This was a known issue in Presto when accessing Hudi tables in versions 0.276 through 0.278. Part of it is fixed in 0.279. If you are querying Hudi COW or MOR read-optimized tables, try Presto 0.279 and let us know if you still face any issues.

The fix for HUDI MOR realtime tables will be added soon and should most probably be available with the next release.

pratyakshsharma commented 1 year ago

Pinging to stay in the loop!

SamRaza356 commented 4 months ago

How can I solve this error? Is there any solution?

hadoop: 3.3.62
hive: 3.1.3
presto: 0.28.7

hive.properties: cat <> $HIVE_PROPERTIES_FILE

hive.s3.aws-access-key=$AWS_ACCESS_KEY
hive.s3.aws-secret-key=$AWS_SECRET_KEY

core-site.xml: cat <> $CORE_SITE_XML

<property>
  <name>fs.s3a.access.key</name>
  <value>$AWS_ACCESS_KEY</value>
  <description>AWS access key ID. Omit for IAM role-based or provider-based authentication.</description>
</property>

<property>
  <name>fs.s3a.secret.key</name>
  <value>$AWS_SECRET_KEY</value>
  <description>AWS secret key. Omit for IAM role-based or provider-based authentication.</description>
</property>

<property>
  <name>fs.s3a.aws.credentials.provider</name>
  <value>org.apache.hadoop.fs.s3a.auth.IAMInstanceCredentialsProvider</value>
  <description>
    Comma-separated class names of credential provider classes which implement
    com.amazonaws.auth.AWSCredentialsProvider.

    These are loaded and queried in sequence for a valid set of credentials.
    Each listed class must implement one of the following means of
    construction, which are attempted in order:
    1. a public constructor accepting java.net.URI and
       org.apache.hadoop.conf.Configuration,
    2. a public static method named getInstance that accepts no
       arguments and returns an instance of
       com.amazonaws.auth.AWSCredentialsProvider, or
    3. a public default constructor.

    Specifying org.apache.hadoop.fs.s3a.AnonymousAWSCredentialsProvider allows
    anonymous access to a publicly accessible S3 bucket without any credentials.
    Please note that allowing anonymous access to an S3 bucket compromises
    security and therefore is unsuitable for most use cases. It can be useful
    for accessing public data sets without requiring AWS credentials.

    If unspecified, then the default list of credential provider classes,
    queried in sequence, is:
    1. org.apache.hadoop.fs.s3a.SimpleAWSCredentialsProvider:
       Uses the values of fs.s3a.access.key and fs.s3a.secret.key.
    2. com.amazonaws.auth.EnvironmentVariableCredentialsProvider: supports
       configuration of AWS access key ID and secret access key in
       environment variables named AWS_ACCESS_KEY_ID and
       AWS_SECRET_ACCESS_KEY, as documented in the AWS SDK.
    3. com.amazonaws.auth.InstanceProfileCredentialsProvider: supports use
       of instance profile credentials if running in an EC2 VM.
  </description>
</property>

<property>
  <name>fs.s3a.session.token</name>
  <description>
    Session token, when using org.apache.hadoop.fs.s3a.TemporaryAWSCredentialsProvider
    as one of the providers.
  </description>
</property>

[image]
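The description of fs.s3a.aws.credentials.provider says the listed providers are queried in sequence for a valid set of credentials. A minimal Python sketch of that first-match behavior follows. It is an illustration only, not Hadoop's implementation; `static_provider`, `env_provider`, and `resolve` are hypothetical names introduced here.

```python
import os
from typing import Callable, Mapping, Optional, Sequence, Tuple

Credentials = Tuple[str, str]  # (access key, secret key)
Provider = Callable[[], Optional[Credentials]]


def static_provider(conf: Mapping[str, str]) -> Provider:
    """Hypothetical stand-in for a provider that reads keys from configuration."""
    def provide() -> Optional[Credentials]:
        access = conf.get("fs.s3a.access.key")
        secret = conf.get("fs.s3a.secret.key")
        return (access, secret) if access and secret else None
    return provide


def env_provider(env: Mapping[str, str] = os.environ) -> Provider:
    """Hypothetical stand-in for a provider that reads keys from the environment."""
    def provide() -> Optional[Credentials]:
        access = env.get("AWS_ACCESS_KEY_ID")
        secret = env.get("AWS_SECRET_ACCESS_KEY")
        return (access, secret) if access and secret else None
    return provide


def resolve(providers: Sequence[Provider]) -> Credentials:
    """Query providers in order; the first one that yields credentials wins."""
    for provider in providers:
        creds = provider()
        if creds is not None:
            return creds
    raise LookupError("no provider in the chain returned credentials")
```

If the chain works this way, a provider list that names only IAMInstanceCredentialsProvider would never consult fs.s3a.access.key or fs.s3a.secret.key, because no provider that reads them is in the list. That reading is an assumption drawn from the description, but it would be consistent with the NoAwsCredentialsException reported in this comment.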

Stack Trace:
com.facebook.presto.spi.PrestoException: Got exception: java.net.SocketTimeoutException getFileStatus on s3a://gdpr-test-raw-data/testing: org.apache.hadoop.fs.s3a.auth.NoAwsCredentialsException: IAMInstanceCredentialsProvider: Failed to connect to service endpoint:
    at com.facebook.presto.hive.metastore.thrift.ThriftHiveMetastore.createTable(ThriftHiveMetastore.java:1027)
    at com.facebook.presto.hive.metastore.thrift.BridgingHiveMetastore.createTable(BridgingHiveMetastore.java:200)
    at com.facebook.presto.hive.metastore.AbstractCachingHiveMetastore.createTable(AbstractCachingHiveMetastore.java:93)
    at com.facebook.presto.hive.metastore.AbstractCachingHiveMetastore.createTable(AbstractCachingHiveMetastore.java:93)
    at com.facebook.presto.hive.metastore.SemiTransactionalHiveMetastore$CreateTableOperation.run(SemiTransactionalHiveMetastore.java:2778)
    at com.facebook.presto.hive.metastore.SemiTransactionalHiveMetastore$Committer.executeAddTableOperations(SemiTransactionalHiveMetastore.java:1632)
    at com.facebook.presto.hive.metastore.SemiTransactionalHiveMetastore$Committer.access$1000(SemiTransactionalHiveMetastore.java:1288)
    at com.facebook.presto.hive.metastore.SemiTransactionalHiveMetastore.commitShared(SemiTransactionalHiveMetastore.java:1223)
    at com.facebook.presto.hive.metastore.SemiTransactionalHiveMetastore.commit(SemiTransactionalHiveMetastore.java:1100)
    at com.facebook.presto.hive.HiveMetadata.commit(HiveMetadata.java:3679)
    at com.facebook.presto.hive.HiveConnector.commit(HiveConnector.java:234)
    at com.facebook.presto.transaction.InMemoryTransactionManager$TransactionMetadata$ConnectorTransactionMetadata.commit(InMemoryTransactionManager.java:720)
    at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:125)
    at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:57)
    at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:78)
    at com.facebook.airlift.concurrent.BoundedExecutor.drainQueue(BoundedExecutor.java:78)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:750)
Caused by: org.apache.hadoop.hive.metastore.api.MetaException: Got exception: java.net.SocketTimeoutException getFileStatus on s3a://gdpr-test-raw-data/testing: org.apache.hadoop.fs.s3a.auth.NoAwsCredentialsException: IAMInstanceCredentialsProvider: Failed to connect to service endpoint:
    at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$create_table_result$create_table_resultStandardScheme.read(ThriftHiveMetastore.java:52658)
    at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$create_table_result$create_table_resultStandardScheme.read(ThriftHiveMetastore.java:52626)
    at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$create_table_result.read(ThriftHiveMetastore.java:52552)
    at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:86)
    at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_create_table(ThriftHiveMetastore.java:1490)
    at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.create_table(ThriftHiveMetastore.java:1477)
    at com.facebook.presto.hive.metastore.thrift.ThriftHiveMetastoreClient.createTable(ThriftHiveMetastoreClient.java:151)
    at com.facebook.presto.hive.metastore.thrift.ThriftHiveMetastore.lambda$null$64(ThriftHiveMetastore.java:966)
    at com.facebook.presto.hive.metastore.thrift.ThriftHiveMetastore.getMetastoreClientThenCall(ThriftHiveMetastore.java:1203)
    at com.facebook.presto.hive.metastore.thrift.ThriftHiveMetastore.lambda$createTable$65(ThriftHiveMetastore.java:965)
    at com.facebook.presto.hive.metastore.thrift.HiveMetastoreApiStats.lambda$wrap$0(HiveMetastoreApiStats.java:48)
    at com.facebook.presto.hive.RetryDriver.run(RetryDriver.java:139)
    at com.facebook.presto.hive.metastore.thrift.ThriftHiveMetastore.createTable(ThriftHiveMetastore.java:1018)
    ... 18 more
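In the trace above, the "Caused by" section is a MetaException received over Thrift from create_table, which suggests the S3 access failed inside the Hive metastore service rather than in Presto itself. The message "IAMInstanceCredentialsProvider: Failed to connect to service endpoint" points at the instance metadata endpoint being unreachable from that process. A minimal Python sketch of a reachability probe is below; `can_connect` is a hypothetical helper, and running it on the metastore host against 169.254.169.254 port 80 (the EC2 instance metadata address) is an assumption about where the failure occurs.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to (host, port) opens within `timeout` seconds.

    `can_connect` is a hypothetical diagnostic helper, not part of Presto or Hadoop.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable networks, and timeouts.
        return False
```

For example, `can_connect("169.254.169.254", 80)` returning False on a machine outside EC2, or inside a container without access to the metadata service, would be consistent with the exception. In that situation instance-profile credentials cannot be obtained from that host, whatever keys are set elsewhere.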