apache / hudi

Upserts, Deletes And Incremental Processing on Big Data.
https://hudi.apache.org/
Apache License 2.0

[BUG] Unable to execute HTTP request | connection timeout issues #8191

Open Khushbukela opened 1 year ago

Khushbukela commented 1 year ago

Describe the problem you faced

We are using Hudi in a Spark streaming job. Jobs are failing due to HTTP connection timeouts.

hudi version: 0.12
table type: COW
ingestion mode: INSERT

Thanks for reading the issue. We need help checking whether there are any other solutions we can try, or which behavior is recommended for production.
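
One knob worth checking while the leak is investigated: the size of the S3A HTTP connection pool. Below is a minimal sketch, assuming a Spark job on s3a:// storage; the values are illustrative rather than tuned recommendations, and widening the pool only delays exhaustion if connections are genuinely leaking.

    import org.apache.spark.sql.SparkSession

    // Enlarge the S3A HTTP connection pool so that many concurrent
    // metadata-table readers are less likely to exhaust it.
    val spark = SparkSession.builder()
      .appName("hudi-streaming")
      .config("spark.hadoop.fs.s3a.connection.maximum", "200")
      .config("spark.hadoop.fs.s3a.threads.max", "64")
      .getOrCreate()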

To Reproduce

Steps to reproduce the behavior:

  1. Start a Spark streaming job with COW table type and the metadata table enabled, running multiple streaming queries (50+); a sketch of this shape follows below.
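
A minimal sketch of that reproduction shape (the rate source, bucket paths, and table names are illustrative assumptions, not the actual job):

    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder().appName("hudi-repro").getOrCreate()

    // Launch many concurrent streaming queries, each writing its own Hudi
    // COW table with the metadata table enabled.
    (1 to 50).foreach { i =>
      val df = spark.readStream.format("rate").option("rowsPerSecond", "10").load()
      df.writeStream
        .format("hudi")
        .option("hoodie.table.name", s"tbl_$i")
        .option("hoodie.datasource.write.table.type", "COPY_ON_WRITE")
        .option("hoodie.datasource.write.operation", "insert")
        .option("hoodie.datasource.write.recordkey.field", "value")
        .option("hoodie.datasource.write.precombine.field", "timestamp")
        .option("hoodie.metadata.enable", "true")
        .option("checkpointLocation", s"s3a://bucket/checkpoints/tbl_$i")
        .outputMode("append")
        .start(s"s3a://bucket/tables/tbl_$i")
    }
    spark.streams.awaitAnyTermination()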

Expected behavior

Jobs should not fail with connection-pool timeouts. There seems to be a connection-leak issue in the metadata table with 0.12.2.


Stacktrace


2023-03-14 07:04:14  WARN o.a.s.storag.BlockManager.logWarning (Logging.scala:73) [task 0.3 in stage 1115.0 (TID 2558)]: Putting block rdd_2569_0 failed due to exception org.apache.hudi.exception.HoodieException: Exception when reading log file .
2023-03-14 07:04:14  WARN o.a.s.storag.BlockManager.logWarning (Logging.scala:73) [task 0.3 in stage 1115.0 (TID 2558)]: Block rdd_2569_0 could not be removed as it was not found on disk or in memory
2023-03-14 07:04:14 ERROR o.a.s.execut.Executor.logError (Logging.scala:98) [task 0.3 in stage 1115.0 (TID 2558)]: Exception in task 0.3 in stage 1115.0 (TID 2558)
org.apache.hudi.exception.HoodieException: Exception when reading log file 

Caused by: com.amazonaws.SdkClientException: Unable to execute HTTP request: Timeout waiting for connection from pool
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleRetryableException(AmazonHttpClient.java:1216) ~[aws-java-sdk-bundle-1.12.170.jar:?]
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1162) ~[aws-java-sdk-bundle-1.12.170.jar:?]
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:811) ~[aws-java-sdk-bundle-1.12.170.jar:?]
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:779) ~[aws-java-sdk-bundle-1.12.170.jar:?]
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:753) ~[aws-java-sdk-bundle-1.12.170.jar:?]
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:713) ~[aws-java-sdk-bundle-1.12.170.jar:?]
    at com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:695) ~[aws-java-sdk-bundle-1.12.170.jar:?]
    at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:559) ~[aws-java-sdk-bundle-1.12.170.jar:?]
    at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:539) ~[aws-java-sdk-bundle-1.12.170.jar:?]
    at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:5453) ~[aws-java-sdk-bundle-1.12.170.jar:?]
    at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:5400) ~[aws-java-sdk-bundle-1.12.170.jar:?]
    at com.amazonaws.services.s3.AmazonS3Client.getObjectMetadata(AmazonS3Client.java:1372) ~[aws-java-sdk-bundle-1.12.170.jar:?]
    at org.apache.hadoop.fs.s3a.S3AFileSystem.lambda$getObjectMetadata$4(S3AFileSystem.java:1289) ~[hadoop-aws-3.2.1-amzn-8.jar:?]
    at org.apache.hadoop.fs.s3a.Invoker.retryUntranslated(Invoker.java:322) ~[hadoop-aws-3.2.1-amzn-8.jar:?]
    at org.apache.hadoop.fs.s3a.Invoker.retryUntranslated(Invoker.java:285) ~[hadoop-aws-3.2.1-amzn-8.jar:?]
    at org.apache.hadoop.fs.s3a.S3AFileSystem.getObjectMetadata(S3AFileSystem.java:1286) ~[hadoop-aws-3.2.1-amzn-8.jar:?]
    at org.apache.hadoop.fs.s3a.S3AFileSystem.s3GetFileStatus(S3AFileSystem.java:2223) ~[hadoop-aws-3.2.1-amzn-8.jar:?]
    at org.apache.hadoop.fs.s3a.S3AFileSystem.innerGetFileStatus(S3AFileSystem.java:2203) ~[hadoop-aws-3.2.1-amzn-8.jar:?]
    at org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:2142) ~[hadoop-aws-3.2.1-amzn-8.jar:?]
    at org.apache.hadoop.fs.s3a.S3AFileSystem.open(S3AFileSystem.java:715) ~[hadoop-aws-3.2.1-amzn-8.jar:?]
    at org.apache.hudi.common.fs.HoodieWrapperFileSystem.open(HoodieWrapperFileSystem.java:195) ~[__app__.jar:?]
    at org.apache.hudi.common.table.log.HoodieLogFileReader.getFSDataInputStream(HoodieLogFileReader.java:475) ~[__app__.jar:?]
    at org.apache.hudi.common.table.log.HoodieLogFileReader.<init>(HoodieLogFileReader.java:114) ~[__app__.jar:?]
    at org.apache.hudi.common.table.log.HoodieLogFormatReader.hasNext(HoodieLogFormatReader.java:110) ~[__app__.jar:?]
    at org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader.scanInternal(AbstractHoodieLogRecordReader.java:223) ~[__app__.jar:?]
    ... 29 more
Caused by: com.amazonaws.thirdparty.apache.http.conn.ConnectionPoolTimeoutException: Timeout waiting for connection from pool
    at com.amazonaws.thirdparty.apache.http.impl.conn.PoolingHttpClientConnectionManager.leaseConnection(PoolingHttpClientConnectionManager.java:316) ~[aws-java-sdk-bundle-1.12.170.jar:?]
    at com.amazonaws.thirdparty.apache.http.impl.conn.PoolingHttpClientConnectionManager$1.get(PoolingHttpClientConnectionManager.java:282) ~[aws-java-sdk-bundle-1.12.170.jar:?]
    at sun.reflect.GeneratedMethodAccessor257.invoke(Unknown Source) ~[?:?]
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_362]
    at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_362]
    at com.amazonaws.http.conn.ClientConnectionRequestFactory$Handler.invoke(ClientConnectionRequestFactory.java:70) ~[aws-java-sdk-bundle-1.12.170.jar:?]
    at com.amazonaws.http.conn.$Proxy51.get(Unknown Source) ~[?:?]
    at com.amazonaws.thirdparty.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:190) ~[aws-java-sdk-bundle-1.12.170.jar:?]
    at com.amazonaws.thirdparty.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:186) ~[aws-java-sdk-bundle-1.12.170.jar:?]
    at com.amazonaws.thirdparty.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185) ~[aws-java-sdk-bundle-1.12.170.jar:?]
    at com.amazonaws.thirdparty.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:83) ~[aws-java-sdk-bundle-1.12.170.jar:?]
    at com.amazonaws.thirdparty.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:56) ~[aws-java-sdk-bundle-1.12.170.jar:?]
    at com.amazonaws.http.apache.client.impl.SdkHttpClient.execute(SdkHttpClient.java:72) ~[aws-java-sdk-bundle-1.12.170.jar:?]
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1343) ~[aws-java-sdk-bundle-1.12.170.jar:?]
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1154) ~[aws-java-sdk-bundle-1.12.170.jar:?]
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:811) ~[aws-java-sdk-bundle-1.12.170.jar:?]
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:779) ~[aws-java-sdk-bundle-1.12.170.jar:?]
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:753) ~[aws-java-sdk-bundle-1.12.170.jar:?]
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:713) ~[aws-java-sdk-bundle-1.12.170.jar:?]
    at com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:695) ~[aws-java-sdk-bundle-1.12.170.jar:?]
    at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:559) ~[aws-java-sdk-bundle-1.12.170.jar:?]
    at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:539) ~[aws-java-sdk-bundle-1.12.170.jar:?]
    at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:5453) ~[aws-java-sdk-bundle-1.12.170.jar:?]
    at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:5400) ~[aws-java-sdk-bundle-1.12.170.jar:?]
    at com.amazonaws.services.s3.AmazonS3Client.getObjectMetadata(AmazonS3Client.java:1372) ~[aws-java-sdk-bundle-1.12.170.jar:?]
    at org.apache.hadoop.fs.s3a.S3AFileSystem.lambda$getObjectMetadata$4(S3AFileSystem.java:1289) ~[hadoop-aws-3.2.1-amzn-8.jar:?]
    at org.apache.hadoop.fs.s3a.Invoker.retryUntranslated(Invoker.java:322) ~[hadoop-aws-3.2.1-amzn-8.jar:?]
    at org.apache.hadoop.fs.s3a.Invoker.retryUntranslated(Invoker.java:285) ~[hadoop-aws-3.2.1-amzn-8.jar:?]
    at org.apache.hadoop.fs.s3a.S3AFileSystem.getObjectMetadata(S3AFileSystem.java:1286) ~[hadoop-aws-3.2.1-amzn-8.jar:?]
    at org.apache.hadoop.fs.s3a.S3AFileSystem.s3GetFileStatus(S3AFileSystem.java:2223) ~[hadoop-aws-3.2.1-amzn-8.jar:?]
    at org.apache.hadoop.fs.s3a.S3AFileSystem.innerGetFileStatus(S3AFileSystem.java:2203) ~[hadoop-aws-3.2.1-amzn-8.jar:?]
    at org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:2142) ~[hadoop-aws-3.2.1-amzn-8.jar:?]
    at org.apache.hadoop.fs.s3a.S3AFileSystem.open(S3AFileSystem.java:715) ~[hadoop-aws-3.2.1-amzn-8.jar:?]
    at org.apache.hudi.common.fs.HoodieWrapperFileSystem.open(HoodieWrapperFileSystem.java:195) ~[__app__.jar:?]
    at org.apache.hudi.common.table.log.HoodieLogFileReader.getFSDataInputStream(HoodieLogFileReader.java:475) ~[__app__.jar:?]
    at org.apache.hudi.common.table.log.HoodieLogFileReader.<init>(HoodieLogFileReader.java:114) ~[__app__.jar:?]
    at org.apache.hudi.common.table.log.HoodieLogFormatReader.hasNext(HoodieLogFormatReader.java:110) ~[__app__.jar:?]
    at org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader.scanInternal(AbstractHoodieLogRecordReader.java:223) ~[__app__.jar:?]
    ... 29 more
nsivabalan commented 1 year ago

I see you are having issues with 0.12.0. Did you test with 0.12.2? We did fix some connection leaks in both 0.12.2 and 0.13.0, and we are planning 0.12.3 shortly. So, if you can confirm that 0.12.2 is already in good shape, we should be good; if not, we will have to ensure 0.12.3 has the fix that is already in 0.13.0.

gauravkumar37 commented 1 year ago

We tried 0.12.2 but it didn't help. However, 0.13 is good; we are not seeing such problems with it.

nsivabalan commented 1 year ago

Got it, thanks. We will watch out for the fixes around connection leaks that went into 0.13.0 and pick those for 0.12.3. Thanks for confirming.

Khushbukela commented 1 year ago

Hi @nsivabalan, we have started seeing similar behaviour with 0.13 also. We are running 25-50 streaming queries per job. Let me know if you need any other information that can help debug this issue.

ad1happy2go commented 1 year ago

@Khushbukela Sorry for the delay on this ticket.

Can you share how many files are in the table? Can you also paste the write and table configs you are using?

jlloh commented 1 year ago

Seeing something similar for Flink 1.16 with Hudi 0.13.1, COW insert with metadata enabled. Problem seems to occur ~4 hours after the job has been running. The job is an inline clustering job. After disabling metadata, the job is able to proceed.

Configurations:

    "table.table": "COPY_ON_WRITE"
    "write.operation": "insert"
    "write.insert.cluster": "true"
    "hoodie.datasource.write.hive_style_partitioning": "true"
    "hoodie.datasource.write.hive_style_partitioning": "true"
    "hoodie.parquet.max.file.size": "104857600"
    "hoodie.parquet.small.file.limit": "20971520"
    "clustering.plan.strategy.small.file.limit": "100"
    "metadata.enabled": "true"
    "hoodie.write.concurrency.mode": optimistic_concurrency_control
    "hoodie.cleaner.policy.failed.writes": LAZY
    "hoodie.write.lock.provider": org.apache.hudi.client.transaction.lock.InProcessLockProvider

Files: ~211 parquet files per partition across 4 hourly partitions when the issue started happening and the job failed to continue. The bucket assigner task is the one that hits this error. I have tried both hourly and daily partitions, but both jobs eventually fail and are not able to recover when metadata is enabled.

Full stacktrace:

org.apache.hudi.exception.HoodieMetadataException: Failed to retrieve files in partition s3a://<bucket_name>/folder_name/local_year=2023/local_month=07/local_day=08 from metadata
    at org.apache.hudi.metadata.BaseTableMetadata.getAllFilesInPartition(BaseTableMetadata.java:152)
    at org.apache.hudi.metadata.HoodieMetadataFileSystemView.listPartition(HoodieMetadataFileSystemView.java:69)
    at org.apache.hudi.common.table.view.AbstractTableFileSystemView.lambda$ensurePartitionLoadedCorrectly$16(AbstractTableFileSystemView.java:432)
    at java.util.concurrent.ConcurrentHashMap.computeIfAbsent(ConcurrentHashMap.java:1660)
    at org.apache.hudi.common.table.view.AbstractTableFileSystemView.ensurePartitionLoadedCorrectly(AbstractTableFileSystemView.java:423)
    at org.apache.hudi.common.table.view.AbstractTableFileSystemView.getLatestBaseFilesBeforeOrOn(AbstractTableFileSystemView.java:660)
    at org.apache.hudi.common.table.view.PriorityBasedFileSystemView.execute(PriorityBasedFileSystemView.java:104)
    at org.apache.hudi.common.table.view.PriorityBasedFileSystemView.getLatestBaseFilesBeforeOrOn(PriorityBasedFileSystemView.java:145)
    at org.apache.hudi.sink.partitioner.profile.WriteProfile.smallFilesProfile(WriteProfile.java:208)
    at org.apache.hudi.sink.partitioner.profile.WriteProfile.getSmallFiles(WriteProfile.java:191)
    at org.apache.hudi.sink.partitioner.BucketAssigner.getSmallFileAssign(BucketAssigner.java:179)
    at org.apache.hudi.sink.partitioner.BucketAssigner.addInsert(BucketAssigner.java:137)
    at org.apache.hudi.sink.partitioner.BucketAssignFunction.getNewRecordLocation(BucketAssignFunction.java:215)
    at org.apache.hudi.sink.partitioner.BucketAssignFunction.processRecord(BucketAssignFunction.java:200)
    at org.apache.hudi.sink.partitioner.BucketAssignFunction.processElement(BucketAssignFunction.java:162)
    at org.apache.flink.streaming.api.operators.KeyedProcessOperator.processElement(KeyedProcessOperator.java:83)
    at org.apache.flink.streaming.runtime.tasks.OneInputStreamTask$StreamTaskNetworkOutput.emitRecord(OneInputStreamTask.java:233)
    at org.apache.flink.streaming.runtime.io.AbstractStreamTaskNetworkInput.processElement(AbstractStreamTaskNetworkInput.java:134)
    at org.apache.flink.streaming.runtime.io.AbstractStreamTaskNetworkInput.emitNext(AbstractStreamTaskNetworkInput.java:105)
    at org.apache.flink.streaming.runtime.io.StreamOneInputProcessor.processInput(StreamOneInputProcessor.java:65)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.processInput(StreamTask.java:542)
    at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:231)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:831)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:780)
    at org.apache.flink.runtime.taskmanager.Task.runWithSystemExitMonitoring(Task.java:935)
    at org.apache.flink.runtime.taskmanager.Task.restoreAndInvoke(Task.java:914)
    at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:728)
    at org.apache.flink.runtime.taskmanager.Task.run(Task.java:550)
    at java.lang.Thread.run(Thread.java:750)
Caused by: org.apache.hudi.exception.HoodieException: Exception when reading log file 
    at org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader.scanInternalV1(AbstractHoodieLogRecordReader.java:374)
    at org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader.scanInternal(AbstractHoodieLogRecordReader.java:223)
    at org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner.performScan(HoodieMergedLogRecordScanner.java:198)
    at org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner.<init>(HoodieMergedLogRecordScanner.java:114)
    at org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner.<init>(HoodieMergedLogRecordScanner.java:73)
    at org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner$Builder.build(HoodieMergedLogRecordScanner.java:464)
    at org.apache.hudi.metadata.HoodieMetadataLogRecordReader$Builder.build(HoodieMetadataLogRecordReader.java:218)
    at org.apache.hudi.metadata.HoodieBackedTableMetadata.getLogRecordScanner(HoodieBackedTableMetadata.java:546)
    at org.apache.hudi.metadata.HoodieBackedTableMetadata.openReaders(HoodieBackedTableMetadata.java:447)
    at org.apache.hudi.metadata.HoodieBackedTableMetadata.getOrCreateReaders(HoodieBackedTableMetadata.java:432)
    at org.apache.hudi.metadata.HoodieBackedTableMetadata.lambda$getRecordsByKeys$3(HoodieBackedTableMetadata.java:239)
    at java.util.HashMap.forEach(HashMap.java:1290)
    at org.apache.hudi.metadata.HoodieBackedTableMetadata.getRecordsByKeys(HoodieBackedTableMetadata.java:237)
    at org.apache.hudi.metadata.HoodieBackedTableMetadata.getRecordByKey(HoodieBackedTableMetadata.java:152)
    at org.apache.hudi.metadata.BaseTableMetadata.fetchAllFilesInPartition(BaseTableMetadata.java:339)
    at org.apache.hudi.metadata.BaseTableMetadata.getAllFilesInPartition(BaseTableMetadata.java:150)
    ... 28 more
Caused by: org.apache.hudi.exception.HoodieIOException: unable to initialize read with log file 
    at org.apache.hudi.common.table.log.HoodieLogFormatReader.hasNext(HoodieLogFormatReader.java:113)
    at org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader.scanInternalV1(AbstractHoodieLogRecordReader.java:247)
    ... 43 more
Caused by: java.io.InterruptedIOException: getFileStatus on s3a://<redacted>/.hoodie/metadata/files/.files-0000_00000000000000.log.2_0-1-0: com.amazonaws.SdkClientException: Unable to execute HTTP request: Timeout waiting for connection from pool
    at org.apache.hadoop.fs.s3a.S3AUtils.translateInterruptedException(S3AUtils.java:352)
    at org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:177)
    at org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:151)
    at org.apache.hadoop.fs.s3a.S3AFileSystem.s3GetFileStatus(S3AFileSystem.java:2278)
    at org.apache.hadoop.fs.s3a.S3AFileSystem.innerGetFileStatus(S3AFileSystem.java:2226)
    at org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:2160)
    at org.apache.hadoop.fs.s3a.S3AFileSystem.open(S3AFileSystem.java:727)
    at org.apache.hudi.common.fs.HoodieWrapperFileSystem.open(HoodieWrapperFileSystem.java:203)
    at org.apache.hudi.common.table.log.HoodieLogFileReader.getFSDataInputStream(HoodieLogFileReader.java:498)
    at org.apache.hudi.common.table.log.HoodieLogFileReader.<init>(HoodieLogFileReader.java:118)
    at org.apache.hudi.common.table.log.HoodieLogFormatReader.hasNext(HoodieLogFormatReader.java:110)
    ... 44 more
Caused by: com.amazonaws.SdkClientException: Unable to execute HTTP request: Timeout waiting for connection from pool
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleRetryableException(AmazonHttpClient.java:1216)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1162)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:811)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:779)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:753)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:713)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:695)
    at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:559)
    at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:539)
    at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:5445)
    at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:5392)
    at com.amazonaws.services.s3.AmazonS3Client.getObjectMetadata(AmazonS3Client.java:1368)
    at org.apache.hadoop.fs.s3a.S3AFileSystem.lambda$getObjectMetadata$4(S3AFileSystem.java:1307)
    at org.apache.hadoop.fs.s3a.Invoker.retryUntranslated(Invoker.java:322)
    at org.apache.hadoop.fs.s3a.Invoker.retryUntranslated(Invoker.java:285)
    at org.apache.hadoop.fs.s3a.S3AFileSystem.getObjectMetadata(S3AFileSystem.java:1304)
    at org.apache.hadoop.fs.s3a.S3AFileSystem.s3GetFileStatus(S3AFileSystem.java:2264)
    ... 51 more
Caused by: org.apache.http.conn.ConnectionPoolTimeoutException: Timeout waiting for connection from pool
    at org.apache.http.impl.conn.PoolingHttpClientConnectionManager.leaseConnection(PoolingHttpClientConnectionManager.java:316)
    at org.apache.http.impl.conn.PoolingHttpClientConnectionManager$1.get(PoolingHttpClientConnectionManager.java:282)
    at sun.reflect.GeneratedMethodAccessor70.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at com.amazonaws.http.conn.ClientConnectionRequestFactory$Handler.invoke(ClientConnectionRequestFactory.java:70)
    at com.amazonaws.http.conn.$Proxy56.get(Unknown Source)
    at org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:190)
    at org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:186)
    at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185)
    at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:83)
    at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:56)
    at com.amazonaws.http.apache.client.impl.SdkHttpClient.execute(SdkHttpClient.java:72)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1343)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1154)
    ... 66 more
danny0405 commented 1 year ago

Thanks, but in 0.14.0 we made many improvements to the MDT; let's see whether the issue is solved there.

zbbkeepgoing commented 1 year ago

0.13.1 meets this issue too. COW table, about 7 million rows, clustered with a target parquet size of 1024 MB.

When we run a performance query test, some queries get blocked by com.amazonaws.SdkClientException: Unable to execute HTTP request: Timeout waiting for connection from pool:
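
For context, a clustering setup like the one described maps onto Hudi options roughly as follows; this is a sketch with a hypothetical table name and path, with the 1024 MB target expressed in bytes:

    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder().appName("clustering-sketch").getOrCreate()
    import spark.implicits._

    val df = Seq((1, "a", 1L)).toDF("id", "name", "ts")
    df.write
      .format("hudi")
      .option("hoodie.table.name", "perf_table")
      .option("hoodie.datasource.write.recordkey.field", "id")
      .option("hoodie.datasource.write.precombine.field", "ts")
      .option("hoodie.clustering.inline", "true")
      // Target parquet file size for clustering: 1024 MB, in bytes.
      .option("hoodie.clustering.plan.strategy.target.file.max.bytes", (1024L * 1024 * 1024).toString)
      .mode("append")
      .save("s3a://bucket/perf_table")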

23/08/01 06:28:11 INFO AmazonHttpClient: Unable to execute HTTP request: Timeout waiting for connection from pool
org.apache.http.conn.ConnectionPoolTimeoutException: Timeout waiting for connection from pool
        at org.apache.http.impl.conn.PoolingClientConnectionManager.leaseConnection(PoolingClientConnectionManager.java:226)
        at org.apache.http.impl.conn.PoolingClientConnectionManager$1.getConnection(PoolingClientConnectionManager.java:195)
        at jdk.internal.reflect.GeneratedMethodAccessor151.invoke(Unknown Source)
        at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
        at java.base/java.lang.reflect.Method.invoke(Unknown Source)
        at com.amazonaws.http.conn.ClientConnectionRequestFactory$Handler.invoke(ClientConnectionRequestFactory.java:70)
        at com.amazonaws.http.conn.$Proxy42.getConnection(Unknown Source)
        at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:423)
        at org.apache.http.impl.client.AbstractHttpClient.doExecute(AbstractHttpClient.java:863)
        at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:82)
        at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:57)
        at com.amazonaws.http.AmazonHttpClient.executeOneRequest(AmazonHttpClient.java:728)
        at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:489)
        at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:310)
        at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:3785)
        at com.amazonaws.services.s3.AmazonS3Client.getObjectMetadata(AmazonS3Client.java:1050)
        at com.amazonaws.services.s3.AmazonS3Client.getObjectMetadata(AmazonS3Client.java:1027)
        at org.apache.hadoop.fs.s3a.S3AFileSystem.getObjectMetadata(S3AFileSystem.java:904)
        at org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:1553)
        at org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:117)
        at org.apache.hudi.common.fs.HoodieWrapperFileSystem.lambda$getFileStatus$17(HoodieWrapperFileSystem.java:410)
        at org.apache.hudi.common.fs.HoodieWrapperFileSystem.executeFuncWithTimeMetrics(HoodieWrapperFileSystem.java:114)
        at org.apache.hudi.common.fs.HoodieWrapperFileSystem.getFileStatus(HoodieWrapperFileSystem.java:404)
        at org.apache.hudi.exception.TableNotFoundException.checkTableValidity(TableNotFoundException.java:51)
        at org.apache.hudi.common.table.HoodieTableMetaClient.<init>(HoodieTableMetaClient.java:137)
        at org.apache.hudi.common.table.HoodieTableMetaClient.newMetaClient(HoodieTableMetaClient.java:689)
        at org.apache.hudi.common.table.HoodieTableMetaClient.access$000(HoodieTableMetaClient.java:81)
        at org.apache.hudi.common.table.HoodieTableMetaClient$Builder.build(HoodieTableMetaClient.java:770)
        at org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader.<init>(AbstractHoodieLogRecordReader.java:165)
        at org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner.<init>(HoodieMergedLogRecordScanner.java:101)
        at org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner.<init>(HoodieMergedLogRecordScanner.java:73)
        at org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner$Builder.build(HoodieMergedLogRecordScanner.java:464)
        at org.apache.hudi.metadata.HoodieMetadataLogRecordReader$Builder.build(HoodieMetadataLogRecordReader.java:218)
        at org.apache.hudi.metadata.HoodieBackedTableMetadata.getLogRecordScanner(HoodieBackedTableMetadata.java:546)
        at org.apache.hudi.metadata.HoodieBackedTableMetadata.openReaders(HoodieBackedTableMetadata.java:447)
        at org.apache.hudi.metadata.HoodieBackedTableMetadata.lambda$getRecordsByKeyPrefixes$7539c171$1(HoodieBackedTableMetadata.java:193)
        at org.apache.hudi.common.function.FunctionWrapper.lambda$throwingMapWrapper$0(FunctionWrapper.java:38)
        at org.apache.hudi.common.data.HoodieListData.lambda$flatMap$0(HoodieListData.java:124)
        at java.base/java.util.stream.ReferencePipeline$7$1.accept(Unknown Source)
        at java.base/java.util.ArrayList$ArrayListSpliterator.forEachRemaining(Unknown Source)
        at java.base/java.util.stream.AbstractPipeline.copyInto(Unknown Source)
        at java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(Unknown Source)
        at java.base/java.util.stream.ReduceOps$ReduceTask.doLeaf(Unknown Source)
        at java.base/java.util.stream.ReduceOps$ReduceTask.doLeaf(Unknown Source)
        at java.base/java.util.stream.AbstractTask.compute(Unknown Source)
        at java.base/java.util.concurrent.CountedCompleter.exec(Unknown Source)
        at java.base/java.util.concurrent.ForkJoinTask.doExec(Unknown Source)
        at java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(Unknown Source)
        at java.base/java.util.concurrent.ForkJoinPool.scan(Unknown Source)
        at java.base/java.util.concurrent.ForkJoinPool.runWorker(Unknown Source)
        at java.base/java.util.concurrent.ForkJoinWorkerThread.run(Unknown Source)
23/08/01 06:28:11 INFO HoodieTableMetaClient: Loading HoodieTableMetaClient from s3a://xxxxxx/xxx/xxx/.hoodie/metadata
23/08/01 06:28:11 INFO AmazonHttpClient: Unable to execute HTTP request: Timeout waiting for connection from pool
org.apache.http.conn.ConnectionPoolTimeoutException: Timeout waiting for connection from pool
        at org.apache.http.impl.conn.PoolingClientConnectionManager.leaseConnection(PoolingClientConnectionManager.java:226)
        at org.apache.http.impl.conn.PoolingClientConnectionManager$1.getConnection(PoolingClientConnectionManager.java:195)
        at jdk.internal.reflect.GeneratedMethodAccessor151.invoke(Unknown Source)
        at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
        at java.base/java.lang.reflect.Method.invoke(Unknown Source)
        at com.amazonaws.http.conn.ClientConnectionRequestFactory$Handler.invoke(ClientConnectionRequestFactory.java:70)
        at com.amazonaws.http.conn.$Proxy42.getConnection(Unknown Source)
        at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:423)
        at org.apache.http.impl.client.AbstractHttpClient.doExecute(AbstractHttpClient.java:863)
        at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:82)
        at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:57)
        at com.amazonaws.http.AmazonHttpClient.executeOneRequest(AmazonHttpClient.java:728)
        at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:489)
        at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:310)
        at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:3785)
        at com.amazonaws.services.s3.AmazonS3Client.getObjectMetadata(AmazonS3Client.java:1050)
        at com.amazonaws.services.s3.AmazonS3Client.getObjectMetadata(AmazonS3Client.java:1027)
        at org.apache.hadoop.fs.s3a.S3AFileSystem.getObjectMetadata(S3AFileSystem.java:904)
        at org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:1553)
        at org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:117)
        at org.apache.hudi.common.fs.HoodieWrapperFileSystem.lambda$getFileStatus$17(HoodieWrapperFileSystem.java:410)
        at org.apache.hudi.common.fs.HoodieWrapperFileSystem.executeFuncWithTimeMetrics(HoodieWrapperFileSystem.java:114)
        at org.apache.hudi.common.fs.HoodieWrapperFileSystem.getFileStatus(HoodieWrapperFileSystem.java:404)
        at org.apache.hudi.exception.TableNotFoundException.checkTableValidity(TableNotFoundException.java:51)
        at org.apache.hudi.common.table.HoodieTableMetaClient.<init>(HoodieTableMetaClient.java:137)
        at org.apache.hudi.common.table.HoodieTableMetaClient.newMetaClient(HoodieTableMetaClient.java:689)
        at org.apache.hudi.common.table.HoodieTableMetaClient.access$000(HoodieTableMetaClient.java:81)
        at org.apache.hudi.common.table.HoodieTableMetaClient$Builder.build(HoodieTableMetaClient.java:770)
        at org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader.<init>(AbstractHoodieLogRecordReader.java:165)
        at org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner.<init>(HoodieMergedLogRecordScanner.java:101)
        at org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner.<init>(HoodieMergedLogRecordScanner.java:73)
        at org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner$Builder.build(HoodieMergedLogRecordScanner.java:464)
        at org.apache.hudi.metadata.HoodieMetadataLogRecordReader$Builder.build(HoodieMetadataLogRecordReader.java:218)
        at org.apache.hudi.metadata.HoodieBackedTableMetadata.getLogRecordScanner(HoodieBackedTableMetadata.java:546)
        at org.apache.hudi.metadata.HoodieBackedTableMetadata.openReaders(HoodieBackedTableMetadata.java:447)
        at org.apache.hudi.metadata.HoodieBackedTableMetadata.lambda$getRecordsByKeyPrefixes$7539c171$1(HoodieBackedTableMetadata.java:193)
        at org.apache.hudi.common.function.FunctionWrapper.lambda$throwingMapWrapper$0(FunctionWrapper.java:38)
        at org.apache.hudi.common.data.HoodieListData.lambda$flatMap$0(HoodieListData.java:124)
        at java.base/java.util.stream.ReferencePipeline$7$1.accept(Unknown Source)
        at java.base/java.util.ArrayList$ArrayListSpliterator.forEachRemaining(Unknown Source)
        at java.base/java.util.stream.AbstractPipeline.copyInto(Unknown Source)
        at java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(Unknown Source)
        at java.base/java.util.stream.ReduceOps$ReduceTask.doLeaf(Unknown Source)
        at java.base/java.util.stream.ReduceOps$ReduceTask.doLeaf(Unknown Source)
        at java.base/java.util.stream.AbstractTask.compute(Unknown Source)
        at java.base/java.util.concurrent.CountedCompleter.exec(Unknown Source)
        at java.base/java.util.concurrent.ForkJoinTask.doExec(Unknown Source)
        at java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(Unknown Source)
        at java.base/java.util.concurrent.ForkJoinPool.scan(Unknown Source)
        at java.base/java.util.concurrent.ForkJoinPool.runWorker(Unknown Source)
        at java.base/java.util.concurrent.ForkJoinWorkerThread.run(Unknown Source)
23/08/01 06:28:11 INFO AmazonHttpClient: Unable to execute HTTP request: Timeout waiting for connection from pool
org.apache.http.conn.ConnectionPoolTimeoutException: Timeout waiting for connection from pool
        at org.apache.http.impl.conn.PoolingClientConnectionManager.leaseConnection(PoolingClientConnectionManager.java:226)
        at org.apache.http.impl.conn.PoolingClientConnectionManager$1.getConnection(PoolingClientConnectionManager.java:195)
        at jdk.internal.reflect.GeneratedMethodAccessor151.invoke(Unknown Source)
        at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
        at java.base/java.lang.reflect.Method.invoke(Unknown Source)
        at com.amazonaws.http.conn.ClientConnectionRequestFactory$Handler.invoke(ClientConnectionRequestFactory.java:70)
        at com.amazonaws.http.conn.$Proxy42.getConnection(Unknown Source)
        at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:423)
        at org.apache.http.impl.client.AbstractHttpClient.doExecute(AbstractHttpClient.java:863)
        at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:82)
        at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:57)
        at com.amazonaws.http.AmazonHttpClient.executeOneRequest(AmazonHttpClient.java:728)
        at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:489)
        at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:310)
        at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:3785)
        at com.amazonaws.services.s3.AmazonS3Client.getObjectMetadata(AmazonS3Client.java:1050)
        at com.amazonaws.services.s3.AmazonS3Client.getObjectMetadata(AmazonS3Client.java:1027)
        at org.apache.hadoop.fs.s3a.S3AFileSystem.getObjectMetadata(S3AFileSystem.java:904)
        at org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:1553)
        at org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:117)
        at org.apache.hudi.common.fs.HoodieWrapperFileSystem.lambda$getFileStatus$17(HoodieWrapperFileSystem.java:410)
        at org.apache.hudi.common.fs.HoodieWrapperFileSystem.executeFuncWithTimeMetrics(HoodieWrapperFileSystem.java:114)
        at org.apache.hudi.common.fs.HoodieWrapperFileSystem.getFileStatus(HoodieWrapperFileSystem.java:404)
        at org.apache.hudi.exception.TableNotFoundException.checkTableValidity(TableNotFoundException.java:51)
        at org.apache.hudi.common.table.HoodieTableMetaClient.<init>(HoodieTableMetaClient.java:137)
        at org.apache.hudi.common.table.HoodieTableMetaClient.newMetaClient(HoodieTableMetaClient.java:689)
        at org.apache.hudi.common.table.HoodieTableMetaClient.access$000(HoodieTableMetaClient.java:81)
        at org.apache.hudi.common.table.HoodieTableMetaClient$Builder.build(HoodieTableMetaClient.java:770)
        at org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader.<init>(AbstractHoodieLogRecordReader.java:165)
        at org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner.<init>(HoodieMergedLogRecordScanner.java:101)
        at org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner.<init>(HoodieMergedLogRecordScanner.java:73)
        at org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner$Builder.build(HoodieMergedLogRecordScanner.java:464)
        at org.apache.hudi.metadata.HoodieMetadataLogRecordReader$Builder.build(HoodieMetadataLogRecordReader.java:218)
        at org.apache.hudi.metadata.HoodieBackedTableMetadata.getLogRecordScanner(HoodieBackedTableMetadata.java:546)
        at org.apache.hudi.metadata.HoodieBackedTableMetadata.openReaders(HoodieBackedTableMetadata.java:447)
        at org.apache.hudi.metadata.HoodieBackedTableMetadata.lambda$getRecordsByKeyPrefixes$7539c171$1(HoodieBackedTableMetadata.java:193)
        at org.apache.hudi.common.function.FunctionWrapper.lambda$throwingMapWrapper$0(FunctionWrapper.java:38)
        at org.apache.hudi.common.data.HoodieListData.lambda$flatMap$0(HoodieListData.java:124)
        at java.base/java.util.stream.ReferencePipeline$7$1.accept(Unknown Source)
        at java.base/java.util.ArrayList$ArrayListSpliterator.forEachRemaining(Unknown Source)
        at java.base/java.util.stream.AbstractPipeline.copyInto(Unknown Source)
        at java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(Unknown Source)
        at java.base/java.util.stream.ReduceOps$ReduceTask.doLeaf(Unknown Source)
        at java.base/java.util.stream.ReduceOps$ReduceTask.doLeaf(Unknown Source)
        at java.base/java.util.stream.AbstractTask.compute(Unknown Source)
        at java.base/java.util.concurrent.CountedCompleter.exec(Unknown Source)
        at java.base/java.util.concurrent.ForkJoinTask.doExec(Unknown Source)
        at java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(Unknown Source)
        at java.base/java.util.concurrent.ForkJoinPool.scan(Unknown Source)
        at java.base/java.util.concurrent.ForkJoinPool.runWorker(Unknown Source)
        at java.base/java.util.concurrent.ForkJoinWorkerThread.run(Unknown Source)
23/08/01 06:28:11 INFO AmazonHttpClient: Unable to execute HTTP request: Timeout waiting for connection from pool
org.apache.http.conn.ConnectionPoolTimeoutException: Timeout waiting for connection from pool
        at org.apache.http.impl.conn.PoolingClientConnectionManager.leaseConnection(PoolingClientConnectionManager.java:226)
        at org.apache.http.impl.conn.PoolingClientConnectionManager$1.getConnection(PoolingClientConnectionManager.java:195)
        at jdk.internal.reflect.GeneratedMethodAccessor151.invoke(Unknown Source)
        at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
        at java.base/java.lang.reflect.Method.invoke(Unknown Source)
        at com.amazonaws.http.conn.ClientConnectionRequestFactory$Handler.invoke(ClientConnectionRequestFactory.java:70)
        at com.amazonaws.http.conn.$Proxy42.getConnection(Unknown Source)
        at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:423)
        at org.apache.http.impl.client.AbstractHttpClient.doExecute(AbstractHttpClient.java:863)
        at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:82)
        at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:57)
        at com.amazonaws.http.AmazonHttpClient.executeOneRequest(AmazonHttpClient.java:728)
        at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:489)
        at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:310)
        at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:3785)
        at com.amazonaws.services.s3.AmazonS3Client.getObjectMetadata(AmazonS3Client.java:1050)
        at com.amazonaws.services.s3.AmazonS3Client.getObjectMetadata(AmazonS3Client.java:1027)
        at org.apache.hadoop.fs.s3a.S3AFileSystem.getObjectMetadata(S3AFileSystem.java:904)
        at org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:1553)
        at org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:117)
        at org.apache.hudi.common.fs.HoodieWrapperFileSystem.lambda$getFileStatus$17(HoodieWrapperFileSystem.java:410)
        at org.apache.hudi.common.fs.HoodieWrapperFileSystem.executeFuncWithTimeMetrics(HoodieWrapperFileSystem.java:114)
        at org.apache.hudi.common.fs.HoodieWrapperFileSystem.getFileStatus(HoodieWrapperFileSystem.java:404)
        at org.apache.hudi.exception.TableNotFoundException.checkTableValidity(TableNotFoundException.java:51)
        at org.apache.hudi.common.table.HoodieTableMetaClient.<init>(HoodieTableMetaClient.java:137)
        at org.apache.hudi.common.table.HoodieTableMetaClient.newMetaClient(HoodieTableMetaClient.java:689)
        at org.apache.hudi.common.table.HoodieTableMetaClient.access$000(HoodieTableMetaClient.java:81)
        at org.apache.hudi.common.table.HoodieTableMetaClient$Builder.build(HoodieTableMetaClient.java:770)
        at org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader.<init>(AbstractHoodieLogRecordReader.java:165)
        at org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner.<init>(HoodieMergedLogRecordScanner.java:101)
        at org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner.<init>(HoodieMergedLogRecordScanner.java:73)
        at org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner$Builder.build(HoodieMergedLogRecordScanner.java:464)
        at org.apache.hudi.metadata.HoodieMetadataLogRecordReader$Builder.build(HoodieMetadataLogRecordReader.java:218)
        at org.apache.hudi.metadata.HoodieBackedTableMetadata.getLogRecordScanner(HoodieBackedTableMetadata.java:546)
        at org.apache.hudi.metadata.HoodieBackedTableMetadata.openReaders(HoodieBackedTableMetadata.java:447)
        at org.apache.hudi.metadata.HoodieBackedTableMetadata.lambda$getRecordsByKeyPrefixes$7539c171$1(HoodieBackedTableMetadata.java:193)
        at org.apache.hudi.common.function.FunctionWrapper.lambda$throwingMapWrapper$0(FunctionWrapper.java:38)
        at org.apache.hudi.common.data.HoodieListData.lambda$flatMap$0(HoodieListData.java:124)
        at java.base/java.util.stream.ReferencePipeline$7$1.accept(Unknown Source)
        at java.base/java.util.ArrayList$ArrayListSpliterator.forEachRemaining(Unknown Source)
        at java.base/java.util.stream.AbstractPipeline.copyInto(Unknown Source)
        at java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(Unknown Source)
        at java.base/java.util.stream.ReduceOps$ReduceTask.doLeaf(Unknown Source)
        at java.base/java.util.stream.ReduceOps$ReduceTask.doLeaf(Unknown Source)
        at java.base/java.util.stream.AbstractTask.compute(Unknown Source)
        at java.base/java.util.concurrent.CountedCompleter.exec(Unknown Source)
        at java.base/java.util.concurrent.ForkJoinTask.doExec(Unknown Source)
        at java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(Unknown Source)
        at java.base/java.util.concurrent.ForkJoinPool.scan(Unknown Source)
        at java.base/java.util.concurrent.ForkJoinPool.runWorker(Unknown Source)
        at java.base/java.util.concurrent.ForkJoinWorkerThread.run(Unknown Source)
danny0405 commented 1 year ago

Did you use Spark or Flink?

zbbkeepgoing commented 1 year ago

> Did you use Spark or Flink?

Spark 3.3

kandyrise commented 1 year ago

While reading one of the Hudi tables, we started observing the following warning and error: "... invalid or extra rollback command block in s3://..."

After the warning, the repeated error appears as "SdkClientException: Unable to execute HTTP request: Timeout waiting for connection from pool"

The full log is attached.

The error arises every time the job runs. We are looking for the root cause and a possible solution; a read-side workaround sketch follows below.
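
A read-side stopgap that may be worth trying (a sketch, with a hypothetical table path): disable metadata-table-based file listing so reads fall back to direct S3 listing, which sidesteps the leaking metadata readers rather than fixing them.

    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder().appName("hudi-read").getOrCreate()

    // Skip the metadata table on the read path; file listing goes directly to S3.
    val df = spark.read
      .format("hudi")
      .option("hoodie.metadata.enable", "false")
      .load("s3://bucket/path/to/table")

On EMRFS specifically, the pool being exhausted here is sized by fs.s3.maxConnections, which can be raised as a separate stopgap.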


Environment Description

Hudi version: 0.12.3
Spark version: 3.3.1
Hadoop version: 3.3.3
Storage (HDFS/S3/GCS..): S3
Running on Docker? (yes/no): no
EMR: 6.10.0


Default YARN executor launch context:
  env:
    CLASSPATH -> /usr/lib/hadoop-lzo/lib/*:/usr/lib/hadoop/hadoop-aws.jar:/usr/share/aws/aws-java-sdk/*:/usr/share/aws/emr/goodies/lib/emr-spark-goodies.jar:/usr/share/aws/emr/security/conf:/usr/share/aws/emr/security/lib/*:/usr/share/aws/redshift/jdbc/RedshiftJDBC.jar:/usr/share/aws/redshift/spark-redshift/lib/*:/usr/share/aws/hmclient/lib/aws-glue-datacatalog-spark-client.jar:/usr/share/java/Hive-JSON-Serde/hive-openx-serde.jar:/usr/share/aws/sagemaker-spark-sdk/lib/sagemaker-spark-sdk.jar:/usr/share/aws/emr/s3select/lib/emr-s3-select-spark-connector.jar:/docker/usr/lib/hadoop-lzo/lib/*:/docker/usr/lib/hadoop/hadoop-aws.jar:/docker/usr/share/aws/aws-java-sdk/*:/docker/usr/share/aws/emr/goodies/lib/emr-spark-goodies.jar:/docker/usr/share/aws/emr/security/conf:/docker/usr/share/aws/emr/security/lib/*:/docker/usr/share/aws/redshift/jdbc/RedshiftJDBC.jar:/docker/usr/share/aws/redshift/spark-redshift/lib/*:/docker/usr/share/aws/hmclient/lib/aws-glue-datacatalog-spark-client.jar:/docker/usr/share/java/Hive-JSON-Serde/hive-openx-serde.jar:/docker/usr/share/aws/sagemaker-spark-sdk/lib/sagemaker-spark-sdk.jar:/docker/usr/share/aws/emr/s3select/lib/emr-s3-select-spark-connector.jar<CPS>{{PWD}}<CPS>{{PWD}}/__spark_conf__<CPS>{{PWD}}/__spark_libs__/*<CPS>{{PWD}}/__spark_conf__/__hadoop_conf__
    SPARK_YARN_CONTAINER_CORES -> 4
    SPARK_YARN_STAGING_DIR -> hdfs://ip-10-160-41-177.cl.local:8020/user/hadoop/.sparkStaging/application_1694451048175_0001
    SPARK_USER -> hadoop
    SPARK_PUBLIC_DNS -> ip-10-160-43-206.cl.local

  command:
    LD_LIBRARY_PATH=\"/usr/lib/hadoop/lib/native:/usr/lib/hadoop-lzo/lib/native:/docker/usr/lib/hadoop/lib/native:/docker/usr/lib/hadoop-lzo/lib/native:$LD_LIBRARY_PATH\" \
      {{JAVA_HOME}}/bin/java \
      -server \
      -Xmx51200m \
      '-verbose:gc' \
      '-XX:+PrintGCDetails' \
      '-XX:+PrintGCDateStamps' \
      '-XX:OnOutOfMemoryError=kill -9 %p' \
      '-XX:+IgnoreUnrecognizedVMOptions' \
      '--add-opens=java.base/java.lang=ALL-UNNAMED' \
      '--add-opens=java.base/java.lang.invoke=ALL-UNNAMED' \
      '--add-opens=java.base/java.lang.reflect=ALL-UNNAMED' \
      '--add-opens=java.base/java.io=ALL-UNNAMED' \
      '--add-opens=java.base/java.net=ALL-UNNAMED' \
      '--add-opens=java.base/java.nio=ALL-UNNAMED' \
      '--add-opens=java.base/java.util=ALL-UNNAMED' \
      '--add-opens=java.base/java.util.concurrent=ALL-UNNAMED' \
      '--add-opens=java.base/java.util.concurrent.atomic=ALL-UNNAMED' \
      '--add-opens=java.base/sun.nio.ch=ALL-UNNAMED' \
      '--add-opens=java.base/sun.nio.cs=ALL-UNNAMED' \
      '--add-opens=java.base/sun.security.action=ALL-UNNAMED' \
      '--add-opens=java.base/sun.util.calendar=ALL-UNNAMED' \
      '--add-opens=java.security.jgss/sun.security.krb5=ALL-UNNAMED' \
      '-verbose:gc' \
      '-XX:+PrintGCDetails' \
      '-XX:+PrintGCDateStamps' \
      '-XX:OnOutOfMemoryError=kill -9 %p' \
      '-XX:+UseParallelGC' \
      '-XX:InitiatingHeapOccupancyPercent=70' \
      -Djava.io.tmpdir={{PWD}}/tmp \
      '-Dspark.driver.port=37441' \
      '-Dspark.history.ui.port=18080' \
      '-Dspark.ui.port=0' \
      -Dspark.yarn.app.container.log.dir=<LOG_DIR> \
      "$(jar='/usr/share/log4j-cve-2021-44228-hotpatch/jdk17/Log4jHotPatchFat.jar'; [ -f "$jar" ] && echo "-javaagent:$jar=log4jFixerVerbose=false" || echo "" )" \
      org.apache.spark.executor.YarnCoarseGrainedExecutorBackend \
      --driver-url spark://CoarseGrainedScheduler@ip-10-160-43-206.cl.local:37441 \
      --executor-id <executorId> \
      --hostname <hostname> \
      --cores 4 \
      --app-id application_1694451048175_0001 \
      --resourceProfileId 0 \
      1><LOG_DIR>/stdout \
      2><LOG_DIR>/stderr

  resources:
    hudi-defaults.conf -> resource { scheme: "hdfs" host: "ip-10-160-41-177.cl.local" port: 8020 file: "/user/hadoop/.sparkStaging/application_1694451048175_0001/hudi-defaults.conf" } size: 1465 timestamp: 1694451148605 type: FILE visibility: PRIVATE
    __app__.jar -> resource { scheme: "hdfs" host: "ip-10-160-41-177.cl.local" port: 8020 file: "/user/hadoop/.sparkStaging/application_1694451048175_0001/foo-bar-cat-2.0.jar" } size: 146938379 timestamp: 1694451148273 type: FILE visibility: PRIVATE
    __spark_conf__ -> resource { scheme: "hdfs" host: "ip-10-160-41-177.cl.local" port: 8020 file: "/user/hadoop/.sparkStaging/application_1694451048175_0001/__spark_conf__.zip" } size: 322494 timestamp: 1694451149096 type: ARCHIVE visibility: PRIVATE
    __spark_libs__ -> resource { scheme: "hdfs" host: "ip-10-160-41-177.cl.local" port: 8020 file: "/user/hadoop/.sparkStaging/application_1694451048175_0001/__spark_libs__442411574190127427.zip" } size: 332907488 timestamp: 1694451144663 type: ARCHIVE visibility: PRIVATE
    hive-site.xml -> resource { scheme: "hdfs" host: "ip-10-160-41-177.cl.local" port: 8020 file: "/user/hadoop/.sparkStaging/application_1694451048175_0001/hive-site.xml" } size: 2311 timestamp: 1694451148451 type: FILE visibility: PRIVATE

===============================================================================

...
Use default timeout configuration to retry for read timeout
com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.SdkClientException: Unable to execute HTTP request: Timeout waiting for connection from pool
23/09/11 17:21:49 ERROR AbstractHoodieLogRecordReader: Got exception when reading log file
com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.SdkClientException: Unable to execute HTTP request: Timeout waiting for connection from pool
    at com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleRetryableException(AmazonHttpClient.java:1219) ~[emrfs-hadoop-assembly-2.55.0.jar:?]
    at com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1165) ~[emrfs-hadoop-assembly-2.55.0.jar:?]
    at com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:814) ~[emrfs-hadoop-assembly-2.55.0.jar:?]
    at com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:781) ~[emrfs-hadoop-assembly-2.55.0.jar:?]
    at com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:755) ~[emrfs-hadoop-assembly-2.55.0.jar:?]
    at com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:715) ~[emrfs-hadoop-assembly-2.55.0.jar:?]
    at com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:697) ~[emrfs-hadoop-assembly-2.55.0.jar:?]
    at com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:561) ~[emrfs-hadoop-assembly-2.55.0.jar:?]
    at com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:541) ~[emrfs-hadoop-assembly-2.55.0.jar:?]
    at com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:5456) ~[emrfs-hadoop-assembly-2.55.0.jar:?]
    at com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:5403) ~[emrfs-hadoop-assembly-2.55.0.jar:?]
    at com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.services.s3.AmazonS3Client.getObjectMetadata(AmazonS3Client.java:1372) ~[emrfs-hadoop-assembly-2.55.0.jar:?]
    at com.amazon.ws.emr.hadoop.fs.s3.lite.call.GetObjectMetadataCall.perform(GetObjectMetadataCall.java:26) ~[emrfs-hadoop-assembly-2.55.0.jar:?]
    at com.amazon.ws.emr.hadoop.fs.s3.lite.call.GetObjectMetadataCall.perform(GetObjectMetadataCall.java:12) ~[emrfs-hadoop-assembly-2.55.0.jar:?]
    at com.amazon.ws.emr.hadoop.fs.s3.lite.executor.GlobalS3Executor$CallPerformer.call(GlobalS3Executor.java:111) ~[emrfs-hadoop-assembly-2.55.0.jar:?]
    at com.amazon.ws.emr.hadoop.fs.s3.lite.executor.GlobalS3Executor.execute(GlobalS3Executor.java:138) ~[emrfs-hadoop-assembly-2.55.0.jar:?]
    at com.amazon.ws.emr.hadoop.fs.s3.lite.AmazonS3LiteClient.invoke(AmazonS3LiteClient.java:191) ~[emrfs-hadoop-assembly-2.55.0.jar:?]
    at com.amazon.ws.emr.hadoop.fs.s3.lite.AmazonS3LiteClient.invoke(AmazonS3LiteClient.java:186) ~[emrfs-hadoop-assembly-2.55.0.jar:?]
    at com.amazon.ws.emr.hadoop.fs.s3.lite.AmazonS3LiteClient.getObjectMetadata(AmazonS3LiteClient.java:96) ~[emrfs-hadoop-assembly-2.55.0.jar:?]
    at com.amazon.ws.emr.hadoop.fs.s3.lite.AbstractAmazonS3Lite.getObjectMetadata(AbstractAmazonS3Lite.java:43) ~[emrfs-hadoop-assembly-2.55.0.jar:?]
    at com.amazon.ws.emr.hadoop.fs.s3n.Jets3tNativeFileSystemStore.getFileMetadataFromCacheOrS3(Jets3tNativeFileSystemStore.java:636) ~[emrfs-hadoop-assembly-2.55.0.jar:?]
    at com.amazon.ws.emr.hadoop.fs.s3n.Jets3tNativeFileSystemStore.retrieveMetadata(Jets3tNativeFileSystemStore.java:320) ~[emrfs-hadoop-assembly-2.55.0.jar:?]
    at com.amazon.ws.emr.hadoop.fs.s3n.S3NativeFileSystem.getFileStatus(S3NativeFileSystem.java:517) ~[emrfs-hadoop-assembly-2.55.0.jar:?]
    at com.amazon.ws.emr.hadoop.fs.s3n.S3NativeFileSystem.open(S3NativeFileSystem.java:940) ~[emrfs-hadoop-assembly-2.55.0.jar:?]
    at com.amazon.ws.emr.hadoop.fs.s3n.S3NativeFileSystem.open(S3NativeFileSystem.java:932) ~[emrfs-hadoop-assembly-2.55.0.jar:?]
    at com.amazon.ws.emr.hadoop.fs.EmrFileSystem.open(EmrFileSystem.java:192) ~[emrfs-hadoop-assembly-2.55.0.jar:?]
    at org.apache.hudi.common.fs.HoodieWrapperFileSystem.open(HoodieWrapperFileSystem.java:195) ~[__app__.jar:?]
    at org.apache.hudi.common.table.log.HoodieLogFileReader.getFSDataInputStream(HoodieLogFileReader.java:475) ~[__app__.jar:?]
    at org.apache.hudi.common.table.log.HoodieLogFileReader.<init>(HoodieLogFileReader.java:114) ~[__app__.jar:?]
    at org.apache.hudi.common.table.log.HoodieLogFormatReader.hasNext(HoodieLogFormatReader.java:110) ~[__app__.jar:?]
    at org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader.scanInternal(AbstractHoodieLogRecordReader.java:223) ~[__app__.jar:?]
    at org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader.scan(AbstractHoodieLogRecordReader.java:192) ~[__app__.jar:?]
    at org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner.performScan(HoodieMergedLogRecordScanner.java:109) ~[__app__.jar:?]
    at org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner.<init>(HoodieMergedLogRecordScanner.java:102) ~[__app__.jar:?]
    at org.apache.hudi.metadata.HoodieMetadataMergedLogRecordReader.<init>(HoodieMetadataMergedLogRecordReader.java:63) ~[__app__.jar:?]
    at org.apache.hudi.metadata.HoodieMetadataMergedLogRecordReader.<init>(HoodieMetadataMergedLogRecordReader.java:51) ~[__app__.jar:?]
    at org.apache.hudi.metadata.HoodieMetadataMergedLogRecordReader$Builder.build(HoodieMetadataMergedLogRecordReader.java:230) ~[__app__.jar:?]
    at org.apache.hudi.metadata.HoodieBackedTableMetadata.getLogRecordScanner(HoodieBackedTableMetadata.java:516) ~[__app__.jar:?]
    at org.apache.hudi.metadata.HoodieBackedTableMetadata.openReaders(HoodieBackedTableMetadata.java:429) ~[__app__.jar:?]
    at org.apache.hudi.metadata.HoodieBackedTableMetadata.getOrCreateReaders(HoodieBackedTableMetadata.java:414) ~[__app__.jar:?]
    at org.apache.hudi.metadata.HoodieBackedTableMetadata.lambda$getRecordsByKeys$1(HoodieBackedTableMetadata.java:220) ~[__app__.jar:?]
    at java.util.HashMap.forEach(HashMap.java:1290) ~[?:1.8.0_382]
    at org.apache.hudi.metadata.HoodieBackedTableMetadata.getRecordsByKeys(HoodieBackedTableMetadata.java:218) ~[__app__.jar:?]
    at org.apache.hudi.metadata.HoodieBackedTableMetadata.getRecordByKey(HoodieBackedTableMetadata.java:149) ~[__app__.jar:?]
    at org.apache.hudi.metadata.BaseTableMetadata.fetchAllPartitionPaths(BaseTableMetadata.java:291) ~[__app__.jar:?]
    at org.apache.hudi.metadata.BaseTableMetadata.getAllPartitionPaths(BaseTableMetadata.java:117) ~[__app__.jar:?]
    at org.apache.hudi.common.fs.FSUtils.getAllPartitionPaths(FSUtils.java:315) ~[__app__.jar:?]
    at org.apache.hudi.BaseHoodieTableFileIndex.getAllQueryPartitionPaths(BaseHoodieTableFileIndex.java:194) ~[__app__.jar:?]
    at org.apache.hudi.BaseHoodieTableFileIndex.loadPartitionPathFiles(BaseHoodieTableFileIndex.java:237) ~[__app__.jar:?]
    at org.apache.hudi.BaseHoodieTableFileIndex.doRefresh(BaseHoodieTableFileIndex.java:282) ~[__app__.jar:?]
    at org.apache.hudi.BaseHoodieTableFileIndex.<init>(BaseHoodieTableFileIndex.java:147) ~[__app__.jar:?]
    at org.apache.hudi.SparkHoodieTableFileIndex.<init>(SparkHoodieTableFileIndex.scala:73) ~[__app__.jar:?]
    at org.apache.hudi.HoodieFileIndex.<init>(HoodieFileIndex.scala:81) ~[__app__.jar:?]
    at org.apache.hudi.HoodieBaseRelation.fileIndex$lzycompute(HoodieBaseRelation.scala:242) ~[__app__.jar:?]
    at org.apache.hudi.HoodieBaseRelation.fileIndex(HoodieBaseRelation.scala:240) ~[__app__.jar:?]
    at org.apache.hudi.BaseFileOnlyRelation.toHadoopFsRelation(BaseFileOnlyRelation.scala:153) ~[__app__.jar:?]
    at org.apache.hudi.DefaultSource$.resolveBaseFileOnlyRelation(DefaultSource.scala:268) ~[__app__.jar:?]
    at org.apache.hudi.DefaultSource$.createRelation(DefaultSource.scala:232) ~[__app__.jar:?]
    at org.apache.hudi.DefaultSource.createRelation(DefaultSource.scala:111) ~[__app__.jar:?]
    at org.apache.hudi.DefaultSource.createRelation(DefaultSource.scala:68) ~[__app__.jar:?]
    at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:350) ~[spark-sql_2.12-3.3.1-amzn-0.jar:3.3.1-amzn-0]
    at org.apache.spark.sql.DataFrameReader.loadV1Source(DataFrameReader.scala:228) ~[spark-sql_2.12-3.3.1-amzn-0.jar:3.3.1-amzn-0]
    at org.apache.spark.sql.DataFrameReader.$anonfun$load$2(DataFrameReader.scala:210) ~[spark-sql_2.12-3.3.1-amzn-0.jar:3.3.1-amzn-0]
    at scala.Option.getOrElse(Option.scala:189) ~[scala-library-2.12.15.jar:?]
    at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:210) ~[spark-sql_2.12-3.3.1-amzn-0.jar:3.3.1-amzn-0]
    at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:185) ~[spark-sql_2.12-3.3.1-amzn-0.jar:3.3.1-amzn-0]
    ....
Caused by: com.amazon.ws.emr.hadoop.fs.shaded.org.apache.http.conn.ConnectionPoolTimeoutException: Timeout waiting for connection from pool
    at com.amazon.ws.emr.hadoop.fs.shaded.org.apache.http.impl.conn.PoolingHttpClientConnectionManager.leaseConnection(PoolingHttpClientConnectionManager.java:314) ~[emrfs-hadoop-assembly-2.55.0.jar:?]
    at com.amazon.ws.emr.hadoop.fs.shaded.org.apache.http.impl.conn.PoolingHttpClientConnectionManager$1.get(PoolingHttpClientConnectionManager.java:280) ~[emrfs-hadoop-assembly-2.55.0.jar:?]
    at sun.reflect.GeneratedMethodAccessor32.invoke(Unknown Source) ~[?:?]
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_382]
    at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_382]
    at com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.http.conn.ClientConnectionRequestFactory$Handler.invoke(ClientConnectionRequestFactory.java:70) ~[emrfs-hadoop-assembly-2.55.0.jar:?]
    at com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.http.conn.$Proxy67.get(Unknown Source) ~[?:?]
    at com.amazon.ws.emr.hadoop.fs.shaded.org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:190) ~[emrfs-hadoop-assembly-2.55.0.jar:?]
    at com.amazon.ws.emr.hadoop.fs.shaded.org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:186) ~[emrfs-hadoop-assembly-2.55.0.jar:?]
    at com.amazon.ws.emr.hadoop.fs.shaded.org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185) ~[emrfs-hadoop-assembly-2.55.0.jar:?]
    at com.amazon.ws.emr.hadoop.fs.shaded.org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:83) ~[emrfs-hadoop-assembly-2.55.0.jar:?]
    at com.amazon.ws.emr.hadoop.fs.shaded.org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:56) ~[emrfs-hadoop-assembly-2.55.0.jar:?]
    at com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.http.apache.client.impl.SdkHttpClient.execute(SdkHttpClient.java:72) ~[emrfs-hadoop-assembly-2.55.0.jar:?]
    at com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1346) ~[emrfs-hadoop-assembly-2.55.0.jar:?]
    at com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1157) ~[emrfs-hadoop-assembly-2.55.0.jar:?]
    ... 87 more
23/09/11 17:21:49 ERROR ApplicationMaster: User class threw exception: org.apache.hudi.exception.HoodieException: Error fetching partition paths from metadata table
org.apache.hudi.exception.HoodieException: Error fetching partition paths from metadata table
    at org.apache.hudi.common.fs.FSUtils.getAllPartitionPaths(FSUtils.java:317) ~[__app__.jar:?]
    at org.apache.hudi.BaseHoodieTableFileIndex.getAllQueryPartitionPaths(BaseHoodieTableFileIndex.java:194) ~[__app__.jar:?]
    at org.apache.hudi.BaseHoodieTableFileIndex.loadPartitionPathFiles(BaseHoodieTableFileIndex.java:237) ~[__app__.jar:?]
    at org.apache.hudi.BaseHoodieTableFileIndex.doRefresh(BaseHoodieTableFileIndex.java:282) ~[__app__.jar:?]
    at org.apache.hudi.BaseHoodieTableFileIndex.<init>(BaseHoodieTableFileIndex.java:147) ~[__app__.jar:?]
    at org.apache.hudi.SparkHoodieTableFileIndex.<init>(SparkHoodieTableFileIndex.scala:73) ~[__app__.jar:?]
    at org.apache.hudi.HoodieFileIndex.<init>(HoodieFileIndex.scala:81) ~[__app__.jar:?]
    at org.apache.hudi.HoodieBaseRelation.fileIndex$lzycompute(HoodieBaseRelation.scala:242) ~[__app__.jar:?]
    at org.apache.hudi.HoodieBaseRelation.fileIndex(HoodieBaseRelation.scala:240) ~[__app__.jar:?]
    at org.apache.hudi.BaseFileOnlyRelation.toHadoopFsRelation(BaseFileOnlyRelation.scala:153) ~[__app__.jar:?]
    at org.apache.hudi.DefaultSource$.resolveBaseFileOnlyRelation(DefaultSource.scala:268) ~[__app__.jar:?]
    at org.apache.hudi.DefaultSource$.createRelation(DefaultSource.scala:232) ~[__app__.jar:?]
    at org.apache.hudi.DefaultSource.createRelation(DefaultSource.scala:111) ~[__app__.jar:?]
    at org.apache.hudi.DefaultSource.createRelation(DefaultSource.scala:68) ~[__app__.jar:?]
    at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:350) ~[spark-sql_2.12-3.3.1-amzn-0.jar:3.3.1-amzn-0]
    at org.apache.spark.sql.DataFrameReader.loadV1Source(DataFrameReader.scala:228) ~[spark-sql_2.12-3.3.1-amzn-0.jar:3.3.1-amzn-0]
    at org.apache.spark.sql.DataFrameReader.$anonfun$load$2(DataFrameReader.scala:210) ~[spark-sql_2.12-3.3.1-amzn-0.jar:3.3.1-amzn-0]
    at scala.Option.getOrElse(Option.scala:189) ~[scala-library-2.12.15.jar:?]
    at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:210) ~[spark-sql_2.12-3.3.1-amzn-0.jar:3.3.1-amzn-0]
    at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:185) ~[spark-sql_2.12-3.3.1-amzn-0.jar:3.3.1-amzn-0]
    ...
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_382]
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_382]
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_382]
    at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_382]
    at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:742) ~[spark-yarn_2.12-3.3.1-amzn-0.jar:3.3.1-amzn-0]
Caused by: org.apache.hudi.exception.HoodieMetadataException: Failed to retrieve list of partition from metadata
    at org.apache.hudi.metadata.BaseTableMetadata.getAllPartitionPaths(BaseTableMetadata.java:119) ~[__app__.jar:?]
    at org.apache.hudi.common.fs.FSUtils.getAllPartitionPaths(FSUtils.java:315) ~[__app__.jar:?]
    ... 42 more
Caused by: org.apache.hudi.exception.HoodieException: Exception when reading log file
    at org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader.scanInternal(AbstractHoodieLogRecordReader.java:352) ~[__app__.jar:?]
    at org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader.scan(AbstractHoodieLogRecordReader.java:192) ~[__app__.jar:?]
    at org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner.performScan(HoodieMergedLogRecordScanner.java:109) ~[__app__.jar:?]
    at org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner.<init>(HoodieMergedLogRecordScanner.java:102) ~[__app__.jar:?]
    at org.apache.hudi.metadata.HoodieMetadataMergedLogRecordReader.<init>(HoodieMetadataMergedLogRecordReader.java:63) ~[__app__.jar:?]
    at org.apache.hudi.metadata.HoodieMetadataMergedLogRecordReader.<init>(HoodieMetadataMergedLogRecordReader.java:51) ~[__app__.jar:?]
    at org.apache.hudi.metadata.HoodieMetadataMergedLogRecordReader$Builder.build(HoodieMetadataMergedLogRecordReader.java:230) ~[__app__.jar:?]
    at org.apache.hudi.metadata.HoodieBackedTableMetadata.getLogRecordScanner(HoodieBackedTableMetadata.java:516) ~[__app__.jar:?]
    at org.apache.hudi.metadata.HoodieBackedTableMetadata.openReaders(HoodieBackedTableMetadata.java:429) ~[__app__.jar:?]
    at org.apache.hudi.metadata.HoodieBackedTableMetadata.getOrCreateReaders(HoodieBackedTableMetadata.java:414) ~[__app__.jar:?]
    at org.apache.hudi.metadata.HoodieBackedTableMetadata.lambda$getRecordsByKeys$1(HoodieBackedTableMetadata.java:220) ~[__app__.jar:?]
at java.util.HashMap.forEach(HashMap.java:1290) ~[?:1.8.0_382] at org.apache.hudi.metadata.HoodieBackedTableMetadata.getRecordsByKeys(HoodieBackedTableMetadata.java:218) ~[__app__.jar:?] at org.apache.hudi.metadata.HoodieBackedTableMetadata.getRecordByKey(HoodieBackedTableMetadata.java:149) ~[__app__.jar:?] at org.apache.hudi.metadata.BaseTableMetadata.fetchAllPartitionPaths(BaseTableMetadata.java:291) ~[__app__.jar:?] at org.apache.hudi.metadata.BaseTableMetadata.getAllPartitionPaths(BaseTableMetadata.java:117) ~[__app__.jar:?] at org.apache.hudi.common.fs.FSUtils.getAllPartitionPaths(FSUtils.java:315) ~[__app__.jar:?] ... 42 more Caused by: com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.SdkClientException: Unable to execute HTTP request: Timeout waiting for connection from pool at com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleRetryableException(AmazonHttpClient.java:1219) ~[emrfs-hadoop-assembly-2.55.0.jar:?] at com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1165) ~[emrfs-hadoop-assembly-2.55.0.jar:?] at com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:814) ~[emrfs-hadoop-assembly-2.55.0.jar:?] at com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:781) ~[emrfs-hadoop-assembly-2.55.0.jar:?] at com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:755) ~[emrfs-hadoop-assembly-2.55.0.jar:?] at com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:715) ~[emrfs-hadoop-assembly-2.55.0.jar:?] at com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:697) ~[emrfs-hadoop-assembly-2.55.0.jar:?] at com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:561) ~[emrfs-hadoop-assembly-2.55.0.jar:?] at com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:541) ~[emrfs-hadoop-assembly-2.55.0.jar:?] at com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:5456) ~[emrfs-hadoop-assembly-2.55.0.jar:?] at com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:5403) ~[emrfs-hadoop-assembly-2.55.0.jar:?] at com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.services.s3.AmazonS3Client.getObjectMetadata(AmazonS3Client.java:1372) ~[emrfs-hadoop-assembly-2.55.0.jar:?] at com.amazon.ws.emr.hadoop.fs.s3.lite.call.GetObjectMetadataCall.perform(GetObjectMetadataCall.java:26) ~[emrfs-hadoop-assembly-2.55.0.jar:?] at com.amazon.ws.emr.hadoop.fs.s3.lite.call.GetObjectMetadataCall.perform(GetObjectMetadataCall.java:12) ~[emrfs-hadoop-assembly-2.55.0.jar:?] at com.amazon.ws.emr.hadoop.fs.s3.lite.executor.GlobalS3Executor$CallPerformer.call(GlobalS3Executor.java:111) ~[emrfs-hadoop-assembly-2.55.0.jar:?] at com.amazon.ws.emr.hadoop.fs.s3.lite.executor.GlobalS3Executor.execute(GlobalS3Executor.java:138) ~[emrfs-hadoop-assembly-2.55.0.jar:?] at com.amazon.ws.emr.hadoop.fs.s3.lite.AmazonS3LiteClient.invoke(AmazonS3LiteClient.java:191) ~[emrfs-hadoop-assembly-2.55.0.jar:?] 
at com.amazon.ws.emr.hadoop.fs.s3.lite.AmazonS3LiteClient.invoke(AmazonS3LiteClient.java:186) ~[emrfs-hadoop-assembly-2.55.0.jar:?] at com.amazon.ws.emr.hadoop.fs.s3.lite.AmazonS3LiteClient.getObjectMetadata(AmazonS3LiteClient.java:96) ~[emrfs-hadoop-assembly-2.55.0.jar:?] at com.amazon.ws.emr.hadoop.fs.s3.lite.AbstractAmazonS3Lite.getObjectMetadata(AbstractAmazonS3Lite.java:43) ~[emrfs-hadoop-assembly-2.55.0.jar:?] at com.amazon.ws.emr.hadoop.fs.s3n.Jets3tNativeFileSystemStore.getFileMetadataFromCacheOrS3(Jets3tNativeFileSystemStore.java:636) ~[emrfs-hadoop-assembly-2.55.0.jar:?] at com.amazon.ws.emr.hadoop.fs.s3n.Jets3tNativeFileSystemStore.retrieveMetadata(Jets3tNativeFileSystemStore.java:320) ~[emrfs-hadoop-assembly-2.55.0.jar:?] at com.amazon.ws.emr.hadoop.fs.s3n.S3NativeFileSystem.getFileStatus(S3NativeFileSystem.java:517) ~[emrfs-hadoop-assembly-2.55.0.jar:?] at com.amazon.ws.emr.hadoop.fs.s3n.S3NativeFileSystem.open(S3NativeFileSystem.java:940) ~[emrfs-hadoop-assembly-2.55.0.jar:?] at com.amazon.ws.emr.hadoop.fs.s3n.S3NativeFileSystem.open(S3NativeFileSystem.java:932) ~[emrfs-hadoop-assembly-2.55.0.jar:?] at com.amazon.ws.emr.hadoop.fs.EmrFileSystem.open(EmrFileSystem.java:192) ~[emrfs-hadoop-assembly-2.55.0.jar:?] at org.apache.hudi.common.fs.HoodieWrapperFileSystem.open(HoodieWrapperFileSystem.java:195) ~[__app__.jar:?] at org.apache.hudi.common.table.log.HoodieLogFileReader.getFSDataInputStream(HoodieLogFileReader.java:475) ~[__app__.jar:?] at org.apache.hudi.common.table.log.HoodieLogFileReader.(HoodieLogFileReader.java:114) ~[__app__.jar:?] at org.apache.hudi.common.table.log.HoodieLogFormatReader.hasNext(HoodieLogFormatReader.java:110) ~[__app__.jar:?] at org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader.scanInternal(AbstractHoodieLogRecordReader.java:223) ~[__app__.jar:?] at org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader.scan(AbstractHoodieLogRecordReader.java:192) ~[__app__.jar:?] at org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner.performScan(HoodieMergedLogRecordScanner.java:109) ~[__app__.jar:?] at org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner.(HoodieMergedLogRecordScanner.java:102) ~[__app__.jar:?] at org.apache.hudi.metadata.HoodieMetadataMergedLogRecordReader.(HoodieMetadataMergedLogRecordReader.java:63) ~[__app__.jar:?] at org.apache.hudi.metadata.HoodieMetadataMergedLogRecordReader.(HoodieMetadataMergedLogRecordReader.java:51) ~[__app__.jar:?] at org.apache.hudi.metadata.HoodieMetadataMergedLogRecordReader$Builder.build(HoodieMetadataMergedLogRecordReader.java:230) ~[__app__.jar:?] at org.apache.hudi.metadata.HoodieBackedTableMetadata.getLogRecordScanner(HoodieBackedTableMetadata.java:516) ~[__app__.jar:?] at org.apache.hudi.metadata.HoodieBackedTableMetadata.openReaders(HoodieBackedTableMetadata.java:429) ~[__app__.jar:?] at org.apache.hudi.metadata.HoodieBackedTableMetadata.getOrCreateReaders(HoodieBackedTableMetadata.java:414) ~[__app__.jar:?] at org.apache.hudi.metadata.HoodieBackedTableMetadata.lambda$getRecordsByKeys$1(HoodieBackedTableMetadata.java:220) ~[__app__.jar:?] at java.util.HashMap.forEach(HashMap.java:1290) ~[?:1.8.0_382] at org.apache.hudi.metadata.HoodieBackedTableMetadata.getRecordsByKeys(HoodieBackedTableMetadata.java:218) ~[__app__.jar:?] at org.apache.hudi.metadata.HoodieBackedTableMetadata.getRecordByKey(HoodieBackedTableMetadata.java:149) ~[__app__.jar:?] at org.apache.hudi.metadata.BaseTableMetadata.fetchAllPartitionPaths(BaseTableMetadata.java:291) ~[__app__.jar:?] 
at org.apache.hudi.metadata.BaseTableMetadata.getAllPartitionPaths(BaseTableMetadata.java:117) ~[__app__.jar:?] at org.apache.hudi.common.fs.FSUtils.getAllPartitionPaths(FSUtils.java:315) ~[__app__.jar:?] ... 42 more Caused by: com.amazon.ws.emr.hadoop.fs.shaded.org.apache.http.conn.ConnectionPoolTimeoutException: Timeout waiting for connection from pool at com.amazon.ws.emr.hadoop.fs.shaded.org.apache.http.impl.conn.PoolingHttpClientConnectionManager.leaseConnection(PoolingHttpClientConnectionManager.java:314) ~[emrfs-hadoop-assembly-2.55.0.jar:?] at com.amazon.ws.emr.hadoop.fs.shaded.org.apache.http.impl.conn.PoolingHttpClientConnectionManager$1.get(PoolingHttpClientConnectionManager.java:280) ~[emrfs-hadoop-assembly-2.55.0.jar:?] at sun.reflect.GeneratedMethodAccessor32.invoke(Unknown Source) ~[?:?] at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_382] at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_382] at com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.http.conn.ClientConnectionRequestFactory$Handler.invoke(ClientConnectionRequestFactory.java:70) ~[emrfs-hadoop-assembly-2.55.0.jar:?] at com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.http.conn.$Proxy67.get(Unknown Source) ~[?:?] at com.amazon.ws.emr.hadoop.fs.shaded.org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:190) ~[emrfs-hadoop-assembly-2.55.0.jar:?] at com.amazon.ws.emr.hadoop.fs.shaded.org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:186) ~[emrfs-hadoop-assembly-2.55.0.jar:?] at com.amazon.ws.emr.hadoop.fs.shaded.org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185) ~[emrfs-hadoop-assembly-2.55.0.jar:?] at com.amazon.ws.emr.hadoop.fs.shaded.org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:83) ~[emrfs-hadoop-assembly-2.55.0.jar:?] at com.amazon.ws.emr.hadoop.fs.shaded.org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:56) ~[emrfs-hadoop-assembly-2.55.0.jar:?] at com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.http.apache.client.impl.SdkHttpClient.execute(SdkHttpClient.java:72) ~[emrfs-hadoop-assembly-2.55.0.jar:?] at com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1346) ~[emrfs-hadoop-assembly-2.55.0.jar:?] at com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1157) ~[emrfs-hadoop-assembly-2.55.0.jar:?] at com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:814) ~[emrfs-hadoop-assembly-2.55.0.jar:?] at com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:781) ~[emrfs-hadoop-assembly-2.55.0.jar:?] at com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:755) ~[emrfs-hadoop-assembly-2.55.0.jar:?] at com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:715) ~[emrfs-hadoop-assembly-2.55.0.jar:?] at com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:697) ~[emrfs-hadoop-assembly-2.55.0.jar:?] at com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:561) ~[emrfs-hadoop-assembly-2.55.0.jar:?] 
at com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:541) ~[emrfs-hadoop-assembly-2.55.0.jar:?] at com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:5456) ~[emrfs-hadoop-assembly-2.55.0.jar:?] at com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:5403) ~[emrfs-hadoop-assembly-2.55.0.jar:?] at com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.services.s3.AmazonS3Client.getObjectMetadata(AmazonS3Client.java:1372) ~[emrfs-hadoop-assembly-2.55.0.jar:?] at com.amazon.ws.emr.hadoop.fs.s3.lite.call.GetObjectMetadataCall.perform(GetObjectMetadataCall.java:26) ~[emrfs-hadoop-assembly-2.55.0.jar:?] at com.amazon.ws.emr.hadoop.fs.s3.lite.call.GetObjectMetadataCall.perform(GetObjectMetadataCall.java:12) ~[emrfs-hadoop-assembly-2.55.0.jar:?] at com.amazon.ws.emr.hadoop.fs.s3.lite.executor.GlobalS3Executor$CallPerformer.call(GlobalS3Executor.java:111) ~[emrfs-hadoop-assembly-2.55.0.jar:?] at com.amazon.ws.emr.hadoop.fs.s3.lite.executor.GlobalS3Executor.execute(GlobalS3Executor.java:138) ~[emrfs-hadoop-assembly-2.55.0.jar:?] at com.amazon.ws.emr.hadoop.fs.s3.lite.AmazonS3LiteClient.invoke(AmazonS3LiteClient.java:191) ~[emrfs-hadoop-assembly-2.55.0.jar:?] at com.amazon.ws.emr.hadoop.fs.s3.lite.AmazonS3LiteClient.invoke(AmazonS3LiteClient.java:186) ~[emrfs-hadoop-assembly-2.55.0.jar:?] at com.amazon.ws.emr.hadoop.fs.s3.lite.AmazonS3LiteClient.getObjectMetadata(AmazonS3LiteClient.java:96) ~[emrfs-hadoop-assembly-2.55.0.jar:?] at com.amazon.ws.emr.hadoop.fs.s3.lite.AbstractAmazonS3Lite.getObjectMetadata(AbstractAmazonS3Lite.java:43) ~[emrfs-hadoop-assembly-2.55.0.jar:?] at com.amazon.ws.emr.hadoop.fs.s3n.Jets3tNativeFileSystemStore.getFileMetadataFromCacheOrS3(Jets3tNativeFileSystemStore.java:636) ~[emrfs-hadoop-assembly-2.55.0.jar:?] at com.amazon.ws.emr.hadoop.fs.s3n.Jets3tNativeFileSystemStore.retrieveMetadata(Jets3tNativeFileSystemStore.java:320) ~[emrfs-hadoop-assembly-2.55.0.jar:?] at com.amazon.ws.emr.hadoop.fs.s3n.S3NativeFileSystem.getFileStatus(S3NativeFileSystem.java:517) ~[emrfs-hadoop-assembly-2.55.0.jar:?] at com.amazon.ws.emr.hadoop.fs.s3n.S3NativeFileSystem.open(S3NativeFileSystem.java:940) ~[emrfs-hadoop-assembly-2.55.0.jar:?] at com.amazon.ws.emr.hadoop.fs.s3n.S3NativeFileSystem.open(S3NativeFileSystem.java:932) ~[emrfs-hadoop-assembly-2.55.0.jar:?] at com.amazon.ws.emr.hadoop.fs.EmrFileSystem.open(EmrFileSystem.java:192) ~[emrfs-hadoop-assembly-2.55.0.jar:?] at org.apache.hudi.common.fs.HoodieWrapperFileSystem.open(HoodieWrapperFileSystem.java:195) ~[__app__.jar:?] at org.apache.hudi.common.table.log.HoodieLogFileReader.getFSDataInputStream(HoodieLogFileReader.java:475) ~[__app__.jar:?] at org.apache.hudi.common.table.log.HoodieLogFileReader.(HoodieLogFileReader.java:114) ~[__app__.jar:?] at org.apache.hudi.common.table.log.HoodieLogFormatReader.hasNext(HoodieLogFormatReader.java:110) ~[__app__.jar:?] at org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader.scanInternal(AbstractHoodieLogRecordReader.java:223) ~[__app__.jar:?] at org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader.scan(AbstractHoodieLogRecordReader.java:192) ~[__app__.jar:?] at org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner.performScan(HoodieMergedLogRecordScanner.java:109) ~[__app__.jar:?] at org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner.(HoodieMergedLogRecordScanner.java:102) ~[__app__.jar:?] 
at org.apache.hudi.metadata.HoodieMetadataMergedLogRecordReader.(HoodieMetadataMergedLogRecordReader.java:63) ~[__app__.jar:?] at org.apache.hudi.metadata.HoodieMetadataMergedLogRecordReader.(HoodieMetadataMergedLogRecordReader.java:51) ~[__app__.jar:?] at org.apache.hudi.metadata.HoodieMetadataMergedLogRecordReader$Builder.build(HoodieMetadataMergedLogRecordReader.java:230) ~[__app__.jar:?] at org.apache.hudi.metadata.HoodieBackedTableMetadata.getLogRecordScanner(HoodieBackedTableMetadata.java:516) ~[__app__.jar:?] at org.apache.hudi.metadata.HoodieBackedTableMetadata.openReaders(HoodieBackedTableMetadata.java:429) ~[__app__.jar:?] at org.apache.hudi.metadata.HoodieBackedTableMetadata.getOrCreateReaders(HoodieBackedTableMetadata.java:414) ~[__app__.jar:?] at org.apache.hudi.metadata.HoodieBackedTableMetadata.lambda$getRecordsByKeys$1(HoodieBackedTableMetadata.java:220) ~[__app__.jar:?] at java.util.HashMap.forEach(HashMap.java:1290) ~[?:1.8.0_382] at org.apache.hudi.metadata.HoodieBackedTableMetadata.getRecordsByKeys(HoodieBackedTableMetadata.java:218) ~[__app__.jar:?] at org.apache.hudi.metadata.HoodieBackedTableMetadata.getRecordByKey(HoodieBackedTableMetadata.java:149) ~[__app__.jar:?] at org.apache.hudi.metadata.BaseTableMetadata.fetchAllPartitionPaths(BaseTableMetadata.java:291) ~[__app__.jar:?] at org.apache.hudi.metadata.BaseTableMetadata.getAllPartitionPaths(BaseTableMetadata.java:117) ~[__app__.jar:?] at org.apache.hudi.common.fs.FSUtils.getAllPartitionPaths(FSUtils.java:315) ~[__app__.jar:?] ... 42 more 23/09/11 17:21:49 INFO ApplicationMaster: Final app status: FAILED, exitCode: 15, (reason: User class threw exception: org.apache.hudi.exception.HoodieException: Error fetching partition paths from metadata table at org.apache.hudi.common.fs.FSUtils.getAllPartitionPaths(FSUtils.java:317) at org.apache.hudi.BaseHoodieTableFileIndex.getAllQueryPartitionPaths(BaseHoodieTableFileIndex.java:194) at org.apache.hudi.BaseHoodieTableFileIndex.loadPartitionPathFiles(BaseHoodieTableFileIndex.java:237) at org.apache.hudi.BaseHoodieTableFileIndex.doRefresh(BaseHoodieTableFileIndex.java:282) at org.apache.hudi.BaseHoodieTableFileIndex.(BaseHoodieTableFileIndex.java:147) at org.apache.hudi.SparkHoodieTableFileIndex.(SparkHoodieTableFileIndex.scala:73) at org.apache.hudi.HoodieFileIndex.(HoodieFileIndex.scala:81) at org.apache.hudi.HoodieBaseRelation.fileIndex$lzycompute(HoodieBaseRelation.scala:242) at org.apache.hudi.HoodieBaseRelation.fileIndex(HoodieBaseRelation.scala:240) at org.apache.hudi.BaseFileOnlyRelation.toHadoopFsRelation(BaseFileOnlyRelation.scala:153) at org.apache.hudi.DefaultSource$.resolveBaseFileOnlyRelation(DefaultSource.scala:268) at org.apache.hudi.DefaultSource$.createRelation(DefaultSource.scala:232) at org.apache.hudi.DefaultSource.createRelation(DefaultSource.scala:111) at org.apache.hudi.DefaultSource.createRelation(DefaultSource.scala:68) at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:350) at org.apache.spark.sql.DataFrameReader.loadV1Source(DataFrameReader.scala:228) at org.apache.spark.sql.DataFrameReader.$anonfun$load$2(DataFrameReader.scala:210) at scala.Option.getOrElse(Option.scala:189) at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:210) at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:185) ... 
Caused by: org.apache.hudi.exception.HoodieMetadataException: Failed to retrieve list of partition from metadata at org.apache.hudi.metadata.BaseTableMetadata.getAllPartitionPaths(BaseTableMetadata.java:119) at org.apache.hudi.common.fs.FSUtils.getAllPartitionPaths(FSUtils.java:315) ... 42 more Caused by: org.apache.hudi.exception.HoodieException: Exception when reading log file at org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader.scanInternal(AbstractHoodieLogRecordReader.java:352) at org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader.scan(AbstractHoodieLogRecordReader.java:192) at org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner.performScan(HoodieMergedLogRecordScanner.java:109) at org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner.(HoodieMergedLogRecordScanner.java:102) at org.apache.hudi.metadata.HoodieMetadataMergedLogRecordReader.(HoodieMetadataMergedLogRecordReader.java:63) at org.apache.hudi.metadata.HoodieMetadataMergedLogRecordReader.(HoodieMetadataMergedLogRecordReader.java:51) at org.apache.hudi.metadata.HoodieMetadataMergedLogRecordReader$Builder.build(HoodieMetadataMergedLogRecordReader.java:230) at org.apache.hudi.metadata.HoodieBackedTableMetadata.getLogRecordScanner(HoodieBackedTableMetadata.java:516) at org.apache.hudi.metadata.HoodieBackedTableMetadata.openReaders(HoodieBackedTableMetadata.java:429) at org.apache.hudi.metadata.HoodieBackedTableMetadata.getOrCreateReaders(HoodieBackedTableMetadata.java:414) at org.apache.hudi.metadata.HoodieBackedTableMetadata.lambda$getRecordsByKeys$1(HoodieBackedTableMetadata.java:220) at java.util.HashMap.forEach(HashMap.java:1290) at org.apache.hudi.metadata.HoodieBackedTableMetadata.getRecordsByKeys(HoodieBackedTableMetadata.java:218) at org.apache.hudi.metadata.HoodieBackedTableMetadata.getRecordByKey(HoodieBackedTableMetadata.java:149) at org.apache.hudi.metadata.BaseTableMetadata.fetchAllPartitionPaths(BaseTableMetadata.java:291) at org.apache.hudi.metadata.BaseTableMetadata.getAllPartitionPaths(BaseTableMetadata.java:117) ... 
43 more Caused by: com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.SdkClientException: Unable to execute HTTP request: Timeout waiting for connection from pool at com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleRetryableException(AmazonHttpClient.java:1219) at com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1165) at com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:814) at com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:781) at com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:755) at com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:715) at com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:697) at com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:561) at com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:541) at com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:5456) at com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:5403) at com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.services.s3.AmazonS3Client.getObjectMetadata(AmazonS3Client.java:1372) at com.amazon.ws.emr.hadoop.fs.s3.lite.call.GetObjectMetadataCall.perform(GetObjectMetadataCall.java:26) at com.amazon.ws.emr.hadoop.fs.s3.lite.call.GetObjectMetadataCall.perform(GetObjectMetadataCall.java:12) at com.amazon.ws.emr.hadoop.fs.s3.lite.executor.GlobalS3Executor$CallPerformer.call(GlobalS3Executor.java:111) at com.amazon.ws.emr.hadoop.fs.s3.lite.executor.GlobalS3Executor.execute(GlobalS3Executor.java:138) at com.amazon.ws.emr.hadoop.fs.s3.lite.AmazonS3LiteClient.invoke(AmazonS3LiteClient.java:191) at com.amazon.ws.emr.hadoop.fs.s3.lite.AmazonS3LiteClient.invoke(AmazonS3LiteClient.java:186) at com.amazon.ws.emr.hadoop.fs.s3.lite.AmazonS3LiteClient.getObjectMetadata(AmazonS3LiteClient.java:96) at com.amazon.ws.emr.hadoop.fs.s3.lite.AbstractAmazonS3Lite.getObjectMetadata(AbstractAmazonS3Lite.java:43) at com.amazon.ws.emr.hadoop.fs.s3n.Jets3tNativeFileSystemStore.getFileMetadataFromCacheOrS3(Jets3tNativeFileSystemStore.java:636) at com.amazon.ws.emr.hadoop.fs.s3n.Jets3tNativeFileSystemStore.retrieveMetadata(Jets3tNativeFileSystemStore.java:320) at com.amazon.ws.emr.hadoop.fs.s3n.S3NativeFileSystem.getFileStatus(S3NativeFileSystem.java:517) at com.amazon.ws.emr.hadoop.fs.s3n.S3NativeFileSystem.open(S3NativeFileSystem.java:940) at com.amazon.ws.emr.hadoop.fs.s3n.S3NativeFileSystem.open(S3NativeFileSystem.java:932) at com.amazon.ws.emr.hadoop.fs.EmrFileSystem.open(EmrFileSystem.java:192) at org.apache.hudi.common.fs.HoodieWrapperFileSystem.open(HoodieWrapperFileSystem.java:195) at org.apache.hudi.common.table.log.HoodieLogFileReader.getFSDataInputStream(HoodieLogFileReader.java:475) at org.apache.hudi.common.table.log.HoodieLogFileReader.(HoodieLogFileReader.java:114) at org.apache.hudi.common.table.log.HoodieLogFormatReader.hasNext(HoodieLogFormatReader.java:110) at 
org.apache.hudi.common.table.log.AbstractHoodieLogRecordReader.scanInternal(AbstractHoodieLogRecordReader.java:223) ... 58 more Caused by: com.amazon.ws.emr.hadoop.fs.shaded.org.apache.http.conn.ConnectionPoolTimeoutException: Timeout waiting for connection from pool at com.amazon.ws.emr.hadoop.fs.shaded.org.apache.http.impl.conn.PoolingHttpClientConnectionManager.leaseConnection(PoolingHttpClientConnectionManager.java:314) at com.amazon.ws.emr.hadoop.fs.shaded.org.apache.http.impl.conn.PoolingHttpClientConnectionManager$1.get(PoolingHttpClientConnectionManager.java:280) at sun.reflect.GeneratedMethodAccessor32.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.http.conn.ClientConnectionRequestFactory$Handler.invoke(ClientConnectionRequestFactory.java:70) at com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.http.conn.$Proxy67.get(Unknown Source) at com.amazon.ws.emr.hadoop.fs.shaded.org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:190) at com.amazon.ws.emr.hadoop.fs.shaded.org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:186) at com.amazon.ws.emr.hadoop.fs.shaded.org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185) at com.amazon.ws.emr.hadoop.fs.shaded.org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:83) at com.amazon.ws.emr.hadoop.fs.shaded.org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:56) at com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.http.apache.client.impl.SdkHttpClient.execute(SdkHttpClient.java:72) at com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1346) at com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1157) ... 87 more ) [1.ReadTimeout.To.HudiSupport.log](https://github.com/apache/hudi/files/12593508/1.ReadTimeout.To.HudiSupport.log)
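Reading the chain bottom-up, every failure funnels into `ConnectionPoolTimeoutException: Timeout waiting for connection from pool`: the Hudi metadata-table readers (`HoodieBackedTableMetadata.openReaders` → `HoodieLogFileReader` → `EmrFileSystem.open`) hold S3 input streams faster than connections are returned to the shared HTTP client pool, so `leaseConnection` eventually blocks until it times out. A minimal sketch of that failure mode, using a deliberately tiny pool; the bucket, paths, and loop bound are hypothetical, not taken from the issue:

```scala
import java.net.URI
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}

// Illustration only: a pool of N HTTP connections is exhausted by keeping
// more than N read-from-but-never-closed S3 streams alive at once.
object PoolExhaustionSketch {
  def main(args: Array[String]): Unit = {
    val conf = new Configuration()
    conf.setInt("fs.s3a.connection.maximum", 15) // deliberately small pool

    val fs = FileSystem.get(new URI("s3a://some-bucket"), conf)

    // Each stream pins a pooled HTTP connection from its first read() until
    // close(); leaking 15 of them leaves nothing for the 16th caller, whose
    // first read() blocks in leaseConnection() and eventually throws
    // "Timeout waiting for connection from pool".
    val leaked = (1 to 16).map { i =>
      val in = fs.open(new Path(s"s3a://some-bucket/logs/file-$i.log"))
      in.read() // issues the GET and takes a connection from the pool
      in        // never closed, so the connection is never returned
    }
  }
}
```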
ad1happy2go commented 1 year ago

@kandyrise Are you setting `fs.s3a.connection.maximum`?
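
For reference, that property can be passed through the Spark session (or equivalently as `--conf spark.hadoop.fs.s3a.connection.maximum=...` on `spark-submit`). A minimal sketch; the app name and values are illustrative placeholders, not recommendations:

```scala
import org.apache.spark.sql.SparkSession

// Size the pool to the number of concurrent streams/queries that actually
// open S3 files in the job, rather than copying these numbers.
val spark = SparkSession.builder()
  .appName("hudi-streaming-job") // hypothetical app name
  .config("spark.hadoop.fs.s3a.connection.maximum", "200") // S3A HTTP connection pool size
  .config("spark.hadoop.fs.s3a.threads.max", "64")         // S3A transfer thread pool
  .getOrCreate()
```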

kandyrise commented 1 year ago

It is not being set. Let me try that.

ad1happy2go commented 1 year ago

@kandyrise Were you able to fix it after setting that config?

kandyrise commented 1 year ago

Yes, setting the config `fs.s3a.connection.maximum` fixed the error. Thanks!
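
For future readers: `fs.s3a.connection.maximum` governs the Apache S3A connector's HTTP pool. Since the stack trace above goes through EMRFS (`com.amazon.ws.emr.hadoop.fs`), the analogous EMRFS knob is believed to be `fs.s3.maxConnections` (normally set in `emrfs-site`); verify it against the EMR release in use. A hedged sketch of raising both at runtime, reusing the `spark` session from the earlier snippet:

```scala
// Sketch only: set both pool limits before the first FileSystem access for
// the scheme, since the pool size is read when the FileSystem is created.
val hadoopConf = spark.sparkContext.hadoopConfiguration
hadoopConf.setInt("fs.s3a.connection.maximum", 200) // s3a:// paths (Apache S3A)
hadoopConf.setInt("fs.s3.maxConnections", 200)      // s3:// paths (EMRFS; assumed property name)
```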