Open osy497 opened 2 months ago
Hi @osy497, this question is quite vague. Could you please provide a stack trace? If the issue is related to dataFiles() and involves an IOException, it might be failing while closing the stream.
@nk1506 I got stack traces like these:
Caused by: software.amazon.awssdk.services.s3.model.S3Exception: Remote backend is unreachable (ConcurrentModification: concurrent modification) (Service: S3, Status Code: 400, Request ID: 17E4744B40A060BB)
at software.amazon.awssdk.protocols.xml.internal.unmarshall.AwsXmlPredicatedResponseHandler.handleErrorResponse(AwsXmlPredicatedResponseHandler.java:156) ~[test-app.jar:?]
at software.amazon.awssdk.protocols.xml.internal.unmarshall.AwsXmlPredicatedResponseHandler.handleResponse(AwsXmlPredicatedResponseHandler.java:108) ~[test-app.jar:?]
at software.amazon.awssdk.protocols.xml.internal.unmarshall.AwsXmlPredicatedResponseHandler.handle(AwsXmlPredicatedResponseHandler.java:85) ~[test-app.jar:?]
at software.amazon.awssdk.protocols.xml.internal.unmarshall.AwsXmlPredicatedResponseHandler.handle(AwsXmlPredicatedResponseHandler.java:43) ~[test-app.jar:?]
at software.amazon.awssdk.awscore.client.handler.AwsSyncClientHandler$Crc32ValidationResponseHandler.handle(AwsSyncClientHandler.java:93) ~[test-app.jar:?]
at software.amazon.awssdk.core.internal.handler.BaseClientHandler.lambda$successTransformationResponseHandler$7(BaseClientHandler.java:279) ~[test-app.jar:?]
at software.amazon.awssdk.core.internal.http.pipeline.stages.HandleResponseStage.execute(HandleResponseStage.java:50) ~[test-app.jar:?]
at software.amazon.awssdk.core.internal.http.pipeline.stages.HandleResponseStage.execute(HandleResponseStage.java:38) ~[test-app.jar:?]
at software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206) ~[test-app.jar:?]
at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallAttemptTimeoutTrackingStage.execute(ApiCallAttemptTimeoutTrackingStage.java:72) ~[test-app.jar:?]
at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallAttemptTimeoutTrackingStage.execute(ApiCallAttemptTimeoutTrackingStage.java:42) ~[test-app.jar:?]
at software.amazon.awssdk.core.internal.http.pipeline.stages.TimeoutExceptionHandlingStage.execute(TimeoutExceptionHandlingStage.java:78) ~[test-app.jar:?]
at software.amazon.awssdk.core.internal.http.pipeline.stages.TimeoutExceptionHandlingStage.execute(TimeoutExceptionHandlingStage.java:40) ~[test-app.jar:?]
at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallAttemptMetricCollectionStage.execute(ApiCallAttemptMetricCollectionStage.java:55) ~[test-app.jar:?]
at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallAttemptMetricCollectionStage.execute(ApiCallAttemptMetricCollectionStage.java:39) ~[test-app.jar:?]
at software.amazon.awssdk.core.internal.http.pipeline.stages.RetryableStage.execute(RetryableStage.java:81) ~[test-app.jar:?]
at software.amazon.awssdk.core.internal.http.pipeline.stages.RetryableStage.execute(RetryableStage.java:36) ~[test-app.jar:?]
at software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206) ~[test-app.jar:?]
at software.amazon.awssdk.core.internal.http.StreamManagingStage.execute(StreamManagingStage.java:56) ~[test-app.jar:?]
at software.amazon.awssdk.core.internal.http.StreamManagingStage.execute(StreamManagingStage.java:36) ~[test-app.jar:?]
at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.executeWithTimer(ApiCallTimeoutTrackingStage.java:80) ~[test-app.jar:?]
at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.execute(ApiCallTimeoutTrackingStage.java:60) ~[test-app.jar:?]
at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.execute(ApiCallTimeoutTrackingStage.java:42) ~[test-app.jar:?]
at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallMetricCollectionStage.execute(ApiCallMetricCollectionStage.java:50) ~[test-app.jar:?]
at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallMetricCollectionStage.execute(ApiCallMetricCollectionStage.java:32) ~[test-app.jar:?]
at software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206) ~[test-app.jar:?]
at software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206) ~[test-app.jar:?]
at software.amazon.awssdk.core.internal.http.pipeline.stages.ExecutionFailureExceptionReportingStage.execute(ExecutionFailureExceptionReportingStage.java:37) ~[test-app.jar:?]
at software.amazon.awssdk.core.internal.http.pipeline.stages.ExecutionFailureExceptionReportingStage.execute(ExecutionFailureExceptionReportingStage.java:26) ~[test-app.jar:?]
at software.amazon.awssdk.core.internal.http.AmazonSyncHttpClient$RequestExecutionBuilderImpl.execute(AmazonSyncHttpClient.java:224) ~[test-app.jar:?]
at software.amazon.awssdk.core.internal.handler.BaseSyncClientHandler.invoke(BaseSyncClientHandler.java:103) ~[test-app.jar:?]
at software.amazon.awssdk.core.internal.handler.BaseSyncClientHandler.doExecute(BaseSyncClientHandler.java:173) ~[test-app.jar:?]
at software.amazon.awssdk.core.internal.handler.BaseSyncClientHandler.lambda$execute$1(BaseSyncClientHandler.java:80) ~[test-app.jar:?]
at software.amazon.awssdk.core.internal.handler.BaseSyncClientHandler.measureApiCallSuccess(BaseSyncClientHandler.java:182) ~[test-app.jar:?]
at software.amazon.awssdk.core.internal.handler.BaseSyncClientHandler.execute(BaseSyncClientHandler.java:74) ~[test-app.jar:?]
at software.amazon.awssdk.core.client.handler.SdkSyncClientHandler.execute(SdkSyncClientHandler.java:45) ~[test-app.jar:?]
at software.amazon.awssdk.awscore.client.handler.AwsSyncClientHandler.execute(AwsSyncClientHandler.java:53) ~[test-app.jar:?]
at software.amazon.awssdk.services.s3.DefaultS3Client.putObject(DefaultS3Client.java:10191) ~[test-app.jar:?]
at org.apache.iceberg.aws.s3.S3OutputStream.completeUploads(S3OutputStream.java:438) ~[test-app.jar:?]
at org.apache.iceberg.aws.s3.S3OutputStream.close(S3OutputStream.java:265) ~[test-app.jar:?]
at org.apache.parquet.io.DelegatingPositionOutputStream.close(DelegatingPositionOutputStream.java:38) ~[test-app.jar:?]
at org.apache.parquet.hadoop.ParquetFileWriter.end(ParquetFileWriter.java:1204) ~[test-app.jar:?]
at org.apache.iceberg.parquet.ParquetWriter.close(ParquetWriter.java:257) ~[test-app.jar:?]
at org.apache.iceberg.io.DataWriter.close(DataWriter.java:82) ~[test-app.jar:?]
at org.apache.iceberg.io.BaseTaskWriter$BaseRollingWriter.closeCurrent(BaseTaskWriter.java:314) ~[test-app.jar:?]
at org.apache.iceberg.io.BaseTaskWriter$BaseRollingWriter.close(BaseTaskWriter.java:341) ~[test-app.jar:?]
at org.apache.iceberg.io.PartitionedFanoutWriter.close(PartitionedFanoutWriter.java:70) ~[test-app.jar:?]
at org.apache.iceberg.io.BaseTaskWriter.complete(BaseTaskWriter.java:96) ~[test-app.jar:?]
at org.apache.iceberg.io.TaskWriter.dataFiles(TaskWriter.java:50) ~[test-app.jar:?]
...
Suppressed: software.amazon.awssdk.core.exception.SdkClientException: Request attempt 1 failure: Unable to execute HTTP request: Read timed out
or
Caused by: software.amazon.awssdk.services.s3.model.S3Exception: null (Service: S3, Status Code: 400, Request ID: null)
at software.amazon.awssdk.protocols.xml.internal.unmarshall.AwsXmlPredicatedResponseHandler.handleErrorResponse(AwsXmlPredicatedResponseHandler.java:156) ~[test-app.jar:?]
at software.amazon.awssdk.protocols.xml.internal.unmarshall.AwsXmlPredicatedResponseHandler.handleResponse(AwsXmlPredicatedResponseHandler.java:108) ~[test-app.jar:?]
at software.amazon.awssdk.protocols.xml.internal.unmarshall.AwsXmlPredicatedResponseHandler.handle(AwsXmlPredicatedResponseHandler.java:85) ~[test-app.jar:?]
at software.amazon.awssdk.protocols.xml.internal.unmarshall.AwsXmlPredicatedResponseHandler.handle(AwsXmlPredicatedResponseHandler.java:43) ~[test-app.jar:?]
at software.amazon.awssdk.awscore.client.handler.AwsSyncClientHandler$Crc32ValidationResponseHandler.handle(AwsSyncClientHandler.java:93) ~[test-app.jar:?]
at software.amazon.awssdk.core.internal.handler.BaseClientHandler.lambda$successTransformationResponseHandler$7(BaseClientHandler.java:279) ~[test-app.jar:?]
at software.amazon.awssdk.core.internal.http.pipeline.stages.HandleResponseStage.execute(HandleResponseStage.java:50) ~[test-app.jar:?]
at software.amazon.awssdk.core.internal.http.pipeline.stages.HandleResponseStage.execute(HandleResponseStage.java:38) ~[test-app.jar:?]
at software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206) ~[test-app.jar:?]
at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallAttemptTimeoutTrackingStage.execute(ApiCallAttemptTimeoutTrackingStage.java:72) ~[test-app.jar:?]
at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallAttemptTimeoutTrackingStage.execute(ApiCallAttemptTimeoutTrackingStage.java:42) ~[test-app.jar:?]
at software.amazon.awssdk.core.internal.http.pipeline.stages.TimeoutExceptionHandlingStage.execute(TimeoutExceptionHandlingStage.java:78) ~[test-app.jar:?]
at software.amazon.awssdk.core.internal.http.pipeline.stages.TimeoutExceptionHandlingStage.execute(TimeoutExceptionHandlingStage.java:40) ~[test-app.jar:?]
at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallAttemptMetricCollectionStage.execute(ApiCallAttemptMetricCollectionStage.java:55) ~[test-app.jar:?]
at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallAttemptMetricCollectionStage.execute(ApiCallAttemptMetricCollectionStage.java:39) ~[test-app.jar:?]
at software.amazon.awssdk.core.internal.http.pipeline.stages.RetryableStage.execute(RetryableStage.java:81) ~[test-app.jar:?]
at software.amazon.awssdk.core.internal.http.pipeline.stages.RetryableStage.execute(RetryableStage.java:36) ~[test-app.jar:?]
at software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206) ~[test-app.jar:?]
at software.amazon.awssdk.core.internal.http.StreamManagingStage.execute(StreamManagingStage.java:56) ~[test-app.jar:?]
at software.amazon.awssdk.core.internal.http.StreamManagingStage.execute(StreamManagingStage.java:36) ~[test-app.jar:?]
at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.executeWithTimer(ApiCallTimeoutTrackingStage.java:80) ~[test-app.jar:?]
at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.execute(ApiCallTimeoutTrackingStage.java:60) ~[test-app.jar:?]
at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.execute(ApiCallTimeoutTrackingStage.java:42) ~[test-app.jar:?]
at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallMetricCollectionStage.execute(ApiCallMetricCollectionStage.java:50) ~[test-app.jar:?]
at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallMetricCollectionStage.execute(ApiCallMetricCollectionStage.java:32) ~[test-app.jar:?]
at software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206) ~[test-app.jar:?]
at software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206) ~[test-app.jar:?]
at software.amazon.awssdk.core.internal.http.pipeline.stages.ExecutionFailureExceptionReportingStage.execute(ExecutionFailureExceptionReportingStage.java:37) ~[test-app.jar:?]
at software.amazon.awssdk.core.internal.http.pipeline.stages.ExecutionFailureExceptionReportingStage.execute(ExecutionFailureExceptionReportingStage.java:26) ~[test-app.jar:?]
at software.amazon.awssdk.core.internal.http.AmazonSyncHttpClient$RequestExecutionBuilderImpl.execute(AmazonSyncHttpClient.java:224) ~[test-app.jar:?]
at software.amazon.awssdk.core.internal.handler.BaseSyncClientHandler.invoke(BaseSyncClientHandler.java:103) ~[test-app.jar:?]
at software.amazon.awssdk.core.internal.handler.BaseSyncClientHandler.doExecute(BaseSyncClientHandler.java:173) ~[test-app.jar:?]
at software.amazon.awssdk.core.internal.handler.BaseSyncClientHandler.lambda$execute$1(BaseSyncClientHandler.java:80) ~[test-app.jar:?]
at software.amazon.awssdk.core.internal.handler.BaseSyncClientHandler.measureApiCallSuccess(BaseSyncClientHandler.java:182) ~[test-app.jar:?]
at software.amazon.awssdk.core.internal.handler.BaseSyncClientHandler.execute(BaseSyncClientHandler.java:74) ~[test-app.jar:?]
at software.amazon.awssdk.core.client.handler.SdkSyncClientHandler.execute(SdkSyncClientHandler.java:45) ~[test-app.jar:?]
at software.amazon.awssdk.awscore.client.handler.AwsSyncClientHandler.execute(AwsSyncClientHandler.java:53) ~[test-app.jar:?]
at software.amazon.awssdk.services.s3.DefaultS3Client.putObject(DefaultS3Client.java:10191) ~[test-app.jar:?]
at org.apache.iceberg.aws.s3.S3OutputStream.completeUploads(S3OutputStream.java:438) ~[test-app.jar:?]
at org.apache.iceberg.aws.s3.S3OutputStream.close(S3OutputStream.java:265) ~[test-app.jar:?]
at org.apache.parquet.io.DelegatingPositionOutputStream.close(DelegatingPositionOutputStream.java:38) ~[test-app.jar:?]
at org.apache.parquet.hadoop.ParquetFileWriter.end(ParquetFileWriter.java:1204) ~[test-app.jar:?]
at org.apache.iceberg.parquet.ParquetWriter.close(ParquetWriter.java:257) ~[test-app.jar:?]
at org.apache.iceberg.io.DataWriter.close(DataWriter.java:82) ~[test-app.jar:?]
at org.apache.iceberg.io.BaseTaskWriter$BaseRollingWriter.closeCurrent(BaseTaskWriter.java:314) ~[test-app.jar:?]
at org.apache.iceberg.io.BaseTaskWriter$BaseRollingWriter.close(BaseTaskWriter.java:341) ~[test-app.jar:?]
at org.apache.iceberg.io.PartitionedFanoutWriter.close(PartitionedFanoutWriter.java:70) ~[test-app.jar:?]
at org.apache.iceberg.io.BaseTaskWriter.complete(BaseTaskWriter.java:96) ~[test-app.jar:?]
at org.apache.iceberg.io.TaskWriter.dataFiles(TaskWriter.java:50) ~[test-app.jar:?]
...
Hi @osy497, as per the stack trace, the error is of type 400 (BAD_REQUEST). I don't think the above errors are retryable.
@nk1506 Could you elaborate on what happens if I retry dataFiles() when the above exception is thrown?
(Additionally, I am using MinIO as the S3 proxy.)
@osy497, as I can see in the description, you are retrying after rewriting everything. This error occurs after the writer has completed its operation, at the point where the S3 client is unable to upload the file. If the error were related to a connection timeout or something similar, a retry should help. But here it seems to be throwing BAD_REQUEST. By any chance, did you check with the community on Slack?
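The distinction drawn above (timeouts are worth retrying, a plain BAD_REQUEST usually is not) can be sketched as a small classifier. This is only an illustration under stated assumptions: the class name `RetryClassifier` and its methods are hypothetical and not part of Iceberg or the AWS SDK, and the set of statuses treated as transient (408, 429, 5xx) is a common convention rather than anything confirmed in this thread. The traces above also show a "Read timed out" attached as a suppressed exception, so the sketch inspects suppressed exceptions as well as causes.

```java
import java.net.SocketTimeoutException;

/** Hypothetical helper: decides whether a failure looks transient. */
public final class RetryClassifier {
    private static final int MAX_DEPTH = 16; // guard against very deep or cyclic chains

    private RetryClassifier() {}

    /** Assumption: 408, 429 and 5xx are transient; other statuses (such as 400) are not. */
    public static boolean isTransientStatus(int statusCode) {
        return statusCode == 408
            || statusCode == 429
            || (statusCode >= 500 && statusCode <= 599);
    }

    /** Returns true if a socket timeout appears among the causes or suppressed exceptions. */
    public static boolean hasTimeout(Throwable error) {
        return hasTimeout(error, 0);
    }

    private static boolean hasTimeout(Throwable t, int depth) {
        if (t == null || depth > MAX_DEPTH) {
            return false;
        }
        if (t instanceof SocketTimeoutException) {
            return true;
        }
        for (Throwable suppressed : t.getSuppressed()) {
            if (hasTimeout(suppressed, depth + 1)) {
                return true;
            }
        }
        return hasTimeout(t.getCause(), depth + 1);
    }
}
```

With the AWS SDK, the status passed to `isTransientStatus` would come from the caught service exception. Whether a 400 returned by a proxy actually hides a transient backend problem is something this sketch cannot determine.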
@nk1506 Most cases seem to be timeout problems, but I'm not sure about that. I will ask about this in the Slack channel later. Thanks for your explanation :)
Is this an AWS s3 store? I don't see the extended request IDs in the stack trace you get from there...
Query engine
JAVA API
Question
We have been trying to store our data in an Iceberg table with version 1.5.2 of Iceberg. We are using the REST catalog, S3FileIO, and Parquet as the data format, and the code that flushes the writer follows this logic:

The above flush code works fine in most cases, but the dataFiles() call sometimes fails with an exception due to a timeout or a similar error. When this happens, we currently write the entire data into the writer again and flush it again, which I think is a huge overhead.
To avoid this, we would like to add retry logic around dataFiles(), if the dataFiles() method is retryable. For example, if part of the data in the writer buffer succeeds and part fails inside dataFiles(), will retrying cause a problem? Your answer would be appreciated.
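As a reference point for the retry logic being asked about, here is a minimal sketch of a bounded retry loop with exponential backoff. It does not settle whether dataFiles() itself is safe to call again after a partial failure, which this thread leaves open. The `action` passed in is a placeholder for whatever unit of work is known to be repeatable, for example rewriting the batch into a fresh writer and flushing it, as described above. The class name `BoundedRetry` and all parameters are hypothetical.

```java
import java.util.concurrent.Callable;
import java.util.function.Predicate;

/** Hypothetical helper: retries an action a bounded number of times. */
public final class BoundedRetry {
    private BoundedRetry() {}

    /**
     * Runs the action, retrying only when the predicate accepts the exception.
     * The delay doubles after each failed attempt.
     */
    public static <T> T call(Callable<T> action,
                             Predicate<Exception> retryable,
                             int maxAttempts,
                             long initialDelayMillis) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        long delay = initialDelayMillis;
        for (int attempt = 1; ; attempt++) {
            try {
                return action.call();
            } catch (Exception e) {
                // Give up on the last attempt or on errors judged non-retryable.
                if (attempt >= maxAttempts || !retryable.test(e)) {
                    throw e;
                }
                Thread.sleep(delay);
                delay *= 2;
            }
        }
    }
}
```

A caller would pass a predicate that accepts only errors judged transient, so that a non-retryable failure is rethrown on the first attempt instead of being repeated.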