Describe the bug
Hi,
I'm encountering an intermittent issue when using the s3.read_parquet_table function in my ETL pipeline. The pipeline reads Parquet files from S3 every 5 minutes using modin, ray, and awswrangler. Occasionally, I receive the following error:
AWS Error NETWORK_CONNECTION during HeadObject operation: curlCode: 28, Timeout was reached
How to Reproduce
I am unable to reproduce this error consistently, and it seems to resolve itself after some time. The call involved is:
import awswrangler as wr
df = wr.s3.read_parquet_table(table=table, database=database, partition_filter=partition_filter, filename_suffix=filename_suffix)
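Since the error is transient and clears on its own, one possible stopgap is to retry the read with exponential backoff. The following is only a minimal sketch, not a fix for the underlying timeout. The helper name `call_with_retry` and its parameters (`attempts`, `base_delay`, `sleep`) are hypothetical, and it retries on any exception because the exact exception type raised for this error is an assumption that has not been confirmed here.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retry(
    fn: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn() and retry on failure, doubling the wait after each attempt.

    Hypothetical helper: it catches every Exception because the concrete
    exception type behind the HeadObject timeout is not confirmed.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except Exception:
            # Out of attempts: re-raise the last error unchanged.
            if attempt == attempts:
                raise
            # Waits base_delay, then 2x, then 4x, ... between attempts.
            sleep(base_delay * 2 ** (attempt - 1))
    raise RuntimeError("unreachable")
```

In the pipeline this would wrap the existing call, for example `df = call_with_retry(lambda: wr.s3.read_parquet_table(table=table, database=database, partition_filter=partition_filter, filename_suffix=filename_suffix))`. Narrowing the `except` clause to the specific exception type, once it is known, would avoid retrying on unrelated errors.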
Expected behavior
No response
Your project
No response
Screenshots
No response
OS
Linux
Python version
3.10.13
AWS SDK for pandas version
3.7.2
Additional context
No response