delta-io / delta-rs

A native Rust library for Delta Lake, with bindings into Python
https://delta-io.github.io/delta-rs/
Apache License 2.0

Generic S3 error: Content-Range header not present in partial response #2544

Open mervynzhang opened 1 month ago

mervynzhang commented 1 month ago

Environment

Delta-rs version: 0.17.4

Binding: python


Bug

What happened: We use s3proxy in a Kubernetes cluster to proxy S3 traffic. With delta-rs 0.14.0, everything works fine. After upgrading to 0.17.4, we get the following error:

---------------------------------------------------------------------------
DeltaError                                Traceback (most recent call last)
File <timed exec>:12

File /opt/conda/lib/python3.11/site-packages/deltalake/table.py:405, in DeltaTable.__init__(self, table_uri, version, storage_options, without_files, log_buffer_size)
    385 """
    386 Create the Delta Table from a path with an optional version.
    387 Multiple StorageBackends are currently supported: AWS S3, Azure Data Lake Storage Gen2, Google Cloud Storage (GCS) and local URI.
   (...)
    402 
    403 """
    404 self._storage_options = storage_options
--> 405 self._table = RawDeltaTable(
    406     str(table_uri),
    407     version=version,
    408     storage_options=storage_options,
    409     without_files=without_files,
    410     log_buffer_size=log_buffer_size,
    411 )

DeltaError: Failed to parse parquet: Parquet error: AsyncChunkReader::get_bytes error: Generic S3 error: Content-Range header not present in partial response

What you expected to happen:

How to reproduce it:

More details:
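
The error text suggests the client received a partial (HTTP 206) response that had no Content-Range header. One way to narrow this down is to send a ranged GET through the proxy and inspect the status and headers that come back. The helper below is a minimal sketch of that check. `check_partial_response` is a hypothetical name, not part of delta-rs or s3proxy, and it only validates a status code and header mapping you have already captured.

```python
import re

# Hypothetical diagnostic helper; not part of delta-rs, object_store, or s3proxy.
# Matches a Content-Range value such as "bytes 0-7/100" or "bytes 0-7/*".
_CONTENT_RANGE = re.compile(r"^bytes (\d+)-(\d+)/(\d+|\*)$")


def check_partial_response(status, headers, req_start, req_end):
    """Return a list of problems for a reply to 'Range: bytes=req_start-req_end'.

    An empty list means the reply looks like a well-formed partial response.
    """
    # HTTP header names are case-insensitive, so normalise them first.
    lowered = {name.lower(): value for name, value in headers.items()}

    if status != 206:
        return [f"not a partial response: status {status}"]

    value = lowered.get("content-range")
    if value is None:
        return ["Content-Range header missing from 206 response"]

    match = _CONTENT_RANGE.match(value.strip())
    if match is None:
        return [f"unparseable Content-Range: {value!r}"]

    start, end = int(match.group(1)), int(match.group(2))
    problems = []
    if start != req_start:
        problems.append(f"range starts at {start}, requested {req_start}")
    # The server may return fewer bytes than requested if the object is short.
    if end > req_end:
        problems.append(f"range ends at {end}, requested at most {req_end}")
    return problems
```

The status and headers can be captured with any HTTP client, for example `curl -v -H "Range: bytes=0-7" <object URL behind the proxy>`, assuming the object is readable with the credentials used. If the same request made directly against S3 passes this check but fails through s3proxy, that would point at the proxy dropping the header rather than at delta-rs.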