version: 0
schema: Schema([Field(id, PrimitiveType("string"), nullable=True), Field(path, PrimitiveType("string"), nullable=True)])
['0-e03dac34-16a0-4b6e-82c8-fd1098d1bf45-0.parquet']
Traceback (most recent call last):
  File "test.py", line 32, in <module>
    df = table.to_pyarrow_table()
  File "***/lib/python3.10/site-packages/deltalake/table.py", line 1161, in to_pyarrow_table
    return self.to_pyarrow_dataset(
  File "pyarrow/_dataset.pyx", line 562, in pyarrow._dataset.Dataset.to_table
  File "pyarrow/_dataset.pyx", line 3804, in pyarrow._dataset.Scanner.to_table
  File "pyarrow/error.pxi", line 154, in pyarrow.lib.pyarrow_internal_check_status
  File "pyarrow/error.pxi", line 88, in pyarrow.lib.check_status
OSError: Generic S3 error: error decoding response body
The stack trace shows that the error is actually raised in pyarrow. I am not sure whether it is possible to tweak pyarrow's S3 behavior from deltalake.
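For anyone hitting the same error, here is a minimal sketch of how S3 client settings can be passed from deltalake through the storage_options argument of DeltaTable. The option keys timeout and connect_timeout, their values, the region, and the table URI are assumptions on my side and not something confirmed in this report, so please check them against the deltalake documentation for your version. The helper names build_storage_options and load_table are hypothetical.

```python
from typing import Dict, Optional


def build_storage_options(
    region: Optional[str] = None,
    timeout: str = "120s",
    connect_timeout: str = "30s",
) -> Dict[str, str]:
    """Assemble a storage_options dict; keys and values are plain strings."""
    # Assumed keys: "timeout" and "connect_timeout" for the HTTP client.
    options = {"timeout": timeout, "connect_timeout": connect_timeout}
    if region is not None:
        options["AWS_REGION"] = region
    return options


def load_table(table_uri: str, storage_options: Dict[str, str]):
    """Open the Delta table and materialize it as a pyarrow.Table."""
    # Imported lazily so the helper above works without deltalake installed.
    from deltalake import DeltaTable

    table = DeltaTable(table_uri, storage_options=storage_options)
    return table.to_pyarrow_table()


# Example call (hypothetical URI and region, needs S3 access):
# df = load_table("s3://my-bucket/path/to/table", build_storage_options("us-east-1"))
```

If a longer timeout makes the error go away, that would suggest the response body was cut off while a large file was being read, rather than a problem with the table itself.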
What you expected to happen:
I can get the pyarrow table.
How to reproduce it:
More details:
I have verified the integrity of this table in two ways:
Cloning the table locally and loading it from there: to_pyarrow_table() runs fine.
Reading the S3 table with duckdb (and its delta extension): this worked fine, too.
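The two checks above can be sketched roughly as follows. This is not the exact script that was run: the function names, the local path, and the S3 URI are hypothetical, and the duckdb part assumes that S3 credentials and the delta extension are available in the environment.

```python
def delta_scan_sql(table_uri: str) -> str:
    """Build a duckdb query that reads a Delta table via delta_scan."""
    # Double any single quotes so the URI stays a valid SQL string literal.
    escaped = table_uri.replace("'", "''")
    return f"SELECT * FROM delta_scan('{escaped}')"


def read_local_clone(local_path: str):
    """Check 1: load a local copy of the table with deltalake."""
    from deltalake import DeltaTable

    return DeltaTable(local_path).to_pyarrow_table()


def read_with_duckdb(table_uri: str):
    """Check 2: read the table with duckdb and its delta extension."""
    import duckdb

    con = duckdb.connect()
    con.execute("INSTALL delta")
    con.execute("LOAD delta")
    return con.execute(delta_scan_sql(table_uri)).fetch_arrow_table()


# Example calls (hypothetical locations):
# local = read_local_clone("/tmp/table_clone")
# remote = read_with_duckdb("s3://my-bucket/path/to/table")
```

Comparing the row counts of both results is a quick way to confirm that the local clone and the S3 table hold the same data.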
Environment
Delta-rs version:
deltalake==0.18.1
Binding: Python
Environment:
pyarrow==16.1.0
pyarrow-hotfix==0.6
Bug
What happened:
I tried to load a table from S3, but kept getting this error:
OSError: Generic S3 error: error decoding response body