delta-io / delta-rs

A native Rust library for Delta Lake, with bindings into Python
https://delta-io.github.io/delta-rs/
Apache License 2.0
1.97k stars, 365 forks

AsyncChunkReader::get_bytes error: Generic MicrosoftAzure error: error decoding response body #2592

Open thomasfrederikhoeck opened 2 weeks ago

thomasfrederikhoeck commented 2 weeks ago

Environment

Delta-rs version: 0.18.1

Binding: Python

Environment:


Bug

What happened: After 0.18.1 was released, it fixed the initial issue from #2301 for me, but I started hitting this instead. The Z-order operation starts and I can see network, CPU, and memory usage, but after roughly 30 seconds I get the error below. The Rust logs don't show anything strange:

metrics = self.table._table.z_order_optimize(
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
_internal.DeltaError: Failed to parse parquet: Parquet error: Z-order failed while scanning data: ParquetError(General("AsyncChunkReader::get_bytes error: Generic MicrosoftAzure error: error decoding response body"))

What you expected to happen: That the Z-order completes.

How to reproduce it:

import os

# Enable debug logging for the Rust layer before importing deltalake.
os.environ["RUST_LOG"] = "debug"

from deltalake import DeltaTable

blob_path = "az://<redacted path>"
storage_options = {
    "AZURE_STORAGE_ACCOUNT_NAME": "<redacted sa>",
    "AZURE_CONTAINER_NAME": "<redacted container>",
    "use_azure_cli": "true",
}

dt = DeltaTable(blob_path, storage_options=storage_options)
dt.optimize.z_order(["StatusDateTime"])

More details:

abhiaagarwal commented 2 weeks ago

I'm reasonably confident the error is originating from here, based on my reading of the various error messages:

https://github.com/delta-io/delta-rs/blob/f0416921a3814a33ea1b3796a2a1468f8c76ca3d/crates/core/src/operations/optimize.rs#L500-L511

Since it's run in a blocking context on the Python side, I'm wondering if that's causing any weirdness (it shouldn't).
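One hypothesis that is not confirmed anywhere in this thread: "error decoding response body" can appear when an HTTP response body is cut off mid-stream, for example by a client timeout while a large Parquet file is being read. A minimal sketch for testing that idea is to add longer client timeouts to the storage options. The helper name `with_client_timeouts` is hypothetical, and the `timeout` and `connect_timeout` keys are an assumption about what the installed delta-rs version forwards to its underlying HTTP client, so check them against your version's documentation.

```python
def with_client_timeouts(storage_options, timeout="120s", connect_timeout="30s"):
    """Return a copy of storage_options with client timeout keys added.

    Assumption: the "timeout" and "connect_timeout" keys are honoured by the
    storage backend. Keys already set by the caller are left untouched.
    """
    merged = dict(storage_options)  # copy, so the caller's dict is not mutated
    merged.setdefault("timeout", timeout)
    merged.setdefault("connect_timeout", connect_timeout)
    return merged


# Placeholder values standing in for the redacted ones in the report.
base_options = {
    "AZURE_STORAGE_ACCOUNT_NAME": "<redacted sa>",
    "AZURE_CONTAINER_NAME": "<redacted container>",
    "use_azure_cli": "true",
}
patched_options = with_client_timeouts(base_options)
```

The resulting `patched_options` would be passed as `storage_options` when constructing the `DeltaTable` in the reproduction script. If the failure disappears with longer timeouts, that points at slow reads rather than at the blocking context.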

thomasfrederikhoeck commented 2 weeks ago

@abhiaagarwal I wish I could assist, but my Rust knowledge is very limited. Let me know if there is anything I should test.
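One thing that could be tested from the Python side without any Rust knowledge is whether the failure is transient. The sketch below retries an operation only when the error message matches the one in this report. The helper `run_with_retries` and the stand-in `flaky_operation` are hypothetical names used for illustration, not part of deltalake.

```python
import time


def run_with_retries(operation, attempts=3, delay_seconds=0.0, is_transient=None):
    """Call operation(), retrying when the raised error looks transient."""
    if is_transient is None:
        # Default: match the message seen in this issue.
        def is_transient(exc):
            return "error decoding response body" in str(exc)

    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except Exception as exc:
            # Re-raise unrelated errors at once, and give up on the last attempt.
            if not is_transient(exc) or attempt == attempts:
                raise
            print(f"attempt {attempt} failed: {exc}; retrying")
            time.sleep(delay_seconds)


# Stand-in for the real call: fails twice with the reported message, then succeeds.
calls = {"count": 0}


def flaky_operation():
    calls["count"] += 1
    if calls["count"] < 3:
        raise RuntimeError("Generic MicrosoftAzure error: error decoding response body")
    return "ok"


result = run_with_retries(flaky_operation, attempts=3)
```

Against the table from the report, the call would look like `run_with_retries(lambda: dt.optimize.z_order(["StatusDateTime"], max_concurrent_tasks=1), delay_seconds=5)`. The `max_concurrent_tasks` argument is believed to exist on `z_order`, but verify it against the installed version. If the operation succeeds on a retry or with lower concurrency, that suggests a load-related or timing-related cause rather than a deterministic bug.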