delta-io / delta-rs

A native Rust library for Delta Lake, with bindings into Python
https://delta-io.github.io/delta-rs/
Apache License 2.0

S3 requests after the hyper s3 timeout connection duration fail over large table operations #2301

Closed · liamphmurphy closed this 2 months ago

liamphmurphy commented 5 months ago

Environment

Delta-rs version: python v0.16.0

Binding: Python

Environment:


Bug

What happened:

On three of our large Delta tables (out of 500; most of the others are a lot smaller) I've repeatedly been seeing errors related to an S3 timeout. I get two kinds of errors: one indicates a failure reading data from a parquet file; the other is much more generic. Here they are:

Failed to parse parquet: Parquet error: AsyncChunkReader::get_bytes error: Generic S3 error: Error after 10 retries in 174.881515313s, max_retries:10, retry_timeout:180s, source:error sending request for url (https://s3.us-east-1.amazonaws.com/...redacted...some-data-file.parquet): operation timed out

AND

ERROR - Error deleting events from s3a://path-to-delta-table: Failed to parse parquet: Parquet error: AsyncChunkReader::get_bytes error: Generic S3 error: request or response body error: operation timed out

This can happen in various contexts, but the prominent example in my case is needing to scan the entire Delta table for records matching a value and deleting those matches. I've added a log line right before this delete op runs, and the time delta between that log and the error is usually around 5 minutes. This time delta is relevant.

The second one in particular comes up a lot. This is on a table that gets around 250,000 rows added every hour and is partitioned by date and hour (for example, 2024-03-18 and 15).

My initial thought was that this was due to a failure to download a larger file such as a checkpoint, but those are "only" 668 MB, which is certainly not unheard of; the various AWS SDKs will gladly handle moving larger files.

Our current theory is instead that this timeout indicates the hyper client makes a request, the library then does some in-memory work (such as scanning a large Delta table) that takes a long time, and when delta-rs is ready to reach out to S3 again it exceeds some kind of keep-alive timeout and fails. This is why the 5-minute time delta I mentioned earlier is important.

This seems related to this comment: https://github.com/delta-io/delta-rs/blob/785667e5d65e3b45cf27e0a74cf297479f42d307/crates/aws/src/storage.rs#L401-L406

And finally, I think this sort of timeout is expected from the hyper client, per this (admittedly old) ticket, which describes this exact situation in another context: https://github.com/hyperium/hyper/issues/763
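
As one way to test this theory, the object_store client options can be tuned through storage_options. A minimal sketch, assuming a hypothetical table URI and that the client config keys below (timeout, connect_timeout, pool_idle_timeout) are passed through to the object_store client by the delta-rs version in use:

from deltalake import DeltaTable

# Hypothetical table URI; the option values are illustrative, not recommendations.
table_uri = "s3://my-bucket/path/to/table"

storage_options = {
    "AWS_REGION": "us-east-1",
    # Overall per-request timeout (object_store client option).
    "timeout": "300s",
    # Drop pooled connections before the server or a NAT can silently kill them.
    "pool_idle_timeout": "15s",
    # Fail fast when establishing fresh connections instead of hanging.
    "connect_timeout": "10s",
}

dt = DeltaTable(table_uri, storage_options=storage_options)
# ...then run the long-running operation that previously timed out.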

What you expected to happen:

Even on long-running operations, such as a scan over the whole table for deletion, the operation would eventually succeed, barring any other issues.

How to reproduce it:

If I am correct: perform a long-running Delta operation, such as a delete on a table with millions of rows, and have it make a final request to S3 once the in-memory work finishes.
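
A minimal reproduction sketch along those lines (the table path and predicate are hypothetical; any delete that forces a full scan of a large table should do):

from deltalake import DeltaTable

# Hypothetical path and predicate: the delete has to scan every file of a
# table with millions of rows before it talks to S3 again to commit.
dt = DeltaTable("s3://my-bucket/path/to/large-table")
dt.delete("user_id = 'value-that-matches-few-rows'")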

More details:

Note this is just a theory. I wanted to get this issue out there to see what others think; I'm going to dig in and see if I can figure out whether it's an issue with the library or something else.

ion-elgreco commented 5 months ago

@liamphmurphy I'm happy I wasn't the only one facing this. Was questioning our infra setup

liamphmurphy commented 5 months ago

@liamphmurphy I'm happy I wasn't the only one facing this. Was questioning our infra setup

Haha I'm also glad to know I'm not completely crazy!

This may be relevant: https://github.com/awslabs/aws-sdk-rust/issues/1079#issuecomment-1997990355

If this fix is related, I've noticed the hyper version delta-rs uses is quite old so it'll be a big version increase.

ion-elgreco commented 5 months ago

@liamphmurphy just to double check, do you also see the behavior on 0.15.3? That's what we use at the moment

ion-elgreco commented 5 months ago

The linked code you shared is actually not in a Python release yet. It was recently refactored by @mightyshazam to use the AWS SDKs.

liamphmurphy commented 5 months ago

@ion-elgreco Tested with 0.15.3 and yep I get that second more generic error still.

And ahh okay understood, guess that explains why changing that env var wasn't doing anything haha. I'll bring that down locally and see if that fixes it

ion-elgreco commented 5 months ago

@liamphmurphy we actually see it happen within 1.5 minutes already:

The above exception was caused by the following exception:
_internal.DeltaError: Generic DeltaTable error: External error: External error: Arrow error: External error: Parquet error: ParquetObjectReader::get_byte_ranges error: Generic S3 error: request or response body error: operation timed out

liamphmurphy commented 5 months ago

I did some debugging and found where this error occurs (at least for me)

It happens here: https://github.com/delta-io/delta-rs/blob/dadf2645712c0301d7fe1f055dee92c3b5393fc4/crates/core/src/kernel/snapshot/log_segment.rs#L316

The checkpoint_stream.next() call hangs for some time, and then eventually returns an error:

Err(Parquet { source: General("ParquetObjectReader::get_byte_ranges error: Generic S3 error: request or response body error: operation timed out") })

So it appears that it finds the right checkpoint file (which for me is around 800 MB), detects any unaccounted-for commit logs made after that checkpoint, but then fails while scanning this stream.
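
For anyone wanting to confirm whether an oversized checkpoint is in play, here is a small sketch using boto3 (bucket and prefix are hypothetical) that lists _delta_log and prints the size of each checkpoint parquet:

import boto3

s3 = boto3.client("s3")
paginator = s3.get_paginator("list_objects_v2")
prefix = "path/to/table/_delta_log/"

# Print every checkpoint file and its size in MiB.
for page in paginator.paginate(Bucket="my-bucket", Prefix=prefix):
    for obj in page.get("Contents", []):
        if ".checkpoint." in obj["Key"] and obj["Key"].endswith(".parquet"):
            print(f"{obj['Key']}: {obj['Size'] / (1024 * 1024):.0f} MiB")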

liamphmurphy commented 5 months ago

And for some additional logs in case it's helpful (the checkpoint files print line is my own), here's the last snippet of RUST_LOG=debug lines leading to the error.

checkpoint files: [ObjectMeta { location: Path { raw: "_delta_log/00000000000000108548.checkpoint.parquet" }, last_modified: 2024-03-19T13:01:23Z, size: 953533417, e_tag: Some("\"tag\""), version: None }]
[2024-03-19T14:23:36Z DEBUG hyper::client::pool] reuse idle connection for ("https", s3.us-east-1.amazonaws.com)
[2024-03-19T14:23:36Z DEBUG hyper::proto::h1::io] flushed 1575 bytes
[2024-03-19T14:23:36Z DEBUG hyper::proto::h1::io] parsed 12 headers
[2024-03-19T14:23:36Z DEBUG hyper::proto::h1::conn] incoming body is content-length (8 bytes)
[2024-03-19T14:23:36Z DEBUG hyper::proto::h1::conn] incoming body completed
[2024-03-19T14:23:36Z DEBUG hyper::client::pool] pooling idle connection for ("https", s3.us-east-1.amazonaws.com)
[2024-03-19T14:23:36Z DEBUG hyper::client::pool] reuse idle connection for ("https", s3.us-east-1.amazonaws.com)
[2024-03-19T14:23:36Z DEBUG hyper::proto::h1::io] flushed 1575 bytes
[2024-03-19T14:23:36Z DEBUG hyper::proto::h1::io] parsed 12 headers
[2024-03-19T14:23:36Z DEBUG hyper::proto::h1::conn] incoming body is content-length (86293 bytes)
[2024-03-19T14:23:36Z DEBUG hyper::proto::h1::conn] incoming body completed
[2024-03-19T14:23:36Z DEBUG hyper::client::pool] pooling idle connection for ("https", s3.us-east-1.amazonaws.com)
[2024-03-19T14:23:36Z DEBUG hyper::client::pool] reuse idle connection for ("https", s3.us-east-1.amazonaws.com)
[2024-03-19T14:23:36Z DEBUG hyper::proto::h1::io] flushed 1567 bytes
[2024-03-19T14:23:36Z DEBUG hyper::proto::h1::io] parsed 12 headers
[2024-03-19T14:23:36Z DEBUG hyper::proto::h1::conn] incoming body is content-length (953409770 bytes)
[2024-03-19T14:23:58Z DEBUG hyper::client::client] client connection error: connection closed before message completed

ion-elgreco commented 5 months ago

@liamphmurphy very useful! So for you it's reading these huge checkpoints, then.

I think in our case I've figured out what's happening: sometimes our incoming data has 0 rows, which was causing the merge to scan the whole table. I'll create a separate issue for that, but there is crossover with the connection issues.

liamphmurphy commented 5 months ago

@ion-elgreco Yeah, appears to be. I presume a stream is used here to accommodate large checkpoints like this without loading them entirely into memory, but then the hyper client connection closes before it can finish. 🤔

thomasfrederikhoeck commented 5 months ago

@ion-elgreco we're seeing a similar issue on Azure. When we run the following code it fails with _internal.DeltaError: Failed to parse parquet: Parquet error: Z-order failed while scanning data: ArrowError(ExternalError(General("ParquetObjectReader::get_byte_ranges error: Generic MicrosoftAzure error: request or response body error: operation timed out")), None)

import os
os.environ["RUST_LOG"]="debug"

from deltalake import DeltaTable

blob_path = "az://<redacted path>"
storage_options = {"AZURE_STORAGE_ACCOUNT_NAME": "<redacted sa>", "AZURE_CONTAINER_NAME":'<redacted container>', 'use_azure_cli': 'true'}

dt = DeltaTable(blob_path, storage_options=storage_options)
dt.optimize.z_order(["StatusDateTime"])

The table consists of 12 parquet files of roughly 400 MB each.

You can see the full error log here: https://gist.githubusercontent.com/thomasfrederikhoeck/8db799b3fe4fbc2414626a97da9249a9/raw/658e6cbcf090fba424e7eafae99eea30c4ae7231/txt

ion-elgreco commented 5 months ago

@thomasfrederikhoeck can you try this? I believe the default is quite low:

storage_options = {"timeout": "120s"}
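
Applied to the Azure snippet above, that would look roughly like this (same placeholder values as before; timeout is an object_store client option that delta-rs passes through storage_options):

from deltalake import DeltaTable

blob_path = "az://<redacted path>"
storage_options = {
    "AZURE_STORAGE_ACCOUNT_NAME": "<redacted sa>",
    "AZURE_CONTAINER_NAME": "<redacted container>",
    "use_azure_cli": "true",
    # Raise the per-request timeout from the default.
    "timeout": "120s",
}

dt = DeltaTable(blob_path, storage_options=storage_options)
dt.optimize.z_order(["StatusDateTime"])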

thomasfrederikhoeck commented 5 months ago

@ion-elgreco I just ran it with "timeout": "120s". It ran for much longer (15+ minutes) but failed with the following error. I noticed that after a few minutes the network throughput dropped to zero, indicating that nothing was happening.

DeltaError: Failed to parse parquet: Parquet error: Z-order failed while scanning data: ArrowError(ExternalError(General("ParquetObjectReader::get_byte_ranges error: Generic MicrosoftAzure error: Error after 0 retries in 86.1024429s, max_retries:10, retry_timeout:180s, source:error sending request for url (https://<redacted sa>.blob.core.windows.net/<redacted path>/DWCreatedDate=2024-02-04%2001:17:58.000000/Year=2019/part-00001-560c8d94-ed4e-4047-b268-ec11ab58100e-c000.snappy.parquet): connection error: An existing connection was forcibly closed by the remote host. (os error 10054)")), None)

The logs go like this:

[2024-04-03T15:03:08Z DEBUG hyper::proto::h1::io] flushed 10960 bytes
[2024-04-03T15:03:08Z DEBUG datafusion_physical_plan::sorts::sort] Spilling sort data of ExternalSorter to disk whilst inserting
[2024-04-03T15:03:08Z DEBUG hyper::client::client] client connection error: connection error: An existing connection was forcibly closed by the remote host. (os error 10054)
[2024-04-03T15:03:17Z DEBUG rustls::common_state] Sending warning alert CloseNotify
[2024-04-03T15:03:37Z DEBUG rustls::common_state] Sending warning alert CloseNotify
[2024-04-03T15:03:38Z DEBUG rustls::common_state] Sending warning alert CloseNotify
[2024-04-03T15:03:55Z DEBUG rustls::common_state] Sending warning alert CloseNotify
[2024-04-03T15:03:56Z DEBUG hyper::proto::h1::conn] incoming body decode error: An existing connection was forcibly closed by the remote host. (os error 
10054)
[2024-04-03T15:03:56Z DEBUG rustls::common_state] Sending warning alert CloseNotify
[2024-04-03T15:03:56Z DEBUG hyper::proto::h1::conn] error shutting down IO: An existing connection was forcibly closed by the remote host. (os error 10054)
[2024-04-03T15:03:56Z DEBUG hyper::client::client] client connection error: error shutting down connection: An existing connection was forcibly closed by the remote host. (os error 10054)
[2024-04-03T15:04:21Z DEBUG hyper::proto::h1::conn] incoming body decode error: An existing connection was forcibly closed by the remote host. (os error 
10054)
[2024-04-03T15:04:21Z DEBUG rustls::common_state] Sending warning alert CloseNotify
[2024-04-03T15:04:21Z DEBUG hyper::proto::h1::conn] error shutting down IO: An existing connection was forcibly closed by the remote host. (os error 10054)
[2024-04-03T15:04:21Z DEBUG hyper::client::client] client connection error: error shutting down connection: An existing connection was forcibly closed by the remote host. (os error 10054)
[2024-04-03T15:04:22Z DEBUG hyper::proto::h1::conn] incoming body decode error: An existing connection was forcibly closed by the remote host. (os error 
10054)
[2024-04-03T15:04:22Z DEBUG rustls::common_state] Sending warning alert CloseNotify
[2024-04-03T15:04:22Z DEBUG hyper::proto::h1::conn] error shutting down IO: An existing connection was forcibly closed by the remote host. (os error 10054)
[2024-04-03T15:04:22Z DEBUG hyper::client::client] client connection error: error shutting down connection: An existing connection was forcibly closed by the remote host. (os error 10054)
[2024-04-03T15:04:39Z DEBUG hyper::proto::h1::conn] incoming body decode error: An existing connection was forcibly closed by the remote host. (os error 
10054)
[2024-04-03T15:04:39Z DEBUG rustls::common_state] Sending warning alert CloseNotify
[2024-04-03T15:04:39Z DEBUG hyper::proto::h1::conn] error shutting down IO: An existing connection was forcibly closed by the remote host. (os error 10054)
[2024-04-03T15:04:39Z DEBUG hyper::client::client] client connection error: error shutting down connection: An existing connection was forcibly closed by the remote host. (os error 10054)
[2024-04-03T15:04:39Z DEBUG hyper::proto::h1::conn] incoming body decode error: An existing connection was forcibly closed by the remote host. (os error 
10054)
[2024-04-03T15:04:39Z DEBUG rustls::common_state] Sending warning alert CloseNotify
[2024-04-03T15:04:39Z DEBUG hyper::proto::h1::conn] error shutting down IO: An existing connection was forcibly closed by the remote host. (os error 10054)
[2024-04-03T15:04:39Z DEBUG hyper::client::client] client connection error: error shutting down connection: An existing connection was forcibly closed by the remote host. (os error 10054)
[2024-04-03T15:04:56Z DEBUG hyper::proto::h1::conn] incoming body decode error: An existing connection was forcibly closed by the remote host. (os error 
10054)
[2024-04-03T15:04:56Z DEBUG rustls::common_state] Sending warning alert CloseNotify
[2024-04-03T15:04:56Z DEBUG hyper::proto::h1::conn] error shutting down IO: An existing connection was forcibly closed by the remote host. (os error 10054)
[2024-04-03T15:04:56Z DEBUG hyper::client::client] client connection error: error shutting down connection: An existing connection was forcibly closed by the remote host. (os error 10054)
[2024-04-03T15:05:10Z DEBUG hyper::proto::h1::conn] incoming body decode error: An existing connection was forcibly closed by the remote host. (os error 
10054)
[2024-04-03T15:05:10Z DEBUG rustls::common_state] Sending warning alert CloseNotify
[2024-04-03T15:05:10Z DEBUG hyper::proto::h1::conn] error shutting down IO: An existing connection was forcibly closed by the remote host. (os error 10054)
[2024-04-03T15:05:10Z DEBUG hyper::client::client] client connection error: error shutting down connection: An existing connection was forcibly closed by the remote host. (os error 10054)
[2024-04-03T15:05:10Z DEBUG hyper::proto::h1::conn] incoming body decode error: An existing connection was forcibly closed by the remote host. (os error 
10054)
[2024-04-03T15:05:10Z DEBUG rustls::common_state] Sending warning alert CloseNotify
[2024-04-03T15:05:10Z DEBUG hyper::proto::h1::conn] error shutting down IO: An existing connection was forcibly closed by the remote host. (os error 10054)
[2024-04-03T15:05:10Z DEBUG hyper::client::client] client connection error: error shutting down connection: An existing connection was forcibly closed by the remote host. (os error 10054)
[2024-04-03T15:05:10Z DEBUG hyper::proto::h1::conn] incoming body decode error: An existing connection was forcibly closed by the remote host. (os error 
10054)
[2024-04-03T15:05:10Z DEBUG rustls::common_state] Sending warning alert CloseNotify
[2024-04-03T15:05:10Z DEBUG hyper::proto::h1::conn] error shutting down IO: An existing connection was forcibly closed by the remote host. (os error 10054)
[2024-04-03T15:05:11Z DEBUG hyper::client::client] client connection error: error shutting down connection: An existing connection was forcibly closed by the remote host. (os error 10054)
[2024-04-03T15:05:21Z DEBUG hyper::proto::h1::conn] incoming body decode error: An existing connection was forcibly closed by the remote host. (os error 
10054)
[2024-04-03T15:05:21Z DEBUG rustls::common_state] Sending warning alert CloseNotify
[2024-04-03T15:05:21Z DEBUG hyper::proto::h1::conn] error shutting down IO: An existing connection was forcibly closed by the remote host. (os error 10054)
[2024-04-03T15:05:21Z DEBUG hyper::client::client] client connection error: error shutting down connection: An existing connection was forcibly closed by the remote host. (os error 10054)
[2024-04-03T15:05:21Z DEBUG hyper::proto::h1::conn] incoming body decode error: An existing connection was forcibly closed by the remote host. (os error 
10054)
[2024-04-03T15:05:21Z DEBUG rustls::common_state] Sending warning alert CloseNotify
[2024-04-03T15:05:21Z DEBUG hyper::proto::h1::conn] error shutting down IO: An existing connection was forcibly closed by the remote host. (os error 10054)
[2024-04-03T15:05:21Z DEBUG hyper::client::client] client connection error: error shutting down connection: An existing connection was forcibly closed by the remote host. (os error 10054)
[2024-04-03T15:05:21Z DEBUG hyper::proto::h1::conn] incoming body decode error: An existing connection was forcibly closed by the remote host. (os error 10054)
[2024-04-03T15:05:21Z DEBUG rustls::common_state] Sending warning alert CloseNotify
[2024-04-03T15:05:21Z DEBUG hyper::proto::h1::conn] error shutting down IO: An existing connection was forcibly closed by the remote host. (os error 10054)
[2024-04-03T15:05:21Z DEBUG hyper::client::client] client connection error: error shutting down connection: An existing connection was forcibly closed by the remote host. (os error 10054)
[2024-04-03T15:05:21Z DEBUG hyper::proto::h1::conn] incoming body decode error: An existing connection was forcibly closed by the remote host. (os error 10054)
[2024-04-03T15:05:21Z DEBUG rustls::common_state] Sending warning alert CloseNotify
[2024-04-03T15:05:21Z DEBUG hyper::proto::h1::conn] error shutting down IO: An existing connection was forcibly closed by the remote host. (os error 10054)
[2024-04-03T15:05:23Z DEBUG hyper::client::client] client connection error: error shutting down connection: An existing connection was forcibly closed by the remote host. (os error 10054)

ion-elgreco commented 5 months ago

@liamphmurphy @thomasfrederikhoeck can you try this branch? https://github.com/delta-io/delta-rs/pull/2378

Likely not the issue but worthwhile to try

@thomasfrederikhoeck on your side, can you check the logs of your object store? Your error message indicates the connection was closed from that side.

thomasfrederikhoeck commented 5 months ago

@ion-elgreco just tried #2378. I still get the error with the following logs. Could this be related to something like https://github.com/apache/arrow-rs/issues/5366 and https://github.com/pola-rs/polars/issues/14384#issuecomment-1999172894?

With regards to the connection closing, it only happens when I set the timeout to 120s. I'm not 100% sure how to get logs out of the Azure Storage Account.


[2024-04-03T20:40:03Z DEBUG hyper::proto::h1::conn] incoming body is content-length (948749 bytes)
[2024-04-03T20:40:03Z DEBUG hyper::proto::h1::conn] incoming body completed
[2024-04-03T20:40:03Z DEBUG hyper::client::pool] pooling idle connection for ("https", <redacted sa>.blob.core.windows.net)
[2024-04-03T20:40:03Z DEBUG hyper::proto::h1::conn] incoming body completed
[2024-04-03T20:40:03Z DEBUG hyper::client::pool] pooling idle connection for ("https", <redacted sa>.blob.core.windows.net)
[2024-04-03T20:40:03Z DEBUG hyper::proto::h1::conn] incoming body completed
[2024-04-03T20:40:03Z DEBUG hyper::client::pool] pooling idle connection for ("https", <redacted sa>.blob.core.windows.net)
[2024-04-03T20:40:03Z DEBUG hyper::proto::h1::conn] incoming body completed
[2024-04-03T20:40:03Z DEBUG hyper::client::pool] pooling idle connection for ("https", <redacted sa>.blob.core.windows.net)
[2024-04-03T20:40:03Z DEBUG hyper::proto::h1::conn] incoming body completed
[2024-04-03T20:40:03Z DEBUG hyper::client::pool] pooling idle connection for ("https", <redacted sa>.blob.core.windows.net)
[2024-04-03T20:40:04Z DEBUG rustls::common_state] Sending warning alert CloseNotify
[2024-04-03T20:40:04Z DEBUG rustls::common_state] Sending warning alert CloseNotify
[2024-04-03T20:40:04Z DEBUG hyper::proto::h1::conn] incoming body completed
[2024-04-03T20:40:04Z DEBUG hyper::client::pool] pooling idle connection for ("https", <redacted sa>.blob.core.windows.net)
[2024-04-03T20:40:05Z DEBUG hyper::proto::h1::conn] incoming body completed
[2024-04-03T20:40:05Z DEBUG hyper::client::pool] pooling idle connection for ("https", <redacted sa>.blob.core.windows.net)
[2024-04-03T20:40:06Z DEBUG rustls::common_state] Sending warning alert CloseNotify
[2024-04-03T20:40:06Z DEBUG hyper::proto::h1::io] flushed 10958 bytes
[2024-04-03T20:40:07Z DEBUG hyper::proto::h1::io] parsed 21 headers
[2024-04-03T20:40:07Z DEBUG hyper::proto::h1::conn] incoming body is content-length (883496 bytes)
[2024-04-03T20:40:08Z DEBUG rustls::common_state] Sending warning alert CloseNotify
[2024-04-03T20:40:08Z DEBUG rustls::common_state] Sending warning alert CloseNotify
[2024-04-03T20:40:09Z DEBUG rustls::common_state] Sending warning alert CloseNotify
[2024-04-03T20:40:09Z DEBUG rustls::common_state] Sending warning alert CloseNotify
[2024-04-03T20:40:10Z DEBUG rustls::common_state] Sending warning alert CloseNotify
[2024-04-03T20:40:11Z DEBUG rustls::common_state] Sending warning alert CloseNotify
[2024-04-03T20:40:12Z DEBUG rustls::common_state] Sending warning alert CloseNotify
[2024-04-03T20:40:13Z DEBUG rustls::common_state] Sending warning alert CloseNotify
[2024-04-03T20:40:14Z DEBUG rustls::common_state] Sending warning alert CloseNotify
[2024-04-03T20:40:15Z DEBUG rustls::common_state] Sending warning alert CloseNotify
[2024-04-03T20:40:18Z DEBUG rustls::common_state] Sending warning alert CloseNotify
[2024-04-03T20:40:19Z DEBUG rustls::common_state] Sending warning alert CloseNotify
[2024-04-03T20:40:19Z DEBUG hyper::proto::h1::conn] incoming body completed
[2024-04-03T20:40:19Z DEBUG hyper::client::pool] pooling idle connection for ("https", <redacted sa>.blob.core.windows.net)
[2024-04-03T20:40:21Z DEBUG rustls::common_state] Sending warning alert CloseNotify
[2024-04-03T20:40:22Z DEBUG rustls::common_state] Sending warning alert CloseNotify
[2024-04-03T20:40:22Z DEBUG rustls::common_state] Sending warning alert CloseNotify
[2024-04-03T20:40:23Z DEBUG rustls::common_state] Sending warning alert CloseNotify
[2024-04-03T20:40:24Z DEBUG rustls::common_state] Sending warning alert CloseNotify
[2024-04-03T20:40:26Z DEBUG hyper::client::pool] reuse idle connection for ("https", <redacted sa>.blob.core.windows.net)
[2024-04-03T20:40:27Z DEBUG rustls::common_state] Sending warning alert CloseNotify
[2024-04-03T20:40:28Z DEBUG rustls::common_state] Sending warning alert CloseNotify
[2024-04-03T20:40:29Z DEBUG rustls::common_state] Sending warning alert CloseNotify
[2024-04-03T20:40:30Z DEBUG rustls::common_state] Sending warning alert CloseNotify
[2024-04-03T20:40:34Z DEBUG hyper::proto::h1::conn] incoming body completed
[2024-04-03T20:40:34Z DEBUG hyper::client::pool] pooling idle connection for ("https", <redacted sa>.blob.core.windows.net)
[2024-04-03T20:40:34Z DEBUG rustls::common_state] Sending warning alert CloseNotify
[2024-04-03T20:40:34Z DEBUG rustls::common_state] Sending warning alert CloseNotify
[2024-04-03T20:40:35Z DEBUG rustls::common_state] Sending warning alert CloseNotify
[2024-04-03T20:40:35Z DEBUG hyper::proto::h1::conn] incoming body completed
[2024-04-03T20:40:36Z DEBUG rustls::common_state] Sending warning alert CloseNotify
[2024-04-03T20:40:38Z DEBUG rustls::common_state] Sending warning alert CloseNotify
[2024-04-03T20:40:40Z DEBUG rustls::common_state] Sending warning alert CloseNotify
[2024-04-03T20:40:40Z DEBUG rustls::common_state] Sending warning alert CloseNotify
[2024-04-03T20:40:52Z DEBUG reqwest::connect] starting new connection: https://<redacted sa>.blob.core.windows.net/
[2024-04-03T20:40:52Z INFO  object_store::client::retry] Encountered transport error backing off for 0.1 seconds, retry 1 of 10: error sending request for url (https://<redacted sa>.blob.core.windows.net/<redacted path>/DWCreatedDate=2024-02-04%2001:17:58.000000/Year=2013/part-00001-de222443-0674-41b8-bdb4-53050a65fa47-c000.snappy.parquet): error trying to connect: dns error: task 1748 was cancelled
Traceback (most recent call last):
  File "c:\projects\ion-elgreco\delta-rs\test.py", line 14, in <module>
    dt.optimize.z_order(["StatusDateTime"])
  File "C:\projects\ion-elgreco\delta-rs\python\deltalake\table.py", line 2022, in z_order
    metrics = self.table._table.z_order_optimize(
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
_internal.DeltaError: Failed to parse parquet: Parquet error: Z-order failed while scanning data: ArrowError(ExternalError(General("ParquetObjectReader::get_byte_ranges error: Generic MicrosoftAzure error: request or response body error: operation timed out")), None)

ion-elgreco commented 5 months ago

@thomasfrederikhoeck @liamphmurphy can you guys please try out this PR: https://github.com/delta-io/delta-rs/pull/2392

thomasfrederikhoeck commented 5 months ago

@ion-elgreco it did not solve it. :-/ I see the following 5 times:

[2024-04-07T19:26:24Z INFO object_store::client::retry] Encountered transport error backing off for 0.1 seconds, retry 1 of 10: error sending request for url (https://<redacted sa>.blob.core.windows.net/<redacted path>/DWCreatedDate=2024-02-04%2001:17:58.000000/Year=2013/part-00002-de222443-0674-41b8-bdb4-53050a65fa47-c000.snappy.parquet): operation timed out

ion-elgreco commented 5 months ago

@ion-elgreco it did not solve it. :-/ I see the following 5 times:

[2024-04-07T19:26:24Z INFO object_store::client::retry] Encountered transport error backing off for 0.1 seconds, retry 1 of 10: error sending request for url (https://<redacted sa>.blob.core.windows.net/<redacted path>/DWCreatedDate=2024-02-04%2001:17:58.000000/Year=2013/part-00002-de222443-0674-41b8-bdb4-53050a65fa47-c000.snappy.parquet): operation timed out

How long is your timeout set to?

thomasfrederikhoeck commented 5 months ago

@ion-elgreco That is without the timeout explicitly set. If I set the timeout to 120s it is the same, it just takes longer. I notice a pattern where CPU and RAM are used almost fully until I start seeing [2024-04-08T07:10:44Z DEBUG rustls::common_state] Sending warning alert CloseNotify, at which point RAM starts being released and CPU load drops. This happens roughly 5 minutes after the script starts. After I see this log there is no network activity. Do you have an idea of where we can add more debug logging to try to understand why and where this happens? :-)
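
One low-effort way to get more targeted output is to narrow the RUST_LOG filter to the HTTP and object-store layers, set before deltalake is imported as in the earlier snippet (filter syntax as used by env_logger; the crate names are the ones already visible in the logs above):

import os

# Narrow the debug output to the HTTP / object-store layers; must be set
# before deltalake is imported so the Rust logger picks it up.
os.environ["RUST_LOG"] = "object_store=debug,reqwest=debug,hyper=debug,rustls=debug"

from deltalake import DeltaTable  # noqa: E402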

ion-elgreco commented 5 months ago

@thomasfrederikhoeck I am not really sure what the issue could be, because your previous errors showed the connection being closed by the remote host, so that sounds more like a server-side kill of the connection.

thomasfrederikhoeck commented 5 months ago

@ion-elgreco if I use the default timeout I don't see the error about the remote closing the connection, but in both cases I see object_store reporting operation timed out.

I just tried from an Azure VM to maximize network throughput and that was successful. Is there some way to limit the concurrency, to test whether I'm overloading my network when I run from my laptop? I used to have similar issues in polars until this was merged: https://github.com/pola-rs/polars/pull/15083 (see https://github.com/pola-rs/polars/issues/14384#issuecomment-1945621765 and https://github.com/pola-rs/polars/issues/14384#issuecomment-1998628622)

I'm not familiar with how polars has opted to hook up tokio, but it is important to keep CPU-bound tasks such as decoding parquet off the threadpool used for performing IO.
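
On the concurrency question, one knob worth trying is the optimizer's task concurrency. A sketch, assuming the installed deltalake version exposes max_concurrent_tasks on z_order (placeholder paths mirror the earlier snippet):

from deltalake import DeltaTable

blob_path = "az://<redacted path>"
storage_options = {"AZURE_STORAGE_ACCOUNT_NAME": "<redacted sa>", "use_azure_cli": "true"}

dt = DeltaTable(blob_path, storage_options=storage_options)
# Hypothetical value: cap the number of concurrent rewrite tasks (the default
# is roughly the number of CPUs) to see whether throttling avoids the timeouts.
dt.optimize.z_order(["StatusDateTime"], max_concurrent_tasks=4)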

ion-elgreco commented 5 months ago

@thomasfrederikhoeck do you mind checking if you see the timeout behavior with 0.15.1?

In 0.15.2 the log replay introduced a new replay stream concept using futures; it might be the cause.

Ah, I see now it happens in Optimize for you, which makes sense I guess. The RecordBatchStream is created using a futures::stream.

Let me see if I can stretch my knowledge on async rust here..

thomasfrederikhoeck commented 5 months ago

@ion-elgreco I tried 0.15.1 but that unfortunately didn't help.

I wish I could assist, but my knowledge of Rust is very limited, async or not :-D Let me know if there is anything I can test out.

I also want to mention that when running the script above (https://github.com/delta-io/delta-rs/issues/2301#issuecomment-2034676561) I sometimes see the following in the logs:

thread 'tokio-runtime-worker' panicked at C:\Users\runneradmin\.cargo\registry\src\index.crates.io-6f17d22bba15001f\bytes-1.5.0\src\bytes.rs:261:9:
range end out of bounds: 38957 <= 32012
stack backtrace:
   0:     0x7ffa3ae120ba - BrotliDecoderVersion
   1:     0x7ffa3ae39f7b - BrotliDecoderVersion
   2:     0x7ffa3ae0d271 - BrotliDecoderVersion
   3:     0x7ffa3ae11e3a - BrotliDecoderVersion
   4:     0x7ffa3ae147da - BrotliDecoderVersion
   5:     0x7ffa3ae14448 - BrotliDecoderVersion
   6:     0x7ffa3ae14e88 - BrotliDecoderVersion
   7:     0x7ffa3ae14d6d - BrotliDecoderVersion
   8:     0x7ffa3ae12d79 - BrotliDecoderVersion
   9:     0x7ffa3ae14a80 - BrotliDecoderVersion
  10:     0x7ffa3af78617 - BrotliDecoderVersion
  11:     0x7ffa382b4d0b - PyInit__internal
  12:     0x7ffa382e4779 - PyInit__internal
  13:     0x7ffa382c07e7 - PyInit__internal
  14:     0x7ffa382aefa9 - PyInit__internal
  15:     0x7ffa3a06bffd - BrotliDecoderVersion
  16:     0x7ffa382e76a8 - PyInit__internal
  17:     0x7ffa39fdc91f - BrotliDecoderVersion
  18:     0x7ffa39d2f2e5 - bz_internal_error
  19:     0x7ffa38fb479c - PyInit__internal
  20:     0x7ffa38fb5755 - PyInit__internal
  21:     0x7ffa38fb01cf - PyInit__internal
  22:     0x7ffa38d04962 - PyInit__internal
  23:     0x7ffa38fc0e4f - PyInit__internal
  24:     0x7ffa393490c9 - bz_internal_error
  25:     0x7ffa393ad49c - bz_internal_error
  26:     0x7ffa393e4aee - bz_internal_error
  27:     0x7ffa394413ed - bz_internal_error
  28:     0x7ffa3930240a - bz_internal_error
  29:     0x7ffa392fcf09 - bz_internal_error
  30:     0x7ffa392936e9 - bz_internal_error
  31:     0x7ffa39432fbb - bz_internal_error
  32:     0x7ffa3ad9d1b0 - BrotliDecoderVersion
  33:     0x7ffa3ad9b0be - BrotliDecoderVersion
  34:     0x7ffa3ad8f84e - BrotliDecoderVersion
  35:     0x7ffa3ad9a8c3 - BrotliDecoderVersion
  36:     0x7ffa3ad818fe - BrotliDecoderVersion
  37:     0x7ffa3ada2dc0 - BrotliDecoderVersion
  38:     0x7ffa3ad8ca2c - BrotliDecoderVersion
  39:     0x7ffa3ad84f4e - BrotliDecoderVersion
  40:     0x7ffa3ad85d58 - BrotliDecoderVersion
  41:     0x7ffa3ae2312c - BrotliDecoderVersion
  42:     0x7ffb3de07344 - BaseThreadInitThunk
  43:     0x7ffb3e6026b1 - RtlUserThreadStart

thomasfrederikhoeck commented 4 months ago

@ion-elgreco I can maybe hire a consultant to look at this, if you would have the time to review a PR, given that I can't really contribute myself.

ion-elgreco commented 4 months ago

@thomasfrederikhoeck for sure, I am happy to free up time to review a PR :)

thomasfrederikhoeck commented 4 months ago

@ion-elgreco I have someone working on it now, but given that object_store is changing how it handles AsyncWrite, do you think it makes sense to continue? Might it be better to wait for https://github.com/delta-io/delta-rs/issues/2436, since it looks like a new release of the arrow crate might not be too far away: https://github.com/apache/arrow-rs/issues/5688

For reference: https://github.com/apache/arrow-rs/blob/cd3331989d65f6d56830f9ffa758b4c96d10f4be/object_store/CHANGELOG.md

ion-elgreco commented 4 months ago

@ion-elgreco I have someone working on it now, but given that object_store is changing how it handles AsyncWrite, do you think it makes sense to continue? Might it be better to wait for https://github.com/delta-io/delta-rs/issues/2436, since it looks like a new release of the arrow crate might not be too far away: https://github.com/apache/arrow-rs/issues/5688

For reference: https://github.com/apache/arrow-rs/blob/cd3331989d65f6d56830f9ffa758b4c96d10f4be/object_store/CHANGELOG.md

Yes, it's better to first upgrade the object store to use the upload trait and see how much of an impact that has. Perhaps that person can help with refactoring to those APIs.

We are however stuck because we can't upgrade beyond Datafusion 37, see here: https://github.com/delta-io/delta-rs/pull/2395 , https://github.com/apache/datafusion/issues/10181

abhiaagarwal commented 2 months ago

@thomasfrederikhoeck can you try #2581, which bumps the object_store dep? I refactored the object store wrapper on the Python side to use an internal buffer with the Upload trait that occasionally flushes; it may solve this issue.
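
For readers following along, a toy sketch of the buffering idea described here (not the actual delta-rs code): writes accumulate in memory and are flushed as parts once a threshold is crossed, so no single request has to stay open for the whole operation.

import io

class BufferedUploadSketch:
    """Illustrative only: stands in for a writer that flushes buffered
    bytes as multipart uploads instead of streaming one long request."""

    def __init__(self, flush_bytes: int = 5 * 1024 * 1024):
        self._buf = io.BytesIO()
        self._flush_bytes = flush_bytes
        self.parts: list[bytes] = []  # stand-in for uploaded parts

    def write(self, data: bytes) -> None:
        self._buf.write(data)
        if self._buf.tell() >= self._flush_bytes:
            self._flush()

    def _flush(self) -> None:
        if self._buf.tell():
            self.parts.append(self._buf.getvalue())
            self._buf = io.BytesIO()

    def close(self) -> None:
        self._flush()  # flush whatever is left, then finalize the upload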

thomasfrederikhoeck commented 2 months ago

@abhiaagarwal Nice! I will test today or tomorrow :-)

abhiaagarwal commented 2 months ago

@thomasfrederikhoeck it's merged, try main :)

ion-elgreco commented 2 months ago

@abhiaagarwal it's actually already released 😉

thomasfrederikhoeck commented 2 months ago

@abhiaagarwal @ion-elgreco It looks like it has solved it (I'm not getting any timeouts). NICE! However, instead I got the following:

_internal.DeltaError: Failed to parse parquet: Parquet error: Z-order failed while scanning data: ParquetError(General("AsyncChunkReader::get_bytes error: Generic MicrosoftAzure error: error decoding response body"))

I will create a new issue :-)

Edit: #2592

abhiaagarwal commented 2 months ago

Hmm, @liamphmurphy can you check if this has been solved on the AWS side? The error you're getting, @thomasfrederikhoeck, is the exact same one that was reported above; the timeouts seem to be solved.

Digging a bit deeper, the error is originating from here, where it's performing an HTTP range request to get the parquet metadata without downloading the whole file (DuckDB does something similar). I wonder if the error still lies with object_store's handling of HTTP range requests?
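
For context, a rough illustration (not delta-rs code; bucket and key are hypothetical) of what such a range request does: the reader fetches the last 8 bytes of the parquet file, which hold the footer length and the PAR1 magic, and then fetches just the footer.

import boto3

s3 = boto3.client("s3")
bucket, key = "my-bucket", "path/to/table/part-00001.snappy.parquet"

# The last 8 bytes are a 4-byte little-endian footer length plus the "PAR1" magic.
tail = s3.get_object(Bucket=bucket, Key=key, Range="bytes=-8")["Body"].read()
footer_len = int.from_bytes(tail[:4], "little")
assert tail[4:] == b"PAR1"

# Second suffix-range request: the serialized footer plus the trailing 8 bytes.
footer = s3.get_object(Bucket=bucket, Key=key, Range=f"bytes=-{footer_len + 8}")["Body"].read()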

ion-elgreco commented 2 months ago

@abhiaagarwal you didn't update the Rust writer IIRC?

abhiaagarwal commented 2 months ago

@ion-elgreco yep, but the failure has to do with a read. Perhaps there's a path where it reads the parquet metadata before it performs the delete operation?

thomasfrederikhoeck commented 2 months ago

Hmm, @liamphmurphy can you check if this has been solved on the AWS side? The error you're getting, @thomasfrederikhoeck, is the exact same one that was reported above; the timeouts seem to be solved.

Digging a bit deeper, the error is originating from here, where it's performing an HTTP range request to get the parquet metadata without downloading the whole file (DuckDB does something similar). I wonder if the error still lies with object_store's handling of HTTP range requests?

@abhiaagarwal the new error I'm getting now is not the one reported above, though. Previously I was getting the timeouts reported above, but your PR has fixed this (I tried a few times and didn't see it a single time). So I believe #2592 is new :-)

ion-elgreco commented 2 months ago

@ion-elgreco yep, but the failure has to do with a read. Perhaps there's a path where it reads the parquet metadata before it performs the delete operation?

It might be related to reading the checkpoint; we also often scan the parquet files.

abhiaagarwal commented 2 months ago

@thomasfrederikhoeck yeah, I'm just curious to see whether the same error is observed on the AWS side. The error here seems to be "the internal parquet reader can't decode the metadata returned by the HTTP range request" (where it previously would time out even trying to execute the range request), which could point to an implementation flaw in the upstream arrow dependency or maybe the Azure object store implementation.

kszlim commented 3 weeks ago

I'm curious if there's a PR that closed this? Was it the object store bump?