delta-io / delta-rs

A native Rust library for Delta Lake, with bindings into Python
https://delta-io.github.io/delta-rs/
Apache License 2.0
2.3k stars, 404 forks

Utilize Amazon S3 conditional writes to support concurrent writes #2843

Open Cpaulyz opened 2 months ago

Cpaulyz commented 2 months ago

Description

Hi, I noticed that Amazon S3 now supports conditional writes (https://aws.amazon.com/about-aws/whats-new/2024/08/amazon-s3-conditional-writes/). Would it be possible to use this feature in place of dynamodb-lock, so that concurrent writes are supported natively?
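For readers unfamiliar with the feature, here is a minimal sketch of the put-if-absent semantics that a conditional write provides (on S3, a PUT sent with `If-None-Match: *` is rejected with HTTP 412 when the key already exists). The `InMemoryStore` class and the `commit_key` helper are hypothetical stand-ins for illustration only; they are not delta-rs or AWS APIs, and no real S3 call is made.

```python
class PreconditionFailed(Exception):
    """Raised when the key already exists (S3 would answer HTTP 412)."""


class InMemoryStore:
    """Hypothetical stand-in for a bucket that supports put-if-absent."""

    def __init__(self):
        self._objects = {}

    def put_if_absent(self, key: str, body: bytes) -> None:
        # Models a PUT with `If-None-Match: *`: succeed only if the key is new.
        if key in self._objects:
            raise PreconditionFailed(key)
        self._objects[key] = body

    def get(self, key: str) -> bytes:
        return self._objects[key]


def commit_key(version: int) -> str:
    # Delta commit files are named by a zero-padded, 20-digit version number.
    return f"_delta_log/{version:020d}.json"


store = InMemoryStore()

# Writer A creates commit 1 first.
store.put_if_absent(commit_key(1), b"writer A")

# Writer B attempts the same version and is rejected instead of overwriting.
try:
    store.put_if_absent(commit_key(1), b"writer B")
    b_won = True
except PreconditionFailed:
    b_won = False
```

Because the store itself refuses the second write, two writers cannot both claim the same commit version, which is the guarantee an external lock is otherwise used to provide.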

thomasfrederikhoeck commented 1 month ago

I guess the following upstream issue needs to be resolved first: https://github.com/apache/arrow-rs/issues/6285

danielgafni commented 1 month ago

Just to confirm, this won't allow parallel writing (for example, writing 100 partitions at once), but will remove the dependency on DynamoDB as locking mechanism, right?

Cpaulyz commented 3 weeks ago

> Just to confirm, this won't allow parallel writing (for example, writing 100 partitions at once), but will remove the dependency on DynamoDB as locking mechanism, right?

Yes. I think so.

rtyler commented 3 weeks ago

If you're up for doing some experimentation (on non-production workloads), I believe that our conditional put support will "just work" for S3.
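As a rough idea of what such an experiment might check, the sketch below races several writers against an in-memory stand-in and verifies that every commit version ends up with exactly one owner. Everything here is an assumption made for illustration: `InMemoryStore` and `commit_with_retry` are hypothetical names, the internal `threading.Lock` only models the atomicity the storage service would provide, and a real writer would re-read the log and check for logical conflicts before retrying. A real test would point actual writers at a non-production bucket instead.

```python
import threading


class PreconditionFailed(Exception):
    """Raised when the key already exists."""


class InMemoryStore:
    """Hypothetical stand-in for a bucket with atomic put-if-absent."""

    def __init__(self):
        self._objects = {}
        # Models the service-side atomicity of a conditional write.
        self._guard = threading.Lock()

    def put_if_absent(self, key: str, body: bytes) -> None:
        with self._guard:
            if key in self._objects:
                raise PreconditionFailed(key)
            self._objects[key] = body

    def keys(self):
        with self._guard:
            return sorted(self._objects)


def commit_with_retry(store, writer_id: str, max_attempts: int = 100) -> int:
    """Try successive versions until one put-if-absent succeeds."""
    version = 0
    for _ in range(max_attempts):
        try:
            store.put_if_absent(f"_delta_log/{version:020d}.json", writer_id.encode())
            return version
        except PreconditionFailed:
            # Another writer owns this version; a real writer would re-read
            # the log and check for conflicts here before moving on.
            version += 1
    raise RuntimeError(f"{writer_id} gave up after {max_attempts} attempts")


store = InMemoryStore()
results = {}


def run(writer_id: str) -> None:
    results[writer_id] = commit_with_retry(store, writer_id)


threads = [threading.Thread(target=run, args=(f"writer-{i}",)) for i in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

If the conditional put behaves as expected, the eight writers land on eight distinct, consecutive versions and no commit file is overwritten.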