delta-io / delta-rs

A native Rust library for Delta Lake, with bindings into Python
https://delta-io.github.io/delta-rs/
Apache License 2.0
2.35k stars 414 forks source link

feat: override dynamodb config #3011

Closed thomas-chauvet closed 4 days ago

thomas-chauvet commented 6 days ago

Description

This PR add the possibility to override dynamoDB configuration via storage_options:

It solves cases where we use an S3-compatible storage without mutual exclusion but we want to use DynamoDB anyway as a locking provider. It also solves the case where we want to pass different credentials or region for S3 and dynamoDB.

Related Issue(s)

Documentation

Documentation has been updated to explain how to override configuration.

thomas-chauvet commented 5 days ago

@rtyler thanks for your quick review! I've just fixed a small equality in test. I've made a workaround before and I've left it in the code. It is now properly fixed. Thanks!

codecov[bot] commented 5 days ago

Codecov Report

Attention: Patch coverage is 86.61417% with 17 lines in your changes missing coverage. Please review.

Project coverage is 72.42%. Comparing base (0d076a7) to head (2d1b6f6). Report is 2 commits behind head on main.

Files with missing lines Patch % Lines
crates/aws/src/lib.rs 87.37% 12 Missing and 1 partial :warning:
crates/aws/src/logstore/dynamodb_logstore.rs 0.00% 4 Missing :warning:
Additional details and impacted files ```diff @@ Coverage Diff @@ ## main #3011 +/- ## ========================================== - Coverage 72.43% 72.42% -0.02% ========================================== Files 128 128 Lines 40859 40974 +115 Branches 40859 40974 +115 ========================================== + Hits 29597 29675 +78 - Misses 9357 9374 +17 - Partials 1905 1925 +20 ```

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.


🚨 Try these New Features:

ion-elgreco commented 5 days ago

@thomas-chauvet @rtyler This might be short lived, if we can get delta-rs bumped soon to the new objectstore release. I would like us to default using putIfAbsent by default for any S3 implementation, then it's up to users who use S3 which does not support this, to set copyIfNotExists or allow unsafe_rename

thomas-chauvet commented 5 days ago

Does this mean that locking provider will be definitely remove from delta-rs?

ion-elgreco commented 5 days ago

Does this mean that locking provider will be definitely remove from delta-rs?

Not per se, but the value of a locking provider is quite minimal unless you still use an S3 implementation like LakeFS that doesn't have these request headers yet

rtyler commented 5 days ago

@ion-elgreco I don't expect the DynamoDB log store code to go away for at least 3-6 months, even if we had the putIfAbsent support today, I would want to keep the support around because existing deployments using DynamoDB will not all be ready to migrate 100%.

I see value in adding these parameters until such a time when all DynamoDB code is removed from delta-rs, which currently has no scheduled time.

ion-elgreco commented 5 days ago

@ion-elgreco I don't expect the DynamoDB log store code to go away for at least 3-6 months, even if we had the putIfAbsent support today, I would want to keep the support around because existing deployments using DynamoDB will not all be ready to migrate 100%.

I see value in adding these parameters until such a time when all DynamoDB code is removed from delta-rs, which currently has no scheduled time.

For sure, we should keep it for some time!

thomas-chauvet commented 2 days ago

Thanks for the merge!