inigohidalgo closed this 1 month ago
@inigohidalgo feel free to open a PR to make a change to the docs, contributions are always welcome! :)
Sounds good. Is it safe to assume that all the backends listed here other than AWS support this by default?
I will probably just reword the write_deltalake docstring, as that is what initially tripped me up, and add a small note at the start of this page: https://delta-io.github.io/delta-rs/usage/writing/writing-to-s3-with-locking-provider/
> Is it safe to assume that all the backends listed here other than AWS support this by default?
Yup. Though IIRC, Minio, Cloudflare, and other S3-compatible stores will have the same issue, even though we actually could enable it for some of them.
@wjones127 Cloudflare R2 actually supports copy-if-not-exists via custom headers, which we are able to pass through. But that's the only exception among S3 implementations, afaik.
> Cloudflare R2 actually supports copy if not exists with custom headers, which we are able to pass through
Ah cool. I know R2 and Minio support using custom headers, but didn't know we had already implemented the proper pass-through for R2. Do we have support for Minio as well, then?
Hmm, I didn't know Minio supported custom headers. In that case I guess it should work as well (but I can't confirm).
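For readers following along, the R2 pass-through discussed above can be sketched roughly as follows. This is a minimal sketch, not a confirmed recipe: the helper name `r2_storage_options` is hypothetical, and the `copy_if_not_exists` key, its header value, and the `AWS_ENDPOINT_URL` key are assumptions about what the underlying object store layer accepts. Check them against the delta-rs docs for your version.

```python
# Hypothetical helper: builds a storage_options dict for a Cloudflare R2 bucket.
# The option names below are assumptions, not something confirmed in this thread.
def r2_storage_options(account_id: str, access_key: str, secret_key: str) -> dict:
    return {
        # R2 exposes an S3-compatible endpoint per account.
        "AWS_ENDPOINT_URL": f"https://{account_id}.r2.cloudflarestorage.com",
        "AWS_ACCESS_KEY_ID": access_key,
        "AWS_SECRET_ACCESS_KEY": secret_key,
        # Assumed option: ask the store to send R2's conditional-copy header,
        # so a commit fails instead of overwriting an existing log entry.
        "copy_if_not_exists": "header: cf-copy-destination-if-none-match: *",
    }


opts = r2_storage_options("my-account-id", "my-key", "my-secret")

# Usage (requires the deltalake package and a real bucket, so left commented out):
# from deltalake import write_deltalake
# write_deltalake("s3://my-bucket/my-table", df, storage_options=opts)
```

Whether the same header-based approach works for Minio is unconfirmed, as noted above.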
Currently the documentation has a section explaining that a locking mechanism is needed for S3 to enable concurrent writing.
https://delta-io.github.io/delta-rs/usage/writing/writing-to-s3-with-locking-provider/
There are various mentions of concurrency throughout the docs, but having read through them a while ago, I was left with the impression that "concurrent writing is only supported on S3, and you need a DynamoDB locking mechanism" when, in reality, concurrent writing is supported by default on at least one backend (#2069) without needing that locking provider.
This is probably just my own lack of understanding of the Delta protocol, but I think it would be good to make it clearer in the documentation that concurrent writing is supported by default and only S3 needs the locking mechanism.
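To illustrate the distinction being asked for, here is a rough sketch of how the S3-specific options might look next to the default case. The helper `s3_storage_options` is hypothetical, and the option keys (`AWS_S3_LOCKING_PROVIDER`, `DELTA_DYNAMO_TABLE_NAME`, `AWS_S3_ALLOW_UNSAFE_RENAME`) and the `delta_log` default table name are taken from my reading of the locking-provider page linked above. Treat them as assumptions and verify against your delta-rs version.

```python
# Hypothetical helper: picks S3 storage_options depending on whether
# several writers may commit to the same table at once.
def s3_storage_options(concurrent_writers: bool, lock_table: str = "delta_log") -> dict:
    if concurrent_writers:
        # Assumed keys: use a DynamoDB table to serialize commits on S3.
        return {
            "AWS_S3_LOCKING_PROVIDER": "dynamodb",
            "DELTA_DYNAMO_TABLE_NAME": lock_table,
        }
    # Assumed key: single-writer setups opt out of locking explicitly.
    return {"AWS_S3_ALLOW_UNSAFE_RENAME": "true"}


safe_opts = s3_storage_options(concurrent_writers=True)
single_writer_opts = s3_storage_options(concurrent_writers=False)

# Usage (requires the deltalake package and real credentials, so left commented out):
# from deltalake import write_deltalake
# write_deltalake("s3://my-bucket/my-table", df, storage_options=safe_opts)
```

Per the discussion above, the other listed backends should not need either option for concurrent writes.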