developmentseed / obstore

Simple, fast integration with Amazon S3, Google Cloud Storage, Azure Storage, and S3-compliant APIs like Cloudflare R2
https://developmentseed.org/obstore
MIT License
146 stars 3 forks source link

Support s3 requester pays #76

Open kylebarron opened 2 weeks ago

kylebarron commented 2 weeks ago
image

https://docs.aws.amazon.com/AmazonS3/latest/userguide/ObjectsinRequesterPaysBuckets.html

kylebarron commented 2 weeks ago

From discord:

Hi everyone, if anyone is familiar with the object_store implementation in arrow-rs I had a question; is there a way to pass in arbitrary AWS (specifically S3) options via HTTP headers? For example, I don't see implicit support for requester pays buckets, which is hinted at by passing in this header value pair: x-amz-request-payer : requester so I'm wondering if there's a way to add in our own options like that.

Not on a per-request basis, as it sort of breaks the abstraction of being any object store, but you could look at overriding the headers using ClientOptions when you build the AmazonS3 client

kylebarron commented 2 weeks ago

I guess we have to set this header manually in ClientOptions::with_default_headers

kylebarron commented 2 weeks ago

So we need to set this in ClientOptions. But right now the only supported client options are those that we can pass in as strings. I.e. whatever gets passed to ClientOptions::with_config. But ClientConfigKey doesn't have an option for default_headers. So it's not currently possible to pass this in.

Instead, we should make ClientOptions a Python class, with its own constructor (not just created from a dict on the Rust side). Then a user can manually construct a ClientOptions class, and then pass that into an S3Store Python class to ensure we're enabling requester pays.

Or we could just add a requester_pays option to the S3Store constructor methods. That would probably be simpler for users.

We should probably do both of these two options.

alukach commented 1 week ago

Instead, we should make ClientOptions a Python class, with its own constructor (not just created from a dict on the Rust side). Then a user can manually construct a ClientOptions class, and then pass that into an S3Store Python class to ensure we're enabling requester pays.

@kylebarron Any strong reason that you suggest a traditional Python class rather than a TypedDict? For something like an all-optional body of options, a TypedDict may make sense and fit in well with what we already do with things like GetOptions

kylebarron commented 1 week ago

It's already a TypedDict essentially. It's already an all-optional body of options. (and we'll continue to allow that) But not everything can be strictly modeled as a dict of strings. We'd need it to be a pyclass to support class methods.

Here with_default_headers is a classmethod on the Rust side.

For easiest maintenance, I think the dict options should be passed directly into the underlying object_store implementation, so to support any other options on the ClientOptions rust struct it would need to be a Python pyclass.