Open kylebarron opened 2 weeks ago
Hi everyone, if anyone is familiar with the object_store implementation in arrow-rs I had a question; is there a way to pass in arbitrary AWS (specifically S3) options via HTTP headers? For example, I don't see implicit support for requester pays buckets, which is hinted at by passing in this header value pair:
x-amz-request-payer : requester
so I'm wondering if there's a way to add in our own options like that.
Not on a per-request basis, as it sort of breaks the abstraction of being any object store, but you could look at overriding the headers using ClientOptions when you build the AmazonS3 client.
I guess we have to set this header manually in ClientOptions::with_default_headers. So we need to set this in ClientOptions. But right now the only supported client options are those that we can pass in as strings, i.e. whatever gets passed to ClientOptions::with_config. But ClientConfigKey doesn't have an option for default_headers. So it's not currently possible to pass this in.
Instead, we should make ClientOptions a Python class, with its own constructor (not just created from a dict on the Rust side). Then a user can manually construct a ClientOptions class, and then pass that into an S3Store Python class to ensure we're enabling requester pays.
Or we could just add a requester_pays option to the S3Store constructor methods. That would probably be simpler for users.
We should probably do both.
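A rough sketch of how the two proposals could fit together, written in plain Python so it runs on its own. Everything here is hypothetical: the `ClientOptions` and `S3Store` classes below are stand-ins, not the actual bindings, and `effective_headers` is an invented helper used only to show that both routes end up with the same `x-amz-request-payer` header.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

# Header name and value taken from the question above.
REQUEST_PAYER_HEADER = "x-amz-request-payer"


@dataclass
class ClientOptions:
    """Hypothetical stand-in for a Python-side ClientOptions class."""

    default_headers: Dict[str, str] = field(default_factory=dict)

    def with_default_headers(self, headers: Dict[str, str]) -> "ClientOptions":
        # Return a new object so the original options are left untouched.
        return ClientOptions(default_headers={**self.default_headers, **headers})


@dataclass
class S3Store:
    """Hypothetical store that accepts both proposed options."""

    bucket: str
    client_options: Optional[ClientOptions] = None
    requester_pays: bool = False

    def effective_headers(self) -> Dict[str, str]:
        # Start from any explicitly supplied default headers.
        headers = dict(self.client_options.default_headers) if self.client_options else {}
        # The convenience flag expands to the same header, unless already set.
        if self.requester_pays:
            headers.setdefault(REQUEST_PAYER_HEADER, "requester")
        return headers


# Option 1: build ClientOptions by hand and pass it in.
explicit = S3Store(
    "my-bucket",
    client_options=ClientOptions().with_default_headers({REQUEST_PAYER_HEADER: "requester"}),
)
# Option 2: use the simpler requester_pays flag.
simple = S3Store("my-bucket", requester_pays=True)
```

In this sketch the flag is just sugar over the header, which is one way the two options could coexist without conflicting.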
Instead, we should make ClientOptions a Python class, with its own constructor (not just created from a dict on the Rust side). Then a user can manually construct a ClientOptions class, and then pass that into an S3Store Python class to ensure we're enabling requester pays.
@kylebarron Any strong reason that you suggest a traditional Python class rather than a TypedDict? For something like an all-optional body of options, a TypedDict may make sense and fit in well with what we already do with things like GetOptions
It's already a TypedDict essentially. It's already an all-optional body of options (and we'll continue to allow that). But not everything can be strictly modeled as a dict of strings. We'd need it to be a pyclass to support class methods.
Here with_default_headers is a classmethod on the Rust side.
For easiest maintenance, I think the dict options should be passed directly into the underlying object_store implementation, so to support any other options on the ClientOptions Rust struct it would need to be a Python pyclass.
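To make the TypedDict-versus-class distinction concrete, here is a minimal pure-Python sketch. It assumes a hypothetical `ClientConfig` TypedDict for the string-valued options (the key names are only illustrative) and a hypothetical `ClientOptions` class that forwards that dict unchanged while carrying the one thing a dict of strings cannot express, a header map. It is not the real pyclass, just the shape of the idea.

```python
from typing import Dict, Mapping, Optional, TypedDict


class ClientConfig(TypedDict, total=False):
    """All-optional, string-valued options; key names here are illustrative."""

    timeout: str
    allow_http: str


class ClientOptions:
    """Hypothetical class wrapper for options that do not fit a dict of strings."""

    def __init__(self, config: Optional[ClientConfig] = None) -> None:
        # The dict form is kept as-is so it can be forwarded unchanged.
        self._config: Dict[str, str] = dict(config or {})
        self._default_headers: Dict[str, str] = {}

    def with_default_headers(self, headers: Mapping[str, str]) -> "ClientOptions":
        # Builder-style method: copy the current state, then add the headers.
        new = ClientOptions()
        new._config = dict(self._config)
        new._default_headers = {**self._default_headers, **headers}
        return new

    def string_config(self) -> Dict[str, str]:
        # What would be handed to the string-keyed config path.
        return dict(self._config)

    def default_headers(self) -> Dict[str, str]:
        # What the dict-of-strings form has no slot for.
        return dict(self._default_headers)


config: ClientConfig = {"timeout": "30s"}
options = ClientOptions(config).with_default_headers({"x-amz-request-payer": "requester"})
```

The existing dict-based usage keeps working (`ClientOptions(config)` alone), and the class only adds a place for the non-string options.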
https://docs.aws.amazon.com/AmazonS3/latest/userguide/ObjectsinRequesterPaysBuckets.html