apache / arrow-rs

Official Rust implementation of Apache Arrow
https://arrow.apache.org/
Apache License 2.0

[object-store] Make aws region optional. #5211

Closed ritchie46 closed 8 months ago

ritchie46 commented 9 months ago

We got this request upstream: https://github.com/pola-rs/polars/issues/13042

There, we raise an error because we try to infer the region. However, it turns out any region would suffice because they use a private S3 deployment.

To cater to that use case, I assume it would be better to make the region optional. For the time being, we can hack around this by just setting a region.

tustvold commented 9 months ago

The region forms part of the SigV4 signature, so we can't just ignore it. Many systems, such as Cloudflare R2, accept a region of us-east-1 or auto, but I'm not sure there is a safe default that will work everywhere...
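To illustrate why the region cannot simply be dropped, here is a minimal sketch of the SigV4 credential scope, which is part of the string that gets signed. The function name `credential_scope` is hypothetical and is not taken from object_store; the sketch only builds the scope string and does not perform the HMAC signing itself.

```rust
/// Builds the SigV4 credential scope: `<date>/<region>/<service>/aws4_request`.
/// `date` is the request date formatted as YYYYMMDD.
/// Hypothetical helper for illustration; not object_store's API.
fn credential_scope(date: &str, region: &str, service: &str) -> String {
    format!("{date}/{region}/{service}/aws4_request")
}

/// Returns true when two regions yield the same scope for the same day and service.
/// A different scope means a different string to sign, hence a different signature.
fn same_scope(date: &str, service: &str, region_a: &str, region_b: &str) -> bool {
    credential_scope(date, region_a, service) == credential_scope(date, region_b, service)
}
```

Because the scope differs between, say, `us-east-1` and `auto`, a server that validates the region will reject a request signed with the wrong one, while a server that ignores it will accept either.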

https://github.com/minio/minio/discussions/15063

https://developers.cloudflare.com/r2/api/s3/api/#bucket-region

ritchie46 commented 9 months ago

The "auto" seems ok. 👍

It is perfectly solvable by end users of object-store. Maybe we could inform them of this option in an error message.

Not sure here; as you said, it might not work for all endpoints. 🤔

Feel free to close. :)

Xuanwo commented 8 months ago

> Not sure here; as you said, it might not work for all endpoints. 🤔

Yep, it doesn't work for all endpoints. Even MinIO users can set their own regions, such as org-us-1. In severe cases, an incorrect region can lead to more perplexing errors that are significantly harder for users to resolve.

For example, accessing S3 with the wrong region can return:

{ code: "PermanentRedirect", message: "The bucket you are attempting to access must be addressed using the specified endpoint. Please send all future requests to this endpoint.", resource: "", request_id: "MN9M36MVQ5AHKDQ2" }

tustvold commented 8 months ago

We intercept such redirects and return a more helpful error, so I think defaulting to us-east-1 is reasonable. If it is incorrect, we just defer the error, but if it is valid, everything just works 🎉
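A rough sketch of that idea, assuming a fallback applied only when no region is configured, plus a friendlier message for the redirect case. The names `DEFAULT_REGION`, `resolve_region`, and `describe_error`, and the message wording, are hypothetical and do not reflect object_store's actual implementation.

```rust
/// Hypothetical fallback used when no region is configured.
const DEFAULT_REGION: &str = "us-east-1";

/// Returns the configured region, or the fallback when none was given.
/// Blank strings are treated as unset (an assumption of this sketch).
fn resolve_region(configured: Option<&str>) -> &str {
    match configured {
        Some(region) if !region.trim().is_empty() => region,
        _ => DEFAULT_REGION,
    }
}

/// Turns an S3 error code into a message that hints at the likely cause.
/// Only the `PermanentRedirect` code from this thread is special-cased.
fn describe_error(code: &str, region: &str) -> String {
    if code == "PermanentRedirect" {
        format!("bucket is not reachable with region '{region}'; try setting the region explicitly")
    } else {
        format!("S3 error: {code}")
    }
}
```

With this shape, an explicit region always wins, and a wrong default only surfaces later as a deferred, more descriptive error.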

Xuanwo commented 8 months ago

> We intercept such redirects and return a more helpful error, so I think defaulting to us-east-1 is reasonable. If it is incorrect, we just defer the error, but if it is valid, everything just works 🎉

Sounds great!

tustvold commented 8 months ago

label_issue.py automatically added labels {'object-store'} from #5244