jrochkind / faster_s3_url

Optimized generation of public and presigned AWS S3 GET URLs in ruby faster
MIT License
84 stars 7 forks source link

Feature request - Add `endpoint` option to `Builder`. #9

Open BenKanouse opened 1 week ago

BenKanouse commented 1 week ago

Currently the Builder's initializer accepts a host param which assumes https. I like to use minio for local development and it isn't ideal to run it with https.

The normal Aws::S3::Client gem accepts an :endpoint option to by-pass this issue.

Here is a section of the docs that explains this option:

:endpoint (String, URI::HTTPS, URI::HTTP) — Normally you should not configure the :endpoint option directly. This is normally constructed from the :region option. Configuring :endpoint is normally reserved for connecting to test or custom endpoints. The endpoint should be a URI formatted like:
'http://example.com'
'https://example.com'
'http://example.com:123'

https://docs.aws.amazon.com/sdk-for-ruby/v3/api/Aws/S3/Client.html

jrochkind commented 1 week ago

Makes sense, are you up to sending a PR with specs to fix? (Do you think it can be done backwards compatibly?) Thank you if so!

BenKanouse commented 1 week ago

I hope to send a PR soon.

jrochkind commented 1 week ago

Thank you!!

BenKanouse commented 1 week ago

I need to stop for today, but I thought I would leave a comment with my progress just in case you wanted to give me any feedback.

I cloned the repo and made this commit on my clone: https://github.com/BenKanouse/faster_s3_url/commit/f76883c84206c52455590381c4ad5dbcf5ec8209

The commit message summarizes the progress:

This is working for my use case but the logic is a bit confusing to support both endpoint and host options, since we need to fetch the host from the endpoint when the endpoint option is supplied. So there are 3 ways to get a host now.

  1. Supply a host
  2. Use the default Host
  3. Get the host from endpoint

The other thing that is confusing is that sometimes S3 seems to put the bucket in the host name and sometimes it is the first item in the path.

I have it working for my use case, but it might not be working for other edge cases.

jrochkind commented 1 week ago

Thanks!

I won't have time to look at this until next week! Do you want to just make a PR here with your work so far? That'd be a convenient way for me to look at it! You can mark it draft or whatever.

The other thing that is confusing is that sometimes S3 seems to put the bucket in the host name and sometimes it is the first item in the path.

Yes, I seem to remember writing some logic (copied from AWS SDK) for determining which was which, that's in there somewhere?

This sounds like it should be do-able!

since we need to fetch the host from the endpoint when the endpoint option is supplied.

Do you mean you need to make an API request to find the host, or just like parse it from the endpoint? API request to get it is a bit annoying, performance wise, and needing network access on boot or whatever!

BenKanouse commented 1 week ago

I created a pull request https://github.com/jrochkind/faster_s3_url/pull/10

I did find a few interesting things to point out.

  1. I made the judgement call of trying to abstract a class from the Builder class. I put it in a separate commit so it can be removed easily if you don't like it.
  2. Amazon's logic for formatting the url is pretty complicated and I suspect this gem doesn't want the complexity of supporting every use case: https://github.com/aws/aws-sdk-ruby/blob/version-3/gems/aws-sdk-s3/lib/aws-sdk-s3/endpoint_provider.rb#L30
  3. I don't think the current host param works as intended and I wonder if it would be better to remove the host option in favor of this endpoint option. I put more details into that issue here: https://github.com/jrochkind/faster_s3_url/issues/11

Thanks for all the feedback! I really enjoy the gem!