livepeer / go-tools

Utility packages used across Livepeer Go repositories.
0 stars 2 forks source link

drivers: implement s3 region mismatch retry #22

Closed iameli closed 1 year ago

iameli commented 1 year ago

Start of a fix for https://github.com/livepeer/go-tools/issues/21.

This error is kind of funny.

> Let me upload a file to your region, whatever it is! < No! I'm us-west-1! You can only upload files to that region. > Fine. Let me upload a file to us-west-1! < Okay.

iameli commented 1 year ago

@victorges What's tough about using a query string is it makes the thing unusable as a prefix. Presently Mist can concatenate s3+https://user:password@example.com/bucket with /index.m3u8 and /1.ts and we don't have a problem. s3+https://user:password@example.com/bucket?region=us-west-1 is harder; not impossible but trickier.

victorges commented 1 year ago

@iameli makes sense, it does break compatibility with code that treats it as a file path not a URL.

Would probably need to bring it back in the path then, which is a breaking change unfortunately. So your fix is good enough for now I guess.

If that will be the official flow though (getting an error and then fixing the region) WDYT of doing it from the driver constructor somehow? I.e. after creating the session object, call some dummy metadata API just to get the error for the correct region and use that instead.

Not a strong suggestion though, especially cause in the end catalyst will recreate both the driver and the session for every little file that we access haha

iameli commented 1 year ago

Okay after testing it turns out this doesn't work at all - the s3 library we're using consumes some of the input stream before retrying so the successful retry is missing the first part of the multipart segment

iameli commented 1 year ago

Closing, I think the right implementation should be something like what I posted here:

I tried implementing this but it turns out it's a pain with a streaming upload; I'd have to introduce data buffering such that I could back up the stream and retry after I get the first 400. Instead I'm thinking I'll do kind of a stupid hack; every region-requiring URL I've encountered so far has matched the exact same format, eg:

s3.us-east-2.amazonaws.com
s3.us-west-1.wasabisys.com

So... I'm inclined to just auto-populate the region from the second segment of the URL. Most S3-compatible servers ignore the region field anyway, so doing so is harmless to our existing stores (it's not like they require the string "ignored") and it'll just "magically" start working for Wasabi users.