livepeer / go-tools

Utility packages used across Livepeer Go repositories.

S3 custom driver doesn't support region parameter #21

Open iameli opened 1 year ago

iameli commented 1 year ago

I think we're hardcoding "ignored" for the "region" parameter for s3+http and s3+https URLs. This works for most custom S3 storage drivers, but we've encountered at least one that requires it to be set to the actual region:

> ./livepeer-catalyst-uploader s3+https://xxx:yyy@s3.eu-central-2.wasabisys.com/bucket/testing.txt
Testing!
time="2023-02-09T21:45:19+01:00" level=fatal msg="AuthorizationHeaderMalformed: The authorization header is malformed; the region 'ignored' is wrong; expecting 'eu-central-2'\n\tstatus code: 400, request id: redacted, host id: redacted"

(I strongly suspect that Wasabi storage is using AWS S3 behind the scenes, but whatever.)

Solution 1: Figure out a way to represent the region in the URL. The most obvious way would be a query string, e.g. ./livepeer-catalyst-uploader s3+https://xxx:yyy@s3.eu-central-2.wasabisys.com/bucket/testing.txt?region=eu-central-2, but that would be a pain to implement with Mist's directory traversal. Also, I kind of like that you can define one of these URLs, concatenate a suffix, and get a valid URL; a trailing query string breaks that.
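To make the concatenation concern concrete, here is a minimal sketch using only Go's standard net/url package. The helper name regionFromQuery and the /bucket/dir base URL are hypothetical, chosen for illustration; this is not the driver's actual code.

```go
package main

import (
	"fmt"
	"net/url"
)

// regionFromQuery is a hypothetical helper: it returns the value of the
// "region" query parameter, or "" if the parameter is absent.
func regionFromQuery(raw string) (string, error) {
	u, err := url.Parse(raw)
	if err != nil {
		return "", err
	}
	return u.Query().Get("region"), nil
}

func main() {
	// Illustrative base URL (assumption): a directory-like prefix with ?region=.
	base := "s3+https://xxx:yyy@s3.eu-central-2.wasabisys.com/bucket/dir?region=eu-central-2"

	r, _ := regionFromQuery(base)
	fmt.Println(r) // eu-central-2

	// Appending a path suffix to the string lands inside the query value,
	// not in the path, so the "concatenate a suffix" property is lost.
	r, _ = regionFromQuery(base + "/segment.ts")
	fmt.Println(r) // eu-central-2/segment.ts
}
```

Any caller that builds object URLs by appending to a base string would need to split off and reattach the query, which is the extra work described above.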

Solution 2: It occurs to me that we could just teach our driver to be smart enough to fix this on its own. It is being presented with the error:

AuthorizationHeaderMalformed: The authorization header is malformed; the region 'ignored' is wrong; expecting 'eu-central-2'

There's nothing stopping it from just turning around and retrying the request with the correct region at that point. That adds one RTT of latency to the upload, but we haven't even started streaming any data...
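As a rough sketch of the first half of that idea, the expected region can be pulled out of the error text with a regular expression. The helper name expectedRegion is hypothetical, and the pattern assumes the message keeps the exact "expecting '<region>'" wording shown above; the retry itself is not shown.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// expectedRegionRe matches the "expecting '<region>'" fragment of the error.
// Assumption: the server keeps this exact wording.
var expectedRegionRe = regexp.MustCompile(`expecting '([^']+)'`)

// expectedRegion is a hypothetical helper: given an error message, it returns
// the region the server asked for, or false if the message is not an
// AuthorizationHeaderMalformed error carrying a region hint.
func expectedRegion(msg string) (string, bool) {
	if !strings.Contains(msg, "AuthorizationHeaderMalformed") {
		return "", false
	}
	m := expectedRegionRe.FindStringSubmatch(msg)
	if m == nil {
		return "", false
	}
	return m[1], true
}

func main() {
	msg := "AuthorizationHeaderMalformed: The authorization header is malformed; " +
		"the region 'ignored' is wrong; expecting 'eu-central-2'"
	region, ok := expectedRegion(msg)
	fmt.Println(region, ok) // eu-central-2 true
}
```

A caller would then rebuild the client with the returned region and reissue the request, which only works if the request body can be replayed.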

iameli commented 1 year ago

I tried implementing this, but it turns out it's a pain with a streaming upload: I'd have to introduce data buffering so that I could rewind the stream and retry after I get the first 400. Instead I'm thinking I'll do kind of a stupid hack. Every region-requiring URL I've encountered so far has matched the exact same format, e.g.:

s3.us-east-2.amazonaws.com
s3.us-west-1.wasabisys.com

So... I'm inclined to just auto-populate the region from the second dot-separated segment of the hostname. Most S3-compatible servers ignore the region field anyway, so doing so is harmless to our existing stores (it's not like they require the string "ignored") and it'll just "magically" start working for Wasabi users.
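A minimal sketch of that hack, assuming hostnames of the form s3.<region>.<domain>.<tld>. The helper name regionFromHost is hypothetical, and the guard (first label "s3", at least four labels, otherwise fall back to "ignored") is an extra assumption for illustration, stricter than always taking the second segment.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// regionFromHost is a hypothetical helper: it returns the second
// dot-separated label of hostnames shaped like s3.<region>.<domain>.<tld>,
// and falls back to the current placeholder "ignored" otherwise.
func regionFromHost(host string) string {
	labels := strings.Split(host, ".")
	if len(labels) >= 4 && labels[0] == "s3" {
		return labels[1]
	}
	return "ignored"
}

func main() {
	u, err := url.Parse("s3+https://xxx:yyy@s3.eu-central-2.wasabisys.com/bucket/testing.txt")
	if err != nil {
		panic(err)
	}
	// Hostname() drops any port and the userinfo part.
	fmt.Println(regionFromHost(u.Hostname()))         // eu-central-2
	fmt.Println(regionFromHost("storage.example.com")) // ignored
}
```

Falling back to "ignored" keeps the existing behavior for hosts that do not match the pattern.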