kbase / blobstore

Simple S3 backed file storage.
MIT License
0 stars 3 forks source link

WIP - DEVOPS-350 - Update to support partial downloads #122

Closed jsfillman closed 3 years ago

jsfillman commented 3 years ago

PR to begin discussion on how to implement something like range headers or part numbers to blobstore.

(See Slack thread for context)

This is a first look at the blobstore code, so forgive any seemingly naive questions. Some initial thoughts & observations:

Observations:

Files that may need editing:

Questions:


Some very quick search range vars, based on search results can be found in s3.go.

MrCreosote commented 3 years ago

How do we set a new header? Would it be something like req.Header.Set("x-amz-meta-Filename", p.filename)

Just use the Range field in GetObjectInput: https://docs.aws.amazon.com/sdk-for-go/api/service/s3/#GetObjectInput at this point: https://github.com/kbase/blobstore/blob/master/filestore/s3.go#L195

Or if you're talking about the Blobstore API here, we shouldn't expose S3 implementation details in the API. I'd just make the offset and length (or seek and length, or whatever) query params so an immutable url can be passed around that points to a specific chunk of file without having to also pass around headers.

What is the correct syntax for setting up a range request?

See the GetObjectInput docs, but the TL;DR is that it points here: https://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.35

MrCreosote commented 3 years ago

... and it looks like you've figured that stuff out already since I read the PR comment and answered the questions before looking at the code changes. Meh.

MrCreosote commented 3 years ago

Per @bio-boris 's request for a review: The approach outlined in the comments in the code is basically what I would've done. I've answered the questions in the PR other than

Do we need to return the range to GetFileOutput (interface.go), or is that only needed for fs.s3client.GetObject in s3.go

I don't think you can, because when you're downloading a file, which is the only time a range is needed, then the return is the file bytestream, not GFO.

@bio-boris is there anything else you wanted to see as part of a review? Until there's a bit more implementation I'm not sure what else to do. Happy to discuss how implementation should be done (and I have my opinions) but I didn't want to butt into Jason's business too much.

MrCreosote commented 3 years ago

I guess the first thing to do is to get travis to pass - looks like the auth client tests are hitting external websites as part of the test and the contents have changed, which is causing test failures.

lgtm-com[bot] commented 3 years ago

This pull request introduces 2 alerts when merging c005321b28864f5a89567d1e30a56308446bef27 into 325cb6ebf954993a712b5e1f6b90e972c796abad - view on LGTM.com

new alerts:

lgtm-com[bot] commented 3 years ago

This pull request introduces 2 alerts when merging e3ea788c065c87bbf293ca82584d1f6be3c8e4cc into 325cb6ebf954993a712b5e1f6b90e972c796abad - view on LGTM.com

new alerts:

MrCreosote commented 3 years ago

The work outlined here has been completed, so closing