storj / gateway-st

Single-tenant, S3-compatible server to interact with the Storj network
Apache License 2.0

GetObject doesn't respect the `partNumber` Request URI parameter #56

Open amwolff opened 2 years ago

amwolff commented 2 years ago

GetObjectNInfo needs adjustments to handle the `partNumber` parameter. This way, we will fully support the GetObject action.

Reference: https://docs.aws.amazon.com/AmazonS3/latest/API/API_GetObject.html#API_GetObject_RequestSyntax
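For illustration, here is a minimal sketch (not the gateway's actual code) of parsing and validating the `partNumber` query parameter before it could be threaded into GetObjectNInfo; the helper name and error value are hypothetical, and the 1–10000 bounds come from the S3 part-number limits:

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
	"strconv"
)

// errInvalidPartNumber is a hypothetical error value for this sketch.
var errInvalidPartNumber = errors.New("partNumber must be an integer between 1 and 10000")

// parsePartNumber extracts the optional partNumber query parameter from a
// GetObject request. A return value of 0 means the parameter was absent.
func parsePartNumber(r *http.Request) (int, error) {
	raw := r.URL.Query().Get("partNumber")
	if raw == "" {
		return 0, nil
	}
	n, err := strconv.Atoi(raw)
	if err != nil || n < 1 || n > 10000 {
		return 0, errInvalidPartNumber
	}
	return n, nil
}

func main() {
	r, _ := http.NewRequest(http.MethodGet, "https://gateway.example/bucket/key?partNumber=3", nil)
	n, err := parsePartNumber(r)
	fmt.Println(n, err) // 3 <nil>
}
```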

wthorp commented 2 years ago

At the core, we'd have to extend ObjectDownloadRequest / Endpoint.DownloadObject() to be capable of returning data from only a specific part. This would guarantee that order limits are set correctly, etc.

It does seem that part numbers are currently being persisted. The SegmentPosition struct is serialized as `uint64(pos.Part)<<32 | uint64(pos.Index)` and saved in the metabase DB's `segments.position` column.
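For reference, a small sketch of the encoding described above; the field and type names mirror the comment rather than the exact metabase definitions:

```go
package main

import "fmt"

// SegmentPosition mirrors the struct described above: a part number plus the
// segment's index within that part.
type SegmentPosition struct {
	Part  uint32
	Index uint32
}

// Encode packs the position into the single uint64 stored in
// segments.position: the part number in the high 32 bits, the segment index
// in the low 32 bits.
func (pos SegmentPosition) Encode() uint64 {
	return uint64(pos.Part)<<32 | uint64(pos.Index)
}

// DecodeSegmentPosition reverses Encode.
func DecodeSegmentPosition(v uint64) SegmentPosition {
	return SegmentPosition{
		Part:  uint32(v >> 32),
		Index: uint32(v), // low 32 bits
	}
}

func main() {
	pos := SegmentPosition{Part: 2, Index: 5}
	encoded := pos.Encode()
	fmt.Printf("%d -> %+v\n", encoded, DecodeSegmentPosition(encoded))
}
```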

amwolff commented 2 years ago

Here are my findings from a quick test with the "real" S3:

  1. The partNumber parameter can be specified even for non-multipart-uploaded objects, but then x-amz-mp-parts-count is not present in the response. This scheme aligns well with the limits AWS S3 imposes, i.e., the maximum part size is 5 GB and the largest object that can be uploaded in a single PUT is 5 GB (so such objects will only ever have one part);
  2. If partNumber is specified for multipart-uploaded objects, then x-amz-mp-parts-count is present. I think this means that the client can effectively download an object aligned with how it was uploaded, part by part.

In short, for each GetObject request with this mighty parameter, we will need to gather the appropriate range for the specified part and the number of available parts, if any.
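A hedged client-side illustration of the behavior observed above, using the AWS SDK for Go v1 against a hypothetical bucket and key; in the SDK, PartsCount maps to the x-amz-mp-parts-count header and is only set when the object was uploaded via multipart upload:

```go
package main

import (
	"fmt"
	"log"

	"github.com/aws/aws-sdk-go/aws"
	"github.com/aws/aws-sdk-go/aws/session"
	"github.com/aws/aws-sdk-go/service/s3"
)

func main() {
	svc := s3.New(session.Must(session.NewSession()))

	// Request only part 1 of the object. For objects uploaded with a single
	// PUT this still succeeds, but PartsCount stays nil.
	out, err := svc.GetObject(&s3.GetObjectInput{
		Bucket:     aws.String("my-bucket"), // hypothetical bucket
		Key:        aws.String("my-object"), // hypothetical key
		PartNumber: aws.Int64(1),
	})
	if err != nil {
		log.Fatal(err)
	}
	defer out.Body.Close()

	// Content-Range reports the byte range of the requested part;
	// x-amz-mp-parts-count (PartsCount) is present only for multipart uploads.
	fmt.Println("Content-Range:", aws.StringValue(out.ContentRange))
	if out.PartsCount != nil {
		fmt.Println("x-amz-mp-parts-count:", *out.PartsCount)
	}
}
```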

mniewrzal commented 2 years ago

Just an initial question: is this our own finding, or is it a real use case we have with one of the external test suites? I'm asking to understand the priority.

Regarding the solution, @wthorp is right: we would need to extend metainfo.DownloadObject to return the exact range for the specified part. That is the relatively easy part; usually, we have more trouble with defining how to expose this through the uplink API. I think we have three options there:

If we need this fix sooner rather than later, we can also initially fix it only on the uplink side. The DownloadObject response should at the moment contain all the information necessary to figure out the part's range within the object, so we can use that range to download the part without changing the server side (see the sketch below). The main issue is that we would allocate bandwidth for the full object, so that would only be a temporary solution.
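A rough sketch of that uplink-side workaround: assuming the part's byte offset and length can be derived from the metainfo response (partOffset and partLength below are assumed inputs, not an existing uplink API), the gateway could issue a ranged download with the public uplink options:

```go
package main

import (
	"context"
	"io"
	"log"
	"os"

	"storj.io/uplink"
)

// downloadPart streams just the byte range covering one part of an object.
// partOffset and partLength are assumed to have been computed from the
// object's part/segment metadata; deriving them is the part this issue is
// about and is not shown here.
func downloadPart(ctx context.Context, project *uplink.Project, bucket, key string, partOffset, partLength int64, dst io.Writer) error {
	download, err := project.DownloadObject(ctx, bucket, key, &uplink.DownloadOptions{
		Offset: partOffset,
		Length: partLength,
	})
	if err != nil {
		return err
	}
	defer download.Close()

	_, err = io.Copy(dst, download)
	return err
}

func main() {
	ctx := context.Background()

	access, err := uplink.ParseAccess(os.Getenv("UPLINK_ACCESS"))
	if err != nil {
		log.Fatal(err)
	}
	project, err := uplink.OpenProject(ctx, access)
	if err != nil {
		log.Fatal(err)
	}
	defer project.Close()

	// Hypothetical offset and length for one part of "my-object".
	if err := downloadPart(ctx, project, "my-bucket", "my-object", 128<<20, 64<<20, os.Stdout); err != nil {
		log.Fatal(err)
	}
}
```

As noted above, the satellite would still allocate bandwidth for the whole object with this approach, which is why it could only be a stopgap.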

If we agree that this is a high priority, we can think about scheduling time for it. As a task owned fully by the Metainfo team, I don't think we will have the capacity to do this next sprint, but if it were a cross-team effort, I'm sure we could figure something out.

amwolff commented 2 years ago

Thanks so much for the thorough response! Neither our test suite nor the external test suites we test against cover this case, so this is our own finding at the moment. I don't think it's very high priority (certainly good to have, though); this probably needs confirmation, but I believe it could even be scheduled for Q2 (for our compatibility-improvement milestone).

shaupt131 commented 2 years ago

Moved to backlog until work on the S3 compatibility improvement milestone is underway.

wthorp commented 2 years ago

I had thought that this feature (beyond being part of the standard) would result in more optimized parallel downloads.

Most parallel download clients will ask for chunks of some arbitrary size. If these chunk sizes roughly, but not exactly, align with segment sizes, we'd effectively require two requests to download each segment. The story is worse if downloaders grab small chunks: it could take arbitrarily many metainfo requests to download a segment.

Seeing as how gaining efficiency here would require changing how clients work (which frankly won't happen), this now seems low priority. I can see a cache of some sort being the more effective solution to this problem.