Open rogpeppe opened 1 year ago
Specifically, the
Descriptor.URLs
field seems potentially appropriate for returning information about a location from which a blob can be directly downloaded.
I do like this a lot, I feel like it puns on the intended purpose and client behavior very well.
Another thing to consider is that descriptors have annotations, so we could use that for arbitrary side-channel information in other contexts where urls might not be a perfect fit.
Currently all data read from and written to an
ociregistry.Interface
implementation flows directly through that implementation. This doesn't fit with the spirit of the OCI API which allows hand-off of reads and writes via the Location header and HTTP redirection to arbitrary other data providers.Reads
The read-related entry points return
BlobReader
values which have aDescriptor
method to return information about the blob. This is not directly reflected in the HTTP API, which only returns size, digest, and content-type information when reading. This means that the rest of the descriptor can be potentially be used byociregistry.Interface
implementations to return other metadata about the blob.Specifically, the
Descriptor.URLs
field seems potentially appropriate for returning information about a location from which a blob can be directly downloaded.Specifically the
ociclient
implementation could, when making aGetBlob
request, detect that there's an HTTP redirect and, instead of just following the redirect, return aBlobReader
that includes the redirect URL in its descriptor. When this happens, the client would not issue the actual GET request to the redirect location, but just make a HEAD request (following redirects for some steps if needed) to avoid initiating the data stream. The firstRead
call on the returnedBlobReader
would make the actual GET request to the final redirect location, thus fulfilling theGetBlob
contract for callers that don't look atDescriptor.URLs
. Alternatively the caller ofGetBlob
would be free to go direct to that URL, or (in the case ofociserver
) to return a redirect status code itself.ociregistry.Interface
wrapper implementations would be responsible for clearing out theURLs
field in descriptors returned from the wrapped registry implementation if not appropriate.Writes
The
PushBlobChunked
entry point can be made to work with direct uploads by adding aLocation
(or similarly named) method toBlobWriter
holding the location returned by the initial POST request. A caller would then be able to go directly to that location to exercise the PATCH/PUT part of the protocol instead of writing to theBlobWriter
. Theociserver
implementation (inhandleBlobStartUpload
specifically) would useBlobWriter.Location
to populate theLocation
header if non-empty, falling back to using the local relative location otherwise.The
PushBlob
entry point is currently problematic with respect to direct upload because the contract is that it will always push the blob using the data from the passed-in reader. That means that when we're talking to the HTTP client and the response is Accepted, (indicating a new location to upload to) we still end up pushing the data in this way. One way of working around this might be to specify thatPushBlob
can return some error indicating thatPushBlobChunked
must be used instead, mirroring the HTTP API.A downside of that approach is that it implies an extra round-trip, because we'll end up doing the POST request twice: once for
PushBlob
to see the initial error and the second time to initiate the upload viaPushBlobChunked
. We could include the upload ID in the error returned byPushBlob
, but we'd still have an extra round trip due to the GET request sent byPushBlobChunked
to recover the offset, although it might be possible to mitigate that in the common case by avoiding that call and using zero offset when we know that the upload ID has just been returned fromPushBlob
.Another downside of the approach is that it makes the caller API more complex to use: we can't just invoke
PushBlob
and expect that it will work. Instead, we'd probably implement a higher level function (maybeociregistry.PushBlob
) that knows how to usePushBlob
and fall back to usingPushBlobChunked
on failure.