JuliaServices / CloudStore.jl

A simple, consistent, and performant API for interacting with common cloud storage abstractions

Better support for 0-byte object getting #17

Closed quinnj closed 2 years ago

quinnj commented 2 years ago

Ok, quite the rabbit hole here, but I got to explore an odd dark corner of the internet in the process. Here goes: so our strategy for downloading/getting was this:

Indeed, it turns out the RFC states pretty clearly that if the requested object is empty, then a 416 should be returned for Range requests. It does, however, say that a suffix Range request like Range: bytes=-1, with any non-zero number, is satisfiable even then (again, what??).
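To make that rule concrete, here is a minimal sketch of the satisfiability check as I read it in RFC 7233, section 2.1. The function name `range_satisfiable` is hypothetical and not part of CloudStore.jl; it handles only a single, well-formed byte-range spec (the part after `bytes=`) and does no validation.

```julia
# Hypothetical helper (not part of CloudStore.jl): decide whether a single
# byte-range spec is satisfiable for a representation of `len` bytes.
# `spec` is the text after "bytes=", e.g. "0-0", "5-", or "-1".
function range_satisfiable(spec::AbstractString, len::Integer)
    lo, hi = split(spec, '-'; limit=2)
    if isempty(lo)
        # Suffix range "-N": satisfiable whenever N is non-zero,
        # even if the representation itself is empty.
        return parse(Int, hi) > 0
    end
    # "A-B" or "A-": satisfiable only if A lies inside the representation.
    return parse(Int, lo) < len
end
```

Under this reading, `range_satisfiable("0-0", 0)` is false (so a server answers 416 for an empty object), while `range_satisfiable("-1", 0)` is true.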

Alright, so we can't just dive into doing Range-ed GET requests.

So this PR implements the following strategy:

So net-net, for 0-byte objects, there's no overhead since we're still just doing 1 total request. For > 0 byte objects, we're doing 1 extra HEAD request, which for large, multipart downloads shouldn't even be noticeable. For smaller objects, this isn't ideal because we're essentially doubling the # of requests, but I'm not sure we can do any better in this generic setting. For now, if you know your objects are non-zero length and you don't need multipart downloading, you can pass allowMultipart=false and we won't do the extra HEAD request. That seems good enough for now.
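The request counts described above can be illustrated with a small sketch. This is not the PR's code: `planned_requests` and `part_size` are hypothetical names, the plain GET used when allowMultipart=false is an assumption, and no network calls are made. The function only lists the requests such a strategy would issue for an object of a given length.

```julia
# Hypothetical illustration (not CloudStore.jl code): list the requests a
# HEAD-first download would issue for an object of `len` bytes.
function planned_requests(len::Integer; part_size::Integer=8, allowMultipart::Bool=true)
    # Assumption: without multipart, a single un-ranged GET is sent and the
    # HEAD request is skipped entirely.
    allowMultipart || return ["GET"]
    reqs = ["HEAD"]
    # Empty object: the HEAD already reports length 0, so no ranged GET follows.
    len == 0 && return reqs
    # Otherwise, one ranged GET per part, covering bytes 0 through len - 1.
    for start in 0:part_size:(len - 1)
        stop = min(start + part_size, len) - 1
        push!(reqs, "GET Range: bytes=$start-$stop")
    end
    return reqs
end
```

For example, `planned_requests(0)` yields a single HEAD, while `planned_requests(20)` yields the HEAD plus three ranged GETs, which is the "1 extra request" cost mentioned above.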