Requested feature
An optional flag in `cp`, `--range`, which accepts a byte range string, just like the standard, if unpopular, HTTP GET request Range header. The destination object, when `cp` is done, would have only data from that byte range of the src object.
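To make the intended semantics concrete, here is a small Python sketch of how a byte range string would map to the bytes that end up in the destination. It assumes the flag follows HTTP Range semantics (inclusive offsets) and handles only the single-range forms `bytes=START-END`, `bytes=START-`, and `bytes=-SUFFIX`. `resolve_range` and `copy_range` are hypothetical helpers for illustration, not anything that exists in s5cmd.

```python
import re
from typing import Tuple

# Single-range forms only: bytes=START-END, bytes=START-, bytes=-SUFFIX
_RANGE_RE = re.compile(r"^bytes=(\d*)-(\d*)$")


def resolve_range(spec: str, object_size: int) -> Tuple[int, int]:
    """Return the inclusive (first, last) byte offsets selected by `spec`.

    Hypothetical helper illustrating HTTP Range semantics.
    """
    match = _RANGE_RE.match(spec)
    if match is None or match.groups() == ("", ""):
        raise ValueError(f"unsupported range: {spec!r}")
    first, last = match.groups()
    if not first:
        # Suffix form: the final N bytes of the object.
        start = max(object_size - int(last), 0)
        end = object_size - 1
    else:
        start = int(first)
        # Open-ended form runs to the end; explicit ends are clamped to the object.
        end = min(int(last), object_size - 1) if last else object_size - 1
    if start > end:
        raise ValueError(f"range {spec!r} is not satisfiable for size {object_size}")
    return start, end


def copy_range(src: bytes, spec: str) -> bytes:
    """Return only the bytes of `src` selected by `spec`."""
    start, end = resolve_range(spec, len(src))
    return src[start : end + 1]
```

Under this reading, `bytes=0-1048575` selects exactly 1 MiB, since both offsets are inclusive.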
Value
For reasons not worth getting into here, my organization often wants to download specific chunks from an otherwise unwieldy S3 object. These objects may be 10-20 GB each, when we only want a specific 1 MB chunk at a known byte location inside the object.
S3's `GetObject` enables us to do that by accepting a `Range` header, which translates to huge network load reductions for us, plus storage savings on the destination volume.
Here's an example of including a Range header using the official AWS CLI:
Or using boto3:
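A minimal sketch of the boto3 call, for illustration. boto3's `get_object` accepts a `Range` parameter in the same `bytes=START-END` form as the HTTP header. The `download_range` helper, the bucket and key names, and the offsets are placeholders, and the client is passed in so the helper makes no assumption about how credentials are configured.

```python
from typing import Any


def download_range(s3_client: Any, bucket: str, key: str,
                   start: int, end: int, dest_path: str) -> int:
    """Write bytes start..end (inclusive) of s3://bucket/key to dest_path.

    Returns the number of bytes written. Hypothetical helper for illustration.
    """
    response = s3_client.get_object(
        Bucket=bucket,
        Key=key,
        Range=f"bytes={start}-{end}",
    )
    written = 0
    with open(dest_path, "wb") as dest:
        # Stream the body in chunks rather than holding the whole range in memory.
        for chunk in response["Body"].iter_chunks(chunk_size=1024 * 1024):
            dest.write(chunk)
            written += len(chunk)
    return written


# Example usage (needs boto3 and AWS credentials; bucket and key are placeholders):
#   import boto3
#   download_range(boto3.client("s3"), "my-bucket", "path/to/large-object",
#                  0, 1024 * 1024 - 1, "chunk.bin")
```

The AWS CLI counterpart would be along the lines of `aws s3api get-object --bucket my-bucket --key path/to/large-object --range bytes=0-1048575 chunk.bin`, again with placeholder names.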
s5cmd does not appear to expose this header, which is the only thing keeping us from using s5cmd in production (in spite of its clear superiority over boto3 ☹️).
I'm happy to contribute, but I'm creating an Issue first in case someone knows of a good reason why s5cmd doesn't have this yet.