ahupp / python-magic

A python wrapper for libmagic

s3 object byte range support #256

Closed tooptoop4 closed 1 year ago

tooptoop4 commented 2 years ago

The boto3 library's S3 get_object accepts a byte range, which lets you download just a selected range instead of the whole file. If my file is 100 MB, can this library do a byte-range seek so that only part of the S3 object is downloaded (e.g. the first 1000 bytes) and then determine the file type from that?

ahupp commented 2 years ago

I'd suggest reading the first chunk of the file (2 KB or so) and passing that to magic.from_buffer. The underlying libmagic library does have an interface that takes a file descriptor, but I'm not sure how to wire that up to an arbitrary Python object.
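
A minimal sketch of that suggestion, assuming boto3 and python-magic are installed. The helper name `sniff_s3_object`, its `detect` parameter, and the bucket and key in the usage comment are hypothetical and only there for illustration; `get_object(..., Range=...)` and `magic.from_buffer` are the real calls doing the work. The 2048-byte default follows the "2 KB or so" estimate above and is not a guaranteed minimum for every file type.

```python
def sniff_s3_object(s3_client, bucket, key, num_bytes=2048, detect=None):
    """Identify an S3 object's type from its first `num_bytes` bytes.

    Hypothetical helper: `s3_client` is expected to behave like
    boto3.client("s3"); `detect` defaults to magic.from_buffer.
    """
    # HTTP byte ranges are inclusive, so the first N bytes are 0..N-1.
    byte_range = f"bytes=0-{num_bytes - 1}"
    response = s3_client.get_object(Bucket=bucket, Key=key, Range=byte_range)

    # Only the requested range is transferred, not the whole object.
    head = response["Body"].read()

    if detect is None:
        # Imported lazily so the helper can be exercised without libmagic.
        import magic
        detect = magic.from_buffer

    return detect(head)


# Example usage (assumes AWS credentials are configured; names are made up):
#
#   import boto3
#   s3 = boto3.client("s3")
#   print(sniff_s3_object(s3, "my-bucket", "path/to/large-file.bin"))
```

If a MIME type is preferred over a textual description, passing `detect=lambda buf: magic.from_buffer(buf, mime=True)` should work the same way. Some formats may need more than the first couple of kilobytes to be identified precisely, so `num_bytes` is worth tuning.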