Open craigds opened 1 month ago
To be clear, I'm happy to submit a PR to implement this change :) I just wanted to solicit some feedback on the ideas first.
Thanks for opening this. People also pay for these requests, so it's best to minimize them.
I strongly want to avoid adding settings where possible.
For option 1, would we still get an exception if you try to read a file that doesn't exist? As long as we maintain that invariant I think that is certainly the best way.
Am happy to accept a PR for this!
We've noticed that using `S3Storage.open("file.x").read()` does a lot of HEAD requests in addition to the GET:
These are caused by:
When called in a tight loop these extra requests can slow things down a fair bit, especially for large numbers of small files.
I propose:
1. Eagerly access `self.file` (thus triggering the `download_fileobj` right away). Probably most callers will be calling `.read()` immediately anyway. Add a config option (`EAGER_DOWNLOAD`?) to opt out if you really don't want to, but I don't see any common reason you wouldn't. If you don't want to read the file but just want the object size or something, you don't need to call `S3Storage.open()` at all; you can use `S3Storage.size()`.
2. Avoid the extra HEAD request from `download_fileobj` by using `get()` instead of `download_fileobj`. This will probably be context-dependent (for larger files, `download_fileobj` may perform better), so it probably needs to be opt-in via a setting. What about `USE_MULTIPART_DOWNLOAD`?

Thanks for your consideration :)
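
The two proposals above can be sketched together roughly as follows. This is a minimal illustration under stated assumptions, not the library's implementation: `EagerFileSketch` and `FakeObject` are hypothetical names, `FakeObject` stands in for a boto3 S3 `Object` resource (only its `get()` and `download_fileobj()` methods are used), and it models `download_fileobj` as one HEAD plus one GET purely for illustration. The `use_multipart_download` flag is likewise hypothetical and mirrors the suggested `USE_MULTIPART_DOWNLOAD` setting.

```python
import io
import shutil
import tempfile


class FakeObject:
    """Hypothetical stand-in for a boto3 S3 Object resource (no network)."""

    def __init__(self, store, key):
        self.store = store  # dict mapping key -> bytes
        self.key = key
        self.calls = []     # request log, kept only for illustration

    def _body(self):
        if self.key not in self.store:
            # Real boto3 would raise botocore.exceptions.ClientError here.
            raise FileNotFoundError(self.key)
        return self.store[self.key]

    def get(self):
        # Single GET; returns a dict with a readable "Body", like boto3.
        self.calls.append("GET")
        return {"Body": io.BytesIO(self._body())}

    def download_fileobj(self, fileobj):
        # Assumption of this sketch: a size lookup (HEAD) precedes the GET.
        self.calls.append("HEAD")
        data = self._body()
        self.calls.append("GET")
        fileobj.write(data)


class EagerFileSketch:
    """Hypothetical file wrapper that downloads as soon as it is created."""

    def __init__(self, obj, use_multipart_download=False):
        buffer = tempfile.SpooledTemporaryFile(max_size=1024 * 1024)
        try:
            if use_multipart_download:
                # Opt-in path: may suit larger files better.
                obj.download_fileobj(buffer)
            else:
                # Default path: one plain GET streamed into the buffer.
                shutil.copyfileobj(obj.get()["Body"], buffer)
        except Exception:
            buffer.close()
            raise
        buffer.seek(0)
        self.file = buffer

    def read(self, *args):
        return self.file.read(*args)
```

Because the download happens in the constructor, opening a missing key fails immediately rather than on first read, which keeps the exception invariant asked about in the comments. In the fake that error is a `FileNotFoundError`; with real boto3 the underlying error would be a `botocore.exceptions.ClientError`, which the storage layer would need to translate.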