seung-lab / cloud-files

Threaded Python and CLI client library for AWS S3, Google Cloud Storage (GCS), in-memory, and the local filesystem.
BSD 3-Clause "New" or "Revised" License

Enable S3 Upload of >5GB files #38

Open william-silversmith opened 3 years ago

william-silversmith commented 3 years ago

S3 rejects single-request uploads of objects larger than 5 GB; files above that size have to go through multipart upload.

Related to #7

sunnysidesounds commented 2 years ago

This is kind of an old issue, but we need this feature. I've been digging around in the cloud-files code a little. How hard do you think it would be to implement? Maybe I (or my team) could contribute this functionality. Thoughts?

william-silversmith commented 2 years ago

If you have the time and energy, I think this would be a great addition. It's not particularly difficult, but I never got around to it. Here's some helpful info: https://medium.com/analytics-vidhya/aws-s3-multipart-upload-download-using-boto3-python-sdk-2dedb0945f11

The CloudFiles interface would look something like:

cf.put_multipart(...), and later, if we're clever, we can have cf.put call it automatically when the file is too big for a single upload.
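
For reference, here is a minimal sketch of what such a method might do. It is not the CloudFiles implementation. `put_multipart` and `put` are hypothetical free functions, and `client` is assumed to be a boto3 S3 client (e.g. `boto3.client("s3")`) or any object exposing the same `create_multipart_upload`, `upload_part`, `complete_multipart_upload`, `abort_multipart_upload`, and `put_object` calls. The 64 MiB default part size is an arbitrary choice for illustration.

```python
MIN_PART_SIZE = 5 * 1024**2     # S3 minimum size for every part except the last
SINGLE_PUT_LIMIT = 5 * 1024**3  # S3 maximum object size for a single PUT
MAX_PARTS = 10_000              # S3 maximum number of parts per upload


def put_multipart(client, bucket, key, fileobj, part_size=64 * 1024**2):
    """Upload a binary file object in parts. Returns the number of parts sent.

    Hypothetical helper: `client` is assumed to behave like a boto3 S3 client.
    """
    if part_size < MIN_PART_SIZE:
        raise ValueError("part_size must be at least 5 MiB")

    upload_id = client.create_multipart_upload(Bucket=bucket, Key=key)["UploadId"]
    parts = []
    try:
        part_number = 1
        while True:
            chunk = fileobj.read(part_size)
            if not chunk:
                break
            if part_number > MAX_PARTS:
                raise ValueError("too many parts; use a larger part_size")
            resp = client.upload_part(
                Bucket=bucket, Key=key, PartNumber=part_number,
                UploadId=upload_id, Body=chunk,
            )
            # S3 needs each part's ETag and number to assemble the object.
            parts.append({"ETag": resp["ETag"], "PartNumber": part_number})
            part_number += 1
        client.complete_multipart_upload(
            Bucket=bucket, Key=key, UploadId=upload_id,
            MultipartUpload={"Parts": parts},
        )
    except Exception:
        # Abort so that already-uploaded parts do not linger in the bucket.
        client.abort_multipart_upload(Bucket=bucket, Key=key, UploadId=upload_id)
        raise
    return len(parts)


def put(client, bucket, key, fileobj, size, threshold=SINGLE_PUT_LIMIT):
    """Use a single PUT when the object fits, otherwise fall back to multipart."""
    if size <= threshold:
        client.put_object(Bucket=bucket, Key=key, Body=fileobj)
    else:
        put_multipart(client, bucket, key, fileobj)
```

Passing the client in, rather than constructing it inside the function, keeps the sketch testable without network access. A threaded version could upload parts concurrently, since parts are independent until the final completion call.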

sunnysidesounds commented 2 years ago

Awesome, let me take a look. I appreciate the quick response 👍

william-silversmith commented 1 year ago

This might be solved in #85, though 5 GB files were not tested.