We found that the AWS SDK S3 API fails when we try to write more than 5 GB of data in a single request (5 GB is S3's single-PUT object limit). This blocks capacity testing for larger FARGATE containers.
As mentioned in the post, one of our options is to use S3 multipart upload to split a file into smaller chunks that together form a single S3 object. We discussed with AWS engineers and decided to implement the multipart upload logic.
Technical Details
We split a file into 300 MB (314572800-byte) ByteStreams, then use multipart upload to send each part and assemble the single S3 object.
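The chunking step can be sketched as follows. This is a minimal illustration, not the production code; the helper name `iter_parts` is hypothetical, and the 300 MB part size comes from the description above (S3 requires every part except the last to be at least 5 MB).

```python
# Part size used in this work: 300 MB. S3 allows parts of 5 MB - 5 GB,
# with up to 10,000 parts per upload.
PART_SIZE = 314_572_800  # 300 * 1024 * 1024 bytes

def iter_parts(fileobj, part_size=PART_SIZE):
    """Yield successive byte chunks of at most part_size from a file object.

    Hypothetical helper for illustration; the real code produces
    ByteStreams rather than raw bytes.
    """
    while True:
        chunk = fileobj.read(part_size)
        if not chunk:
            break
        yield chunk
```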
The multipart upload consists of three steps:
1. Create the multipart upload instance via create_multipart_upload(), which returns an upload ID.
2. Upload each ByteStream to S3 using upload_part().
3. Complete the upload via complete_multipart_upload() after all parts have been uploaded.
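The three steps above can be sketched with a boto3-style client. This is a simplified illustration under assumptions, not our actual implementation: the function name and the `bucket`/`key`/`parts_iter` parameters are placeholders, and the abort-on-failure cleanup is an addition not described above.

```python
def multipart_upload(s3, bucket, key, parts_iter):
    """Upload chunks as one S3 object via multipart upload.

    s3 is a boto3-style S3 client; parts_iter yields byte chunks
    (e.g. the 300 MB ByteStreams described above).
    """
    # Step 1: create the multipart upload; S3 returns an UploadId.
    upload_id = s3.create_multipart_upload(Bucket=bucket, Key=key)["UploadId"]
    completed = []
    try:
        # Step 2: upload each chunk; part numbers start at 1.
        for number, chunk in enumerate(parts_iter, start=1):
            resp = s3.upload_part(
                Bucket=bucket, Key=key, UploadId=upload_id,
                PartNumber=number, Body=chunk,
            )
            completed.append({"PartNumber": number, "ETag": resp["ETag"]})
        # Step 3: stitch all parts into the single S3 object.
        s3.complete_multipart_upload(
            Bucket=bucket, Key=key, UploadId=upload_id,
            MultipartUpload={"Parts": completed},
        )
    except Exception:
        # Cleanup assumption: abort so S3 does not keep storing
        # (and billing for) orphaned parts.
        s3.abort_multipart_upload(Bucket=bucket, Key=key, UploadId=upload_id)
        raise
    return completed
```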
References
Below are the references we used to develop the multipart upload functionality.
Differential Revision: D39534523