dmolesUC / cos

A tool for testing and validating cloud object storage

crvd fails with TotalPartsExceeded on very large S3 objects #15

Open dmolesUC opened 5 years ago

dmolesUC commented 5 years ago

Steps to reproduce:

Execute the command below:

cos crvd -v s3://<BUCKET>/ --endpoint https://s3.us-west-2.amazonaws.com/ --size 1T

Expected:

Upload completes in about 4 hours (assuming a fast client network)

Actual:

Upload fails at about 50 GB with:

Error: MultipartUpload: upload multipart failed
    upload id: iuCCRxEvNJGjfH_ZoA30A1JyUqUxrZkyGJpymOIb8VWq1Yb.ysFPqMaGZQgvRDR4PjFdgsoKn8TvH.ZfoKTdyRMTkx442X9_gD8N5oMiHyLtoVSlPm_nfxY1o3Km.42le_ZTxp_1ZYwpOoqb4SbWcQ--
caused by: TotalPartsExceeded: exceeded total allowed configured MaxUploadParts (10000). Adjust PartSize to fit in this limit
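
(This looks consistent with the SDK's default 5 MiB part size: 5 MiB × 10,000 parts ≈ 48.8 GiB, i.e. right around the observed 50 GB failure point.)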
dmolesUC commented 5 years ago

I put in a fix that sets PartSize so that MaxUploadParts isn't exceeded. It looks, though, like the S3 SDK keeps each part entirely in memory, which means the maximum object size we can upload is still limited by available RAM.
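
As a rough sketch (not necessarily the exact change in cos), the adjustment amounts to deriving PartSize from the object size before constructing the uploader. This assumes the aws-sdk-go `s3manager` package and a hypothetical `partSizeFor` helper:

```go
package main

import (
	"fmt"

	"github.com/aws/aws-sdk-go/aws/session"
	"github.com/aws/aws-sdk-go/service/s3/s3manager"
)

// partSizeFor picks a part size just large enough that objectSize fits in
// MaxUploadParts (10,000) parts, without going below the 5 MiB minimum.
func partSizeFor(objectSize int64) int64 {
	partSize := objectSize / int64(s3manager.MaxUploadParts)
	if objectSize%int64(s3manager.MaxUploadParts) != 0 {
		partSize++ // round up so the last part doesn't push us over the limit
	}
	if partSize < s3manager.MinUploadPartSize {
		partSize = s3manager.MinUploadPartSize
	}
	return partSize
}

func main() {
	sess := session.Must(session.NewSession())
	objectSize := int64(1) << 40 // 1 TiB, as in the failing crvd run

	uploader := s3manager.NewUploader(sess, func(u *s3manager.Uploader) {
		u.PartSize = partSizeFor(objectSize)
	})
	fmt.Println(uploader.PartSize) // ≈ 105 MiB for a 1 TiB object
}
```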

On a smallish EC2 instance with 2 GB RAM, it seems like a 128 MiB PartSize is OK, but 1 GiB leads to OOM errors after uploading a few gigabytes. (With the 128 MiB PartSize, resident memory peaks at about 1.6 GiB, according to top.)

At a 128 MiB PartSize and the 10,000-part maximum, we'd be limited to about 1.22 TiB (1.34 TB). It would be worth digging further into the API to see what facilities it offers for explicit multipart uploads.
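
For reference, here's roughly what driving the multipart API directly could look like with aws-sdk-go's low-level CreateMultipartUpload / UploadPart / CompleteMultipartUpload calls. This is a sketch only (an assumed `uploadMultipart` helper, minimal error handling, no retries or abort-on-failure), not the cos implementation:

```go
package example

import (
	"bytes"
	"io"

	"github.com/aws/aws-sdk-go/aws"
	"github.com/aws/aws-sdk-go/service/s3"
)

// uploadMultipart streams body to s3://bucket/key in partSize-byte parts,
// holding only one part in memory at a time. Sketch only: no retries, no
// AbortMultipartUpload on failure, and no guard against the 10,000-part cap.
func uploadMultipart(svc *s3.S3, bucket, key string, body io.Reader, partSize int64) error {
	create, err := svc.CreateMultipartUpload(&s3.CreateMultipartUploadInput{
		Bucket: aws.String(bucket),
		Key:    aws.String(key),
	})
	if err != nil {
		return err
	}

	var completed []*s3.CompletedPart
	buf := make([]byte, partSize)
	for partNum := int64(1); ; partNum++ {
		n, readErr := io.ReadFull(body, buf)
		if n > 0 {
			part, err := svc.UploadPart(&s3.UploadPartInput{
				Bucket:     create.Bucket,
				Key:        create.Key,
				UploadId:   create.UploadId,
				PartNumber: aws.Int64(partNum),
				Body:       bytes.NewReader(buf[:n]),
			})
			if err != nil {
				return err
			}
			completed = append(completed, &s3.CompletedPart{
				ETag:       part.ETag,
				PartNumber: aws.Int64(partNum),
			})
		}
		if readErr == io.EOF || readErr == io.ErrUnexpectedEOF {
			break // end of input; any short final part has already been sent
		}
		if readErr != nil {
			return readErr
		}
	}

	_, err = svc.CompleteMultipartUpload(&s3.CompleteMultipartUploadInput{
		Bucket:          create.Bucket,
		Key:             create.Key,
		UploadId:        create.UploadId,
		MultipartUpload: &s3.CompletedMultipartUpload{Parts: completed},
	})
	return err
}
```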