dhmlops / mlops


Re-upload to ECS S3 #1

Closed jax79sg closed 3 years ago

jax79sg commented 3 years ago

Has anyone run into a problem with re-uploading? Context: whenever my model reaches a better validation accuracy, I upload model_best.pth.tar to the S3 output folder, replacing the previous copy. The first upload works fine, but every subsequent replacement throws this error:

An error occurred (InternalError) when calling the CreateMultipartUpload operation (reached max retries: 4): We encountered an internal error. Please try again.

mantaphytoplankton commented 3 years ago

Please PM me the bucket that is having the issue. Was the issue there previously, or is this the first time you have encountered it with the same code?

I did an initial check: ECS is up, and I was able to upload and overwrite a file in an ECS bucket using S3 Browser.

We need to narrow down whether the issue lies with the API or with the bucket access rights. Are you able to run a code snippet that uploads and overwrites a file in the same bucket? If that doesn't work, I will take a look at the access rights, and we can also test on another bucket.
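A minimal sketch of such an overwrite check, assuming a boto3 S3 client pointed at the ECS endpoint. The function name `check_overwrite`, the endpoint URL, the bucket name and the key in the usage comment are all hypothetical placeholders, not values from this thread. It writes the same key twice with plain `put_object` calls, so it exercises access rights and overwrite behaviour without involving multipart upload.

```python
def check_overwrite(client, bucket, key):
    """Write the same key twice and report whether the second write replaced the first.

    `client` is expected to behave like a boto3 S3 client
    (put_object / get_object); bucket and key are supplied by the caller.
    """
    client.put_object(Bucket=bucket, Key=key, Body=b"first")
    client.put_object(Bucket=bucket, Key=key, Body=b"second")
    body = client.get_object(Bucket=bucket, Key=key)["Body"].read()
    return body == b"second"


# Hypothetical usage (endpoint, bucket and key are placeholders):
#   import boto3
#   client = boto3.client("s3", endpoint_url="https://ecs.example.internal")
#   print(check_overwrite(client, "my-output-bucket", "tmp/overwrite_test.txt"))
```

If this succeeds while the original training code still fails, the bucket permissions are probably fine and the multipart path is the more likely suspect.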

jinmingteo commented 3 years ago

Thanks again @jax79sg for the help.

The fix is to effectively disable multipart upload by setting a very high threshold (e.g. 20 GB) for activating it:

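from boto3.s3.transfer import TransferConfig
GB = 1024 ** 3
# Files smaller than 20 GB are sent in a single request instead of a multipart upload
self.config = TransferConfig(multipart_threshold=20 * GB)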

For more info, refer to my PyTorch Image Model.
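For context, a minimal sketch of how the config above might be passed to an upload call. The names `upload_best_model`, `would_use_multipart` and `MULTIPART_THRESHOLD` are hypothetical and introduced only for illustration; `client` is assumed to be a boto3 S3 client already configured for the ECS endpoint. The helper reflects my understanding that boto3 switches to multipart once the file size reaches the threshold, which is an assumption worth verifying against the installed version.

```python
GB = 1024 ** 3
MULTIPART_THRESHOLD = 20 * GB  # same value as in the fix above


def would_use_multipart(file_size, threshold=MULTIPART_THRESHOLD):
    """Return True if a file of this size (in bytes) would trigger multipart upload.

    Assumption: multipart is used when the size is at or above the threshold.
    """
    return file_size >= threshold


def upload_best_model(client, local_path, bucket, key):
    """Upload (or overwrite) a checkpoint with multipart effectively disabled."""
    # Imported here so would_use_multipart() can be used without boto3 installed.
    from boto3.s3.transfer import TransferConfig

    config = TransferConfig(multipart_threshold=MULTIPART_THRESHOLD)
    client.upload_file(local_path, bucket, key, Config=config)
```

With this threshold, a typical checkpoint of a few hundred MB never reaches the CreateMultipartUpload call that was failing on replacement.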