Open jamesls opened 9 years ago
Correct me if I'm wrong, but would this potentially help with a ~190GB upload to an S3 bucket in the US Standard region via
aws s3 cp DATA.csv s3://BUCKET_NAME/data.csv ?
Creates about 900 parts. It gets to about 15 of 900 parts before failing with:
upload failed: ./DATA.csv to s3://BUCKET_NAME/data.csv HTTPSConnectionPool(host='BUCKET_NAME.s3.amazonaws.com', port=443): Max retries exceeded with url: /data.csv?partNumber=9&uploadId=CgfYBQnTUBVMCmrdy_uvMXOk0vqQcsBl570rE6LCC7aNzHO8wBtn_Y1A.gkP9A35VLpOruZXD6k9pPBIUNmXsQ-- (Caused by <class 'socket.error'>: [Errno 104] Connection reset by peer)
Thank you.
^ And includes well over 5 retries in --debug (I meant to include that).
Is there any progress or plan about the feature release?
For the case of large files, it seems from this line that if any part of an upload fails, the whole thing is cancelled:
https://github.com/aws/aws-cli/blob/develop/awscli/customizations/s3/tasks.py#L259
The problem here is that for an unreliable internet connection (e.g. fails every 10 minutes) and a large file, there is a very high chance that at least one part of a multipart upload is going to fail. This means that the whole upload gets cancelled, i.e. a very low chance of success.
Could these failed parts be re-queued instead of causing cancellation?
Also, looking at the code, it seems there are only retries for downloads, not uploads - https://github.com/aws/aws-cli/blob/develop/awscli/customizations/s3/tasks.py. This means that despite the multipart upload feature, large files are very unlikely to succeed if there are issues with the network connection - if any part fails then the whole upload is cancelled.
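To put a rough number on the concern above, here is a small back-of-the-envelope sketch. It assumes each part fails independently with the same probability, which is a simplification (real network failures tend to be correlated). The 900 parts come from the report earlier in this thread; the 1% per-attempt failure rate and the function name are made up for illustration.

```python
def upload_success_probability(num_parts, part_failure_prob, attempts_per_part=1):
    """Chance that every part of a multipart upload eventually succeeds.

    Assumes independent failures: a part is lost only if all of its
    attempts fail, and the upload succeeds only if no part is lost.
    """
    part_lost = part_failure_prob ** attempts_per_part
    return (1.0 - part_lost) ** num_parts


# 900 parts (from the report above); 1% failure per attempt is hypothetical.
for attempts in (1, 2, 5):
    p = upload_success_probability(900, 0.01, attempts)
    print(f"{attempts} attempt(s) per part -> P(success) = {p:.6f}")
```

Under these assumptions, a single attempt per part almost never completes a 900-part upload, while even a handful of per-part retries changes the picture completely. That is the argument for retrying or re-queuing parts rather than cancelling.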
I'd be willing to work on this. However, I'd need some guidance:
1) Uploading needs retry logic added, as it currently has none. Should we just do what DownloadPartTask does (repeat in a loop), or something else? Should it default to the same number of attempts as DownloadPartTask?
2) Should there be separate configuration parameters for download retries/upload retries?
3) Should it be possible to configure infinite retries, and what value should be used for that?
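For discussion, the "repeat in a loop" option from question 1) could look roughly like the sketch below. This is not the actual aws-cli or DownloadPartTask code: `upload_part`, `PartUploadError` and `upload_part_with_retries` are hypothetical names, and the default of 5 attempts is only a placeholder.

```python
class PartUploadError(Exception):
    """Hypothetical error raised when a single part fails to upload."""


def upload_part_with_retries(upload_part, part_number, max_attempts=5):
    """Call upload_part(part_number) until it succeeds or attempts run out.

    upload_part is a hypothetical callable standing in for the real request.
    The attempt count is always finite; values below 1 are rejected.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    last_error = None
    for _attempt in range(max_attempts):
        try:
            return upload_part(part_number)
        except PartUploadError as exc:
            # Remember the failure and try the same part again.
            last_error = exc
    # Every attempt failed: surface the last error to the caller.
    raise last_error
```

With a loop like this, only a part that fails on every attempt would need to cancel the whole upload. Whether `max_attempts` should be shared with downloads is exactly question 2).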
@kyleknap I'm offering to work on this - if someone can answer my questions above, I can get going. There are two separate features I guess:
1) retries for uploading
2) configuration for number of uploads.
Do you want me to create a new issue for part 1) ?
Here are some responses to your previous question:
1) So for upload parts we actually do have retry logic, that lives in botocore: https://github.com/boto/botocore/blob/develop/botocore/retryhandler.py. This defaults to 5: https://github.com/boto/botocore/blob/develop/botocore/data/_retry.json#L48 For the download parts though, we have some more retry logic on top of botocore's retry logic, causing the retries to be potentially more than 5.
2) I think one configuration option would be best here. We see retries happen a lot for multipart copies.
3) No I do not think that infinite retries should be allowable. For uploads we already do exponential backoff, so the time waiting between retries will get unreasonably long and it should error out.
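To illustrate the point in 3) about waits growing unreasonably long, here is a generic exponential backoff sketch. It is not botocore's implementation: the 1 second base and the growth factor of 2 are assumptions for illustration, the function name is hypothetical, and jitter is left out to keep the numbers easy to read.

```python
def backoff_delays(max_attempts, base=1.0, growth_factor=2.0):
    """Seconds to wait before each retry, for a given total attempt count.

    The first attempt has no wait, so max_attempts attempts means
    max_attempts - 1 delays: base, base * factor, base * factor**2, ...
    """
    return [base * growth_factor ** (retry - 1)
            for retry in range(1, max_attempts)]


for max_attempts in (5, 10, 20):
    delays = backoff_delays(max_attempts)
    print(f"max_attempts={max_attempts}: last wait {delays[-1]:.0f}s, "
          f"total wait {sum(delays):.0f}s")
```

Under these assumed constants, the wait doubles on every retry, so a large retry count quickly means waiting hours or days on a single request. A configurable but bounded maximum fits better than an infinite one.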
I think it should be fine to keep tracking this on this issue. No need for a new issue to be opened.
I think being able to hook into the botocore logic that I linked with a value that you can provide for max retries would be the best approach, and I believe that was what James was referring to when he first opened the issue.
Great, thanks so much for the response, hopefully I'll get time to look at this over the Christmas period.
Working out how to configure the max_attempts value is proving quite difficult...
There is no documentation for how to do this kind of thing - https://botocore.readthedocs.org/en/latest/index.html - and I generally have the principle of "docs or it doesn't exist".
But digging deeper, here is the chain I followed:
create_checker_from_retry_config: https://github.com/boto/botocore/blob/develop/botocore/retryhandler.py#L98
create_retry_handler: https://github.com/boto/botocore/blob/develop/botocore/retryhandler.py#L72
ClientCreator._register_retries: https://github.com/boto/botocore/blob/develop/botocore/client.py#L103
The config for the retries is loaded from _retry.json, via self._loader.load_data('_retry'). This doesn't seem to give any opportunity for passing in other config, except by additional configuration files (via botocore.loaders.Loader.search_paths).
So I can't see any way to configure this programmatically, without changes to botocore.
In case anyone else is looking for a workaround, I've found that the sync command for s3cmd works well.
@spookylukey Mind entering a botocore issue for us? Sounds like this would fit perfectly in the Config class: http://botocore.readthedocs.org/en/latest/reference/config.html
Good Morning!
We're closing this issue here on GitHub, as part of our migration to UserVoice for feature requests involving the AWS CLI.
This will let us get the most important features to you, by making it easier to search for and show support for the features you care the most about, without diluting the conversation with bug reports.
As a quick UserVoice primer (if not already familiar): after an idea is posted, people can vote on the ideas, and the product team will be responding directly to the most popular suggestions.
We’ve imported existing feature requests from GitHub - Search for this issue there!
And don't worry, this issue will still exist on GitHub for posterity's sake. As it’s a text-only import of the original post into UserVoice, we’ll still be keeping in mind the comments and discussion that already exist here on the GitHub issue.
GitHub will remain the channel for reporting bugs.
Once again, this issue can now be found by searching for the title on: https://aws.uservoice.com/forums/598381-aws-command-line-interface
-The AWS SDKs & Tools Team
This entry can specifically be found on UserVoice at: https://aws.uservoice.com/forums/598381-aws-command-line-interface/suggestions/33168364-add-ability-for-s3-commands-to-increase-retry-coun
Based on community feedback, we have decided to return feature requests to GitHub issues.
Hi, is there any movement on this? I have a spotty connection and am literally unable to download any file that's larger than a few hundred MiB from S3.
We've seen several issues opened now where, due to a number of variables, the max number of attempts, which is currently 5, is too low. This can be due to a less reliable WAN link, the available resources on the machine running the commands not being sufficient, the parallelism for S3 transfers being too high, etc.
To help with this issue, we should provide some sort of mechanism that allows a user to bump up the retry count. The main use case would be when transferring either a large number of files or large files. In these scenarios you're more willing to retry as many times as needed to get the request to succeed.
See:
https://github.com/aws/aws-cli/issues/1065