aws / aws-cli

Universal Command Line Interface for Amazon Web Services

Getting timeouts or connection reset by peers when trying to run get-job-output from glacier using aws cli #5296

Closed samthar closed 4 years ago

samthar commented 4 years ago

Platform/OS/Hardware/Device
What are you running the cli on?
RHEL 7
aws-cli/2.0.22 Python/3.7.3 Linux/3.10.0-1127.10.1.el7.x86_64 botocore/2.0.0dev26

Describe the question
Connection broken: ConnectionResetError(104, 'Connection reset by peer')", ConnectionResetError(104, 'Connection reset by peer')

Logs/output Get full traceback and error logs by adding --debug to the command. aws --debug --cli-read-timeout 0 --cli-connect-timeout 0 glacier get-job-output --account-id '-' --vault-name backup_oracle --job-id "hTKTzfsexzyANgZKoZBQpEU8BnQz3LjiPyuSWjaHwKvDGv7U65c8XeN2MhGFzLu2Nc-mnHwes5WcxMrOcMKothp9KkaZ" /backup/image/gen121x_2020Jun14_171358 2020-06-16 14:55:08,489 - MainThread - awscli.clidriver - DEBUG - CLI version: aws-cli/2.0.22 Python/3.7.3 Linux/3.10.0-1127.10.1.el7.x86_64 botocore/2.0.0dev26 2020-06-16 14:55:08,489 - MainThread - awscli.clidriver - DEBUG - Arguments entered to CLI: ['--debug', '--cli-read-timeout', '0', '--cli-connect-timeout', '0', 'glacier', 'get-job-output', '--account-id', '-', '--vault-name', 'backup_oracle', '--job-id', 'hTKTzfsexzyANgZKoZBQpEU8BnQz3LjiPyuSWjaHwKvDGv7U65c8XeN2MhGFzLu2Nc-mnHwes5WcxMrOcMKothp9KkaZ', '/backup/image/gen121x_2020Jun14_171358'] 2020-06-16 14:55:08,490 - MainThread - botocore.hooks - DEBUG - Event session-initialized: calling handler <function add_timestamp_parser at 0x7f706bcdc9d8> 2020-06-16 14:55:08,490 - MainThread - botocore.hooks - DEBUG - Event session-initialized: calling handler <function register_uri_param_handler at 0x7f706c62d598> 2020-06-16 14:55:08,490 - MainThread - botocore.hooks - DEBUG - Event session-initialized: calling handler <function add_binary_formatter at 0x7f706bc8ac80> 2020-06-16 14:55:08,490 - MainThread - botocore.hooks - DEBUG - Event session-initialized: calling handler <function inject_assume_role_provider_cache at 0x7f706c5de620> 2020-06-16 14:55:08,492 - MainThread - botocore.hooks - DEBUG - Event session-initialized: calling handler <function attach_history_handler at 0x7f706be28c80> 2020-06-16 14:55:08,492 - MainThread - botocore.hooks - DEBUG - Event session-initialized: calling handler <function inject_json_file_cache at 0x7f706be59730> 2020-06-16 14:55:08,497 - MainThread - botocore.loaders - DEBUG - Loading JSON file: 
/apps/awscli/v2/2.0.22/dist/botocore/data/glacier/2012-06-01/service-2.json 2020-06-16 14:55:08,503 - MainThread - botocore.hooks - DEBUG - Event building-command-table.glacier: calling handler <function add_waiters at 0x7f706bce0e18> 2020-06-16 14:55:08,507 - MainThread - botocore.loaders - DEBUG - Loading JSON file: /apps/awscli/v2/2.0.22/dist/botocore/data/glacier/2012-06-01/waiters-2.json 2020-06-16 14:55:08,508 - MainThread - awscli.clidriver - DEBUG - OrderedDict([('account-id', <awscli.arguments.CLIArgument object at 0x7f706bb03320>), ('vault-name', <awscli.arguments.CLIArgument object at 0x7f706bb09780>), ('job-id', <awscli.arguments.CLIArgument object at 0x7f706bb09898>), ('range', <awscli.arguments.CLIArgument object at 0x7f706bb09860>)]) 2020-06-16 14:55:08,508 - MainThread - botocore.hooks - DEBUG - Event building-argument-table.glacier.get-job-output: calling handler <function add_streaming_output_arg at 0x7f706bcdcc80> 2020-06-16 14:55:08,508 - MainThread - botocore.hooks - DEBUG - Event building-argument-table.glacier.get-job-output: calling handler <function add_cli_input_json at 0x7f706c5dee18> 2020-06-16 14:55:08,509 - MainThread - botocore.hooks - DEBUG - Event building-argument-table.glacier.get-job-output: calling handler <function add_cli_input_yaml at 0x7f706c5e5510> 2020-06-16 14:55:08,509 - MainThread - botocore.hooks - DEBUG - Event building-argument-table.glacier.get-job-output: calling handler <function unify_paging_params at 0x7f706be5fe18> 2020-06-16 14:55:08,513 - MainThread - botocore.loaders - DEBUG - Loading JSON file: /apps/awscli/v2/2.0.22/dist/botocore/data/glacier/2012-06-01/paginators-1.json 2020-06-16 14:55:08,513 - MainThread - botocore.hooks - DEBUG - Event building-argument-table.glacier.get-job-output: calling handler <function add_generate_skeleton at 0x7f706bd41598> 2020-06-16 14:55:08,513 - MainThread - botocore.hooks - DEBUG - Event building-argument-table.glacier.get-job-output: calling handler <function 
add_auto_prompt at 0x7f706bc8abf8> 2020-06-16 14:55:08,514 - MainThread - botocore.hooks - DEBUG - Event load-cli-arg.glacier.get-job-output.account-id: calling handler <awscli.paramfile.URIArgumentHandler object at 0x7f706e13a4e0> 2020-06-16 14:55:08,514 - MainThread - botocore.hooks - DEBUG - Event process-cli-arg.glacier.get-job-output: calling handler <awscli.argprocess.ParamShorthandParser object at 0x7f706c628908> 2020-06-16 14:55:08,514 - MainThread - awscli.arguments - DEBUG - Unpacked value of '-' for parameter "account_id": '-' 2020-06-16 14:55:08,514 - MainThread - botocore.hooks - DEBUG - Event load-cli-arg.glacier.get-job-output.vault-name: calling handler <awscli.paramfile.URIArgumentHandler object at 0x7f706e13a4e0> 2020-06-16 14:55:08,515 - MainThread - botocore.hooks - DEBUG - Event process-cli-arg.glacier.get-job-output: calling handler <awscli.argprocess.ParamShorthandParser object at 0x7f706c628908> 2020-06-16 14:55:08,515 - MainThread - awscli.arguments - DEBUG - Unpacked value of 'backup_oracle' for parameter "vault_name": 'backup_oracle' 2020-06-16 14:55:08,515 - MainThread - botocore.hooks - DEBUG - Event load-cli-arg.glacier.get-job-output.job-id: calling handler <awscli.paramfile.URIArgumentHandler object at 0x7f706e13a4e0> 2020-06-16 14:55:08,515 - MainThread - botocore.hooks - DEBUG - Event process-cli-arg.glacier.get-job-output: calling handler <awscli.argprocess.ParamShorthandParser object at 0x7f706c628908> 2020-06-16 14:55:08,515 - MainThread - awscli.arguments - DEBUG - Unpacked value of 'hTKTzfsexzyANgZKoZBQpEU8BnQz3LjiPyuSWjaHwKvDGv7U65c8XeN2MhGFzLu2Nc-mnHwes5WcxMrOcMKothp9KkaZ' for parameter "job_id": 'hTKTzfsexzyANgZKoZBQpEU8BnQz3LjiPyuSWjaHwKvDGv7U65c8XeN2MhGFzLu2Nc-mnHwes5WcxMrOcMKothp9KkaZ' 2020-06-16 14:55:08,515 - MainThread - botocore.hooks - DEBUG - Event load-cli-arg.glacier.get-job-output.range: calling handler <awscli.paramfile.URIArgumentHandler object at 0x7f706e13a4e0> 2020-06-16 14:55:08,515 - MainThread - 
botocore.hooks - DEBUG - Event load-cli-arg.glacier.get-job-output.outfile: calling handler <awscli.paramfile.URIArgumentHandler object at 0x7f706e13a4e0> 2020-06-16 14:55:08,515 - MainThread - botocore.credentials - DEBUG - Looking for credentials via: env 2020-06-16 14:55:08,515 - MainThread - botocore.credentials - DEBUG - Looking for credentials via: assume-role 2020-06-16 14:55:08,515 - MainThread - botocore.credentials - DEBUG - Looking for credentials via: assume-role-with-web-identity 2020-06-16 14:55:08,515 - MainThread - botocore.credentials - DEBUG - Looking for credentials via: sso 2020-06-16 14:55:08,515 - MainThread - botocore.credentials - DEBUG - Looking for credentials via: shared-credentials-file 2020-06-16 14:55:08,516 - MainThread - botocore.credentials - INFO - Found credentials in shared credentials file: ~/.aws/credentials 2020-06-16 14:55:08,516 - MainThread - botocore.loaders - DEBUG - Loading JSON file: /apps/awscli/v2/2.0.22/dist/botocore/data/endpoints.json 2020-06-16 14:55:08,522 - MainThread - botocore.hooks - DEBUG - Event choose-service-name: calling handler <function handle_service_name_alias at 0x7f706d5d8c80> 2020-06-16 14:55:08,523 - MainThread - botocore.hooks - DEBUG - Event creating-client-class.glacier: calling handler <function add_generate_presigned_url at 0x7f706d624ae8> 2020-06-16 14:55:08,526 - MainThread - botocore.endpoint - DEBUG - Setting glacier timeout as (None, None) 2020-06-16 14:55:08,527 - MainThread - botocore.hooks - DEBUG - Event provide-client-params.glacier.GetJobOutput: calling handler <function base64_decode_input_blobs at 0x7f706bca7510> 2020-06-16 14:55:08,527 - MainThread - botocore.hooks - DEBUG - Event before-parameter-build.glacier.GetJobOutput: calling handler <function inject_account_id at 0x7f706d5f7a60> 2020-06-16 14:55:08,527 - MainThread - botocore.hooks - DEBUG - Event before-parameter-build.glacier.GetJobOutput: calling handler <function generate_idempotent_uuid at 0x7f706d5f6b70> 
2020-06-16 14:55:08,528 - MainThread - botocore.hooks - DEBUG - Event before-call.glacier.GetJobOutput: calling handler <function add_glacier_version at 0x7f706d5f7ae8> 2020-06-16 14:55:08,528 - MainThread - botocore.hooks - DEBUG - Event before-call.glacier.GetJobOutput: calling handler <function inject_api_version_header_if_needed at 0x7f706d5fa378> 2020-06-16 14:55:08,528 - MainThread - botocore.endpoint - DEBUG - Making request for OperationModel(name=GetJobOutput) with params: {'url_path': '/-/vaults/backup_oracle/jobs/hTKTzfsexzyANgZKoZBQpEU8BnQz3LjiPyuSWjaHwKvDGv7U65c8XeN2MhGFzLu2Nc-mnHwes5WcxMrOcMKothp9KkaZ/output', 'query_string': {}, 'method': 'GET', 'headers': {'User-Agent': 'aws-cli/2.0.22 Python/3.7.3 Linux/3.10.0-1127.10.1.el7.x86_64 botocore/2.0.0dev26', 'x-amz-glacier-version': '2012-06-01'}, 'body': b'', 'url': 'https://glacier.ca-central-1.amazonaws.com/-/vaults/backup_oracle/jobs/hTKTzfsexzyANgZKoZBQpEU8BnQz3LjiPyuSWjaHwKvDGv7U65c8XeN2MhGFzLu2Nc-mnHwes5WcxMrOcMKothp9KkaZ/output', 'context': {'client_region': 'ca-central-1', 'client_config': <botocore.config.Config object at 0x7f706b9171d0>, 'has_streaming_input': False, 'auth_type': None}} 2020-06-16 14:55:08,528 - MainThread - botocore.hooks - DEBUG - Event request-created.glacier.GetJobOutput: calling handler <bound method RequestSigner.handler of <botocore.signers.RequestSigner object at 0x7f706b917160>> 2020-06-16 14:55:08,528 - MainThread - botocore.hooks - DEBUG - Event choose-signer.glacier.GetJobOutput: calling handler <function set_operation_specific_signer at 0x7f706d5f6a60> 2020-06-16 14:55:08,528 - MainThread - botocore.auth - DEBUG - Calculating signature using v4 auth. 2020-06-16 14:55:08,528 - MainThread - botocore.auth - DEBUG - CanonicalRequest: GET /-/vaults/backup_oracle/jobs/hTKTzfsexzyANgZKoZBQpEU8BnQz3LjiPyuSWjaHwKvDGv7U65c8XeN2MhGFzLu2Nc-mnHwes5WcxMrOcMKothp9KkaZ/output

host:glacier.ca-central-1.amazonaws.com x-amz-date:20200616T205508Z x-amz-glacier-version:2012-06-01

host;x-amz-date;x-amz-glacier-version e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855 2020-06-16 14:55:08,529 - MainThread - botocore.auth - DEBUG - StringToSign: AWS4-HMAC-SHA256 20200616T205508Z 20200616/ca-central-1/glacier/aws4_request 770843bec3e55cf2224d9e85930f9228fbefcdd2a29c14dd9abc24f1872eca78 2020-06-16 14:55:08,529 - MainThread - botocore.auth - DEBUG - Signature: 7f7551b1885012ebd451e2024cd3580a6f1e2ebe2b27695303345125343c6bbf 2020-06-16 14:55:08,529 - MainThread - botocore.endpoint - DEBUG - Sending http request: <AWSPreparedRequest stream_output=True, method=GET, url=https://glacier.ca-central-1.amazonaws.com/-/vaults/backup_oracle/jobs/hTKTzfsexzyANgZKoZBQpEU8BnQz3LjiPyuSWjaHwKvDGv7U65c8XeN2MhGFzLu2Nc-mnHwes5WcxMrOcMKothp9KkaZ/output, headers={'User-Agent': b'aws-cli/2.0.22 Python/3.7.3 Linux/3.10.0-1127.10.1.el7.x86_64 botocore/2.0.0dev26', 'x-amz-glacier-version': b'2012-06-01', 'X-Amz-Date': b'20200616T205508Z', 'Authorization': b'AWS4-HMAC-SHA256 Credential=AKIA4N344XS4ITFXBZFU/20200616/ca-central-1/glacier/aws4_request, SignedHeaders=host;x-amz-date;x-amz-glacier-version, Signature=7f7551b1885012ebd451e2024cd3580a6f1e2ebe2b27695303345125343c6bbf'}> 2020-06-16 14:55:08,529 - MainThread - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): glacier.ca-central-1.amazonaws.com:443 2020-06-16 14:55:09,054 - MainThread - urllib3.connectionpool - DEBUG - https://glacier.ca-central-1.amazonaws.com:443 "GET /-/vaults/backup_oracle/jobs/hTKTzfsexzyANgZKoZBQpEU8BnQz3LjiPyuSWjaHwKvDGv7U65c8XeN2MhGFzLu2Nc-mnHwes5WcxMrOcMKothp9KkaZ/output HTTP/1.1" 200 40549092709 2020-06-16 14:55:09,055 - MainThread - botocore.parsers - DEBUG - Response headers: {'x-amzn-RequestId': '3asCY_9y_AWbzKjIReV9HdZRL7UgvaTV8a3xtz57cQp6pyE', 'x-amz-sha256-tree-hash': 'f2f33dbddacf1bd48259421a5a1e1c22ed58220b1b865ed56fbc32b4b3344fb4', 'Accept-Ranges': 'bytes', 'x-amz-archive-description': '/backup/image/gen121x_2020Jun14_171358', 'Content-Type': 
'application/octet-stream', 'Content-Length': '40549092709', 'Date': 'Tue, 16 Jun 2020 20:55:08 GMT'} 2020-06-16 14:55:09,055 - MainThread - botocore.parsers - DEBUG - Response body: <botocore.response.StreamingBody object at 0x7f706b87c5c0> 2020-06-16 14:55:09,055 - MainThread - botocore.hooks - DEBUG - Event needs-retry.glacier.GetJobOutput: calling handler <bound method RetryHandler.needs_retry of <botocore.retries.standard.RetryHandler object at 0x7f706b9178d0>> 2020-06-16 14:55:09,056 - MainThread - botocore.retries.standard - DEBUG - Not retrying request. 2020-06-16 14:55:09,056 - MainThread - botocore.hooks - DEBUG - Event after-call.glacier.GetJobOutput: calling handler <bound method StreamingOutputArgument.save_file of <awscli.customizations.streamingoutputarg.StreamingOutputArgument object at 0x7f706bb09be0>> 2020-06-16 14:57:37,210 - MainThread - awscli.clidriver - DEBUG - Exception caught in main() Traceback (most recent call last): File "urllib3/response.py", line 437, in _error_catcher File "urllib3/response.py", line 519, in read File "http/client.py", line 447, in read File "http/client.py", line 491, in readinto File "socket.py", line 589, in readinto File "ssl.py", line 1052, in recv_into File "ssl.py", line 911, in read ConnectionResetError: [Errno 104] Connection reset by peer

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "awscli/clidriver.py", line 335, in main
  File "awscli/clidriver.py", line 507, in __call__
  File "awscli/clidriver.py", line 685, in __call__
  File "awscli/clidriver.py", line 806, in invoke
  File "awscli/clidriver.py", line 818, in _make_client_call
  File "botocore/client.py", line 208, in _api_call
  File "botocore/client.py", line 521, in _make_api_call
  File "botocore/hooks.py", line 227, in emit
  File "botocore/hooks.py", line 210, in _emit
  File "awscli/customizations/streamingoutputarg.py", line 107, in save_file
  File "botocore/response.py", line 78, in read
  File "urllib3/response.py", line 541, in read
  File "contextlib.py", line 130, in __exit__
  File "urllib3/response.py", line 455, in _error_catcher
urllib3.exceptions.ProtocolError: ("Connection broken: ConnectionResetError(104, 'Connection reset by peer')", ConnectionResetError(104, 'Connection reset by peer'))
2020-06-16 14:57:37,211 - MainThread - awscli.clidriver - DEBUG - Exiting with rc 255

kdaily commented 4 years ago

Hi @samthar ,

I'm going to investigate this.

Can you tell me:

  1. What is the size of the archive?
  2. Is this intermittent, or does it occur repeatably with this file?
samthar commented 4 years ago

Thanks Kenneth. The archive is 38GB. So far I haven't been able to download it successfully from our server, but from an EC2 instance I could download it fine. I'm not sure where the problem is happening. Our network team says everything is fine on our end.

Please let me know if you need more information.

Thanks for all the help in advance.

Thanx Sam


kdaily commented 4 years ago

Hi @samthar ,

I think that this is likely a network error. Downloading a file of this size is prone to intermittent network issues. The documentation for this feature suggests using the --range parameter for downloading large files. It does not define what counts as a large file, but its example suggests that 1 GB would be a reasonable size for a partial download. Also, the documentation for downloading a vault includes the following statement:

For all but the largest archives (250 MB+), data accessed using Expedited retrievals are typically made available within 1–5 minutes.

This indicates that an archive of 250 MB or more can be considered 'large'. While that's not an official size threshold, it is a good rule of thumb for large archives such as yours. If you want to follow up further, I would suggest the AWS Forum for Glacier, as a service team engineer could better advise on managing archives of this size.

https://forums.aws.amazon.com/forum.jspa?forumID=140
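
As a rough illustration of the --range suggestion, here is a minimal Python sketch that splits the archive into 1 GiB byte ranges and builds one get-job-output command per range. It only constructs the argument lists and does not call AWS. The helper names (byte_ranges, range_commands), the ".partNNNN" output naming, and the 1 GiB chunk size are assumptions made for this example; the total size is the Content-Length reported in the debug log above, and the job ID is a placeholder.

```python
from typing import Iterator, List, Tuple

CHUNK_SIZE = 1024 ** 3  # 1 GiB per request; an assumed value, not an official threshold


def byte_ranges(total_size: int, chunk_size: int = CHUNK_SIZE) -> Iterator[Tuple[int, int]]:
    """Yield inclusive (start, end) byte offsets that cover total_size bytes."""
    if total_size < 0 or chunk_size <= 0:
        raise ValueError("total_size must be >= 0 and chunk_size must be > 0")
    start = 0
    while start < total_size:
        end = min(start + chunk_size, total_size) - 1
        yield start, end
        start = end + 1


def range_commands(vault: str, job_id: str, total_size: int, out_prefix: str) -> List[List[str]]:
    """Build one 'aws glacier get-job-output' argument list per byte range.

    Each part is written to a hypothetical '<out_prefix>.partNNNN' file.
    """
    commands = []
    for index, (start, end) in enumerate(byte_ranges(total_size)):
        commands.append([
            "aws", "glacier", "get-job-output",
            "--account-id", "-",
            "--vault-name", vault,
            "--job-id", job_id,
            "--range", f"bytes={start}-{end}",
            f"{out_prefix}.part{index:04d}",
        ])
    return commands


if __name__ == "__main__":
    # 40549092709 is the Content-Length from the debug log; JOB_ID is a placeholder.
    for cmd in range_commands("backup_oracle", "JOB_ID", 40549092709, "gen121x"):
        print(" ".join(cmd))
```

Each command could then be run with subprocess.run(cmd, check=True), retrying only the part that fails, and the part files concatenated in order once all of them have been downloaded. A connection reset would then cost at most one 1 GiB part instead of the whole 38 GB transfer.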