aws / aws-cli

Universal Command Line Interface for Amazon Web Services

aws sync hangs #2477

Closed ixodie closed 4 years ago

ixodie commented 7 years ago

I am consistently seeing aws cli fail during sync to s3 with the following command:

aws s3 sync --size-only --page-size 100 /mnt/ebs-volume/image/ s3://bucket-name

ubuntu@ip-10-0-0-246:~/www$ aws --v
aws-cli/1.11.56 Python/2.7.12 Linux/4.4.0-64-generic botocore/1.5.19

It runs well for the first gig and then hangs. This is a 50 gig filesystem:

Completed 1.0 GiB/~1.0 GiB (1.8 MiB/s) with ~4 file(s) remaining (calculating...upload: ../..//img_2630_thumb.png to s3://bucket/image.png Completed 1.0 GiB/~1.0 GiB (1.8 MiB/s) with ~3 file(s) remaining (calculating...Completed 1.0 GiB/~1.0 GiB (1.8 MiB/s) with ~3 file(s) remaining (calculating...Completed 1.0 GiB/~1.0 GiB (1.8 MiB/s) with ~3 file(s) remaining (calculating...upload: ../../img_2630.png to s3://bucket/img_2630.png Completed 1.0 GiB/~1.0 GiB (1.8 MiB/s) with ~2 file(s) remaining (calculating...Completed 1.0 GiB/~1.0 GiB (1.8 MiB/s) with ~2 file(s) remaining (calculating...upload: ../../img_2628.png to s3://bucket/img_2628.png Completed 1.0 GiB/~1.0 GiB (1.8 MiB/s) with ~1 file(s) remaining (calculating...Completed 1.0 GiB/~1.0 GiB (1.8 MiB/s) with ~1 file(s) remaining (calculating...upload: ../../image/img_2628_thumb.png to s3://bucket/img_2628_thumb.png Completed 1.0 GiB/~1.0 GiB (1.8 MiB/s) with ~0 file(s) remaining (calculating...

And then it just sits there.

I'm really not sure what to check at this point as the cli is not very verbose.

JordonPhillips commented 7 years ago

You can get more verbose logs by using --debug. From what you've provided I'm not sure what's wrong. If you can provide me with those debug logs I should be able to work out what's happening.
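For reference, the --debug output is written to stderr, so one way to capture it for later inspection (the bucket name and log path below are just placeholders) is:

# capture the CLI's stderr debug stream to a file
aws s3 sync --size-only --page-size 100 /mnt/ebs-volume/image/ s3://bucket-name --debug 2> /tmp/aws-sync-debug.log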

gravyboat commented 7 years ago

I've encountered this as well via some automated scripts on aws-cli/1.11.67 Python/2.7.12 Linux/4.4.0-24-generic botocore/1.5.30. Unfortunately it doesn't happen very often (2-3 times over thousands of individual sync commands), so dumping the debug output would be millions of lines. @ixodie have you been able to duplicate this again or get the debug log going, by any chance? It would be cool if we could get this resolved so other people don't run into it.

ixodie commented 7 years ago

This still doesn't work. In fact, it appears that it has never worked properly. The data is 50GB, but it has seemingly only transferred 1.5GB according to CloudWatch. We run the same script every morning:

/home/ubuntu/.local/bin/aws s3 sync --size-only --exclude "log.txt" /mnt/ebs-volume/image/ s3://xxxx-xxx-xxxx

I just ran this manually and included --debug and now it is running far longer than it has in the past. I inserted the --debug right after the "log.txt".

I cannot attach the debug as it is a huge amount of text data.
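If the raw log is too big to attach, a sketch like this (the paths are placeholders) keeps just the tail around the hang, or compresses the whole thing before attaching:

# keep the last few thousand lines, or compress the full debug log
tail -n 5000 /tmp/aws-sync-debug.log > /tmp/aws-sync-debug-tail.log
gzip -k /tmp/aws-sync-debug.log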

ixodie commented 7 years ago

OK, it just hung about 10 minutes in. There are lots of generic debug messages, but this pops up a bit in the logs. After sitting there for about 5 minutes it continues the transfer.

2017-04-12 17:49:45,422 - MainThread - awscli.customizations.s3.filters - DEBUG - /mnt/ebs-volume/image/364/thumb/2012-06-19_12-09-00_505.jpg did not match exclude filter: image-2/log.txt
2017-04-12 17:49:45,422 - MainThread - awscli.customizations.s3.filters - DEBUG - /mnt/ebs-volume/image/364/thumb/2012-06-19_12-09-00_505.jpg final filtered status, should_include: True
2017-04-12 17:49:45,422 - MainThread - botocore.hooks - DEBUG - Event before-parameter-build.s3.ListObjects: calling handler <function set_list_objects_encoding_type_url at 0x7f34d81f4c08>
2017-04-12 17:49:45,422 - MainThread - botocore.hooks - DEBUG - Event before-parameter-build.s3.ListObjects: calling handler <function validate_bucket_name at 0x7f34d81f8b90>
2017-04-12 17:49:45,422 - MainThread - botocore.hooks - DEBUG - Event before-parameter-build.s3.ListObjects: calling handler <bound method S3RegionRedirector.redirect_from_cache of <botocore.utils.S3RegionRedirector object at 0x7f34d746df90>>
2017-04-12 17:49:45,422 - MainThread - botocore.hooks - DEBUG - Event before-parameter-build.s3.ListObjects: calling handler <function generate_idempotent_uuid at 0x7f34d81f8848>
2017-04-12 17:49:45,423 - MainThread - botocore.hooks - DEBUG - Event before-call.s3.ListObjects: calling handler <function add_expect_header at 0x7f34d81f4050>
2017-04-12 17:49:45,423 - MainThread - botocore.hooks - DEBUG - Event before-call.s3.ListObjects: calling handler <bound method S3RegionRedirector.set_request_url of <botocore.utils.S3RegionRedirector object at 0x7f34d746df90>>
2017-04-12 17:49:45,423 - MainThread - botocore.endpoint - DEBUG - Making request for OperationModel(name=ListObjects) (verify_ssl=True) with params: {'body': '', 'url': u'https://s3.amazonaws.com/image-2?marker=gs%2F364%2F2012-06-19_12-09-00_505.jpg&prefix=&encoding-type=url', 'headers': {'User-Agent': 'aws-cli/1.11.56 Python/2.7.12 Linux/4.4.0-72-generic botocore/1.5.19'}, 'context': {'encoding_type_auto_set': True, 'client_region': 'us-east-1', 'signing': {'bucket': u'reefs-image-2'}, 'has_streaming_input': False, 'client_config': <botocore.config.Config object at 0x7f34d7669750>}, 'query_string': {u'marker': u'gs/364/2012-06-19_12-09-00_505.jpg', u'prefix': u'', u'encoding-type': 'url'}, 'url_path': u'/image-2', 'method': u'GET'}
2017-04-12 17:49:45,423 - MainThread - botocore.hooks - DEBUG - Event request-created.s3.ListObjects: calling handler <function disable_upload_callbacks at 0x7f34d7d7e9b0>
2017-04-12 17:49:45,423 - MainThread - botocore.hooks - DEBUG - Event request-created.s3.ListObjects: calling handler <bound method RequestSigner.handler of <botocore.signers.RequestSigner object at 0x7f34d78d4d90>>
2017-04-12 17:49:45,423 - MainThread - botocore.hooks - DEBUG - Event before-sign.s3.ListObjects: calling handler <function fix_s3_host at 0x7f34d87bfcf8>
2017-04-12 17:49:45,424 - MainThread - botocore.utils - DEBUG - Checking for DNS compatible bucket for: https://s3.amazonaws.com/image-2?marker=gs%2F364%2F2012-06-19_12-09-00_505.jpg&prefix=&encoding-type=url
2017-04-12 17:49:45,424 - MainThread - botocore.utils - DEBUG - URI updated to: https://image-2.s3.amazonaws.com/?marker=gs%2F364%2F2012-06-19_12-09-00_505.jpg&prefix=&encoding-type=url
2017-04-12 17:49:45,424 - MainThread - botocore.auth - DEBUG - Calculating signature using hmacv1 auth.
2017-04-12 17:49:45,424 - MainThread - botocore.auth - DEBUG - HTTP request method: GET
2017-04-12 17:49:45,424 - MainThread - botocore.auth - DEBUG - StringToSign:
GET

Wed, 12 Apr 2017 17:49:45 GMT
/image-2/
2017-04-12 17:49:45,424 - MainThread - botocore.hooks - DEBUG - Event request-created.s3.ListObjects: calling handler <function enable_upload_callbacks at 0x7f34d7d7ea28>
2017-04-12 17:49:45,425 - MainThread - botocore.endpoint - DEBUG - Sending http request: <PreparedRequest [GET]>
2017-04-12 17:49:45,425 - MainThread - botocore.vendored.requests.packages.urllib3.connectionpool - INFO - Resetting dropped connection: image-2.s3.amazonaws.com
2017-04-12 17:49:45,567 - MainThread - botocore.vendored.requests.packages.urllib3.connectionpool - DEBUG - "GET /?marker=gs%2F364%2F2012-06-19_12-09-00_505.jpg&prefix=&encoding-type=url HTTP/1.1" 200 None
2017-04-12 17:49:45,573 - MainThread - botocore.parsers - DEBUG - Response headers: {'x-amz-bucket-region': 'us-east-1', 'x-amz-id-2': 'DENG+nxxxxxxxxs2KSUXQp8vgxj+TP5x3Aahxxxxxxxx3jDnaYQ/quVTbAxxxxxxxx7umfPXXj6VfdoQ=', 'server': 'AmazonS3', 'transfer-encoding': 'chunked', 'x-amz-request-id': 'D83293F150174F01', 'date': 'Wed, 12 Apr 2017 17:49:46 GMT', 'content-type': 'application/xml'}
2017-04-12 17:49:45,573 - MainThread - botocore.parsers - DEBUG - Response body:
<?xml version="1.0" encoding="UTF-8"?>
<ListBucketResult xmlns="http://s3.amazonaws.com/doc/2006-03-01/">
gravyboat commented 7 years ago

@ixodie So did it actually complete the full 50GB when you included the --debug option? Or did it still only transfer 1.5GB? Also, thanks for providing these logs; the way my code works would have made it extremely tedious to grab the appropriate output.

ixodie commented 7 years ago

It is still running, but I believe it has gotten much further than anything I have tried previously. So adding the --debug to the command seems to have fixed something?

ixodie commented 7 years ago

Here is another data point. The bucket was previously copied to S3 with a third-party tool, s3cmd. The size of the old bucket is the top image and the size of the new bucket (with aws s3 sync) is the bottom. Clearly the upload is not working properly.

[Screenshots, 2017-04-13: reported size of the old bucket (top) vs. the new bucket (bottom)]

ixodie commented 7 years ago

I wonder what is different about my environment that prevents this sync from working.

gravyboat commented 7 years ago

I still don't believe this is unique to your environment as I've encountered it multiple times.

ixodie commented 7 years ago

That's crazy. I thought this was an officially supported package from AWS.

ixodie commented 7 years ago

I'm still amazed that I am the only person who finds this major issue.

It is only a 50GB file system, and the sync never goes beyond 1.5GB.

gravyboat commented 7 years ago

@ixodie So previously we were using aws s3 sync like you are (as part of a bash script), and that's where I noticed the issue where the sync hangs. We recently swapped over to the Ruby aws-sdk gem and we haven't encountered this at all, so one option for you might be to write a simple wrapper that the cron job triggers and see if it still occurs. I'm wondering if a lot of people aren't noticing it because they either don't sync 'large' amounts of files, or maybe they perform retries or something.
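For the "retries" idea, a minimal bash wrapper sketch for the cron job might look like this (the bucket, paths, time limit, and retry count are all assumptions; timeout is the GNU coreutils command, used so a hung attempt can't block the job forever):

#!/usr/bin/env bash
# Give each sync attempt an upper time limit and retry a couple of times on a non-zero exit.
set -u
for attempt in 1 2 3; do
    if timeout 60m aws s3 sync --size-only /mnt/ebs-volume/image/ s3://bucket-name; then
        echo "sync succeeded on attempt ${attempt}"
        exit 0
    fi
    echo "sync attempt ${attempt} failed or timed out; retrying" >&2
    sleep 60
done
exit 1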

Dmitry1987 commented 7 years ago

Weird, we're getting this all of a sudden on machines that have worked and done aws s3 sync for 2 years... it just started to hang after the upload completes. It uploads the backup and never exits, so all the bash scripts that have s3 sync in them never exit. Will try alternative versions of aws-cli...

ahgindia commented 6 years ago

I am facing the same issue currently. I am trying to sync a whole folder with lots of small files to an S3 bucket. After syncing about 1.5 GB of files, it just stops syncing, with no output or message from the aws s3 sync command either. I have also tried the --debug option as suggested earlier in this issue, but with the same result. It does not show anything after this line:

Completed 1.5 GB/~1.5 GB (408.7 KiB/s) with ~0 file(s) remaining (calculating...)

I have waited for about an hour for the script to complete or show some error, but there is no output; it is still running and waiting to complete.

How can I track what is going on after the last line above, or find out what exactly it is processing?

This issue was posted more than 9 months ago, but there is no update from any of the official members of this GitHub repository. It is very unresponsive behaviour not to provide any answer or solution to this.

Thanks, Ankit

gravyboat commented 6 years ago

@ahgindia If you're able to duplicate it using only 1.5GB of files (this is way smaller than where we've seen it) and you aren't getting any debug output, how about you start the process, then strace it and dump the output to a file? It would probably be a ton of data to sift through, but you could then review it for the point where it stops and see if there is anything relevant within the strace output, and we'd get a much better idea of what is happening at the system level when this issue occurs.
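For reference, one way to do that against the already-running process (the PID below is a placeholder) is:

# find the PID with e.g. pgrep -f 'aws s3 sync', then attach, follow threads, timestamp each syscall
strace -f -tt -p <pid-of-aws-s3-sync> -o /tmp/aws-sync-strace.log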

ahgindia commented 6 years ago

@gravyboat The total size of the sync operation is around 200GB, not only 1.5 GB, but the sync stopped after syncing only 1.5 GB of files. I tried to find out with the --debug flag where exactly it is stopping. Below is what was seen in the log while the sync operation was not doing anything; there were no debug messages after this for a long time. My aws version is aws-cli/1.14.16 Python/2.7.5 Linux/3.10.0-514.10.2.el7.x86_64 botocore/1.8.20

2018-01-12 06:44:10,970 - MainThread - botocore.hooks - DEBUG - Event needs-retry.s3.ListObjects: calling handler <botocore.retryhandler.RetryHandler object at 0x3027810>
2018-01-12 06:44:10,970 - MainThread - botocore.retryhandler - DEBUG - No retry needed.
2018-01-12 06:44:10,970 - MainThread - botocore.hooks - DEBUG - Event needs-retry.s3.ListObjects: calling handler <bound method S3RegionRedirector.redirect_from_error of <botocore.utils.S3RegionRedirector object at 0x3027890>>
2018-01-12 06:44:10,971 - MainThread - botocore.hooks - DEBUG - Event after-call.s3.ListObjects: calling handler <function decode_list_object at 0x1f09938>
2018-01-12 06:44:10,975 - MainThread - botocore.hooks - DEBUG - Event after-call.s3.ListObjects: calling handler <function enhance_error_msg at 0x27fe578>

Then after approximately 20 minutes, it started again with the following debug log:

2018-01-12 07:06:54,452 - MainThread - botocore.hooks - DEBUG - Event before-parameter-build.s3.ListObjects: calling handler <function set_list_objects_encoding_type_url at 0x1f098c0>
2018-01-12 07:06:54,452 - MainThread - botocore.hooks - DEBUG - Event before-parameter-build.s3.ListObjects: calling handler <function validate_bucket_name at 0x1f07848>
2018-01-12 07:06:54,452 - MainThread - botocore.hooks - DEBUG - Event before-parameter-build.s3.ListObjects: calling handler <bound method S3RegionRedirector.redirect_from_cache of <botocore.utils.S3RegionRedirector object at 0x3027890>>
2018-01-12 07:06:54,453 - MainThread - botocore.hooks - DEBUG - Event before-parameter-build.s3.ListObjects: calling handler <function generate_idempotent_uuid at 0x1f07500>

Let me know if this debug log is of any help in finding the issue.

One more thing to note here: yesterday I just left the aws s3 sync command running inside screen and let it complete by itself. After hanging in between, it completed with an exit status of 1 after a very long pause. So I assume that it failed to transfer some of the files in between and completed the sync operation with exit status 1. Ideally, if the full sync completes successfully, the exit status should be 0.

Thanks, Ankit

djdevin commented 6 years ago

Same thing here, I am watching it sit after:

Completed 6.4 MiB/~6.4 MiB (79.8 KiB/s) with ~0 file(s) remaining (calculating...)

then eventually exit.

aws-cli/1.11.133 Python/2.7.5 Linux/3.10.0-514.6.1.el7.x86_64 botocore/1.6.0

gravyboat commented 6 years ago

@djdevin Can you run an strace against the process since you're transferring so little data and see if there's anything worthwhile there?

ramyogi commented 6 years ago

Any suggestions? I am facing this issue uploading 40GB of files. The files are all uploaded as expected, but the process hangs and never completes.

gadelkareem commented 6 years ago

Here too, with just a few small files:

2018-03-22 14:05:13,242 - MainThread - botocore.hooks - DEBUG - Event needs-retry.s3.ListObjects: calling handler <bound method S3RegionRedirector.redirect_from_error of <botocore.utils.S3RegionRedirector object at 0x7f425a3ee5d0>>
2018-03-22 14:05:13,280 - MainThread - botocore.hooks - DEBUG - Event after-call.s3.ListObjects: calling handler <function decode_list_object at 0x7f425bbcd050>
2018-03-22 14:05:13,370 - MainThread - botocore.hooks - DEBUG - Event after-call.s3.ListObjects: calling handler <function enhance_error_msg at 0x7f425deac9b0>
Killed

It works sometimes on the same EC2 instance, but mostly it hangs and gets killed!

gadelkareem commented 6 years ago

Got it! It was running out of memory. Creating a 2G swapfile solved the problem!
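For anyone else hitting the memory symptom, a sketch of that fix (run as root; the file location and 2G size are assumptions) is:

# create and enable a 2 GiB swapfile
fallocate -l 2G /swapfile
chmod 600 /swapfile
mkswap /swapfile
swapon /swapfile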

gravyboat commented 6 years ago

I can confirm when this was occurring for me it was not a memory issue.

externl commented 6 years ago

For me it was because I was logging to my bucket; there were just too many log files when trying to sync from the top level of the bucket. It took me a few days to delete the log files, and everything worked fine after that.
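If deleting isn't practical, and assuming those are S3 server access logs, another option is to deliver them to a separate bucket so they stop piling up in the bucket being synced. A sketch with hypothetical bucket names (the target bucket also needs the usual log-delivery permissions):

# send access logs to a dedicated bucket instead of the one being synced
aws s3api put-bucket-logging --bucket my-data-bucket \
    --bucket-logging-status '{"LoggingEnabled":{"TargetBucket":"my-log-bucket","TargetPrefix":"access-logs/"}}'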

ajaceves commented 5 years ago

The issue is occurring for me with the newest version of awscli on Ubuntu 18.04. I am not running out of memory.

NicMAlexandre commented 4 years ago

Hi, I'm also getting aws hanging after installing it with conda.

I can confirm the aws command works and was successfully installed

conda create -n aws
source activate aws
conda install -c conda-forge awscli

aws s3 --no-sign-request sync s3://genomeark/species/Calypte_anna/bCalAnn1/genomic_data/10x/ .

2020-05-27 13:04:14,170 - MainThread - botocore.utils - DEBUG - Caught retryable HTTP exception while making metadata service request to http://169.254.169.254/latest/api/token: Could not connect to the endpoint URL: "http://169.254.169.254/latest/api/token" Traceback (most recent call last): File "/global/scratch/nalexandre/anaconda3/envs/aws/lib/python3.8/site-packages/urllib3/connection.py", line 159, in _new_conn conn = connection.create_connection( File "/global/scratch/nalexandre/anaconda3/envs/aws/lib/python3.8/site-packages/urllib3/util/connection.py", line 84, in create_connection raise err File "/global/scratch/nalexandre/anaconda3/envs/aws/lib/python3.8/site-packages/urllib3/util/connection.py", line 74, in create_connection sock.connect(sa) OSError: [Errno 113] No route to host

During handling of the above exception, another exception occurred:

Traceback (most recent call last): File "/global/scratch/nalexandre/anaconda3/envs/aws/lib/python3.8/site-packages/botocore/httpsession.py", line 254, in send urllib_response = conn.urlopen( File "/global/scratch/nalexandre/anaconda3/envs/aws/lib/python3.8/site-packages/urllib3/connectionpool.py", line 724, in urlopen retries = retries.increment( File "/global/scratch/nalexandre/anaconda3/envs/aws/lib/python3.8/site-packages/urllib3/util/retry.py", line 379, in increment raise six.reraise(type(error), error, _stacktrace) File "/global/scratch/nalexandre/anaconda3/envs/aws/lib/python3.8/site-packages/urllib3/packages/six.py", line 735, in reraise raise value File "/global/scratch/nalexandre/anaconda3/envs/aws/lib/python3.8/site-packages/urllib3/connectionpool.py", line 670, in urlopen httplib_response = self._make_request( File "/global/scratch/nalexandre/anaconda3/envs/aws/lib/python3.8/site-packages/urllib3/connectionpool.py", line 392, in _make_request conn.request(method, url, **httplib_request_kw) File "/global/scratch/nalexandre/anaconda3/envs/aws/lib/python3.8/http/client.py", line 1230, in request self._send_request(method, url, body, headers, encode_chunked) File "/global/scratch/nalexandre/anaconda3/envs/aws/lib/python3.8/site-packages/botocore/awsrequest.py", line 91, in _send_request rval = super(AWSConnection, self)._send_request( File "/global/scratch/nalexandre/anaconda3/envs/aws/lib/python3.8/http/client.py", line 1276, in _send_request self.endheaders(body, encode_chunked=encode_chunked) File "/global/scratch/nalexandre/anaconda3/envs/aws/lib/python3.8/http/client.py", line 1225, in endheaders self._send_output(message_body, encode_chunked=encode_chunked) File "/global/scratch/nalexandre/anaconda3/envs/aws/lib/python3.8/site-packages/botocore/awsrequest.py", line 119, in _send_output self.send(msg) File "/global/scratch/nalexandre/anaconda3/envs/aws/lib/python3.8/site-packages/botocore/awsrequest.py", line 203, in send return super(AWSConnection, self).send(str) File "/global/scratch/nalexandre/anaconda3/envs/aws/lib/python3.8/http/client.py", line 944, in send self.connect() File "/global/scratch/nalexandre/anaconda3/envs/aws/lib/python3.8/site-packages/urllib3/connection.py", line 187, in connect conn = self._new_conn() File "/global/scratch/nalexandre/anaconda3/envs/aws/lib/python3.8/site-packages/urllib3/connection.py", line 171, in _new_conn raise NewConnectionError( urllib3.exceptions.NewConnectionError: <botocore.awsrequest.AWSHTTPConnection object at 0x2b4daf6192e0>: Failed to establish a new connection: [Errno 113] No route to host

During handling of the above exception, another exception occurred:

Traceback (most recent call last): File "/global/scratch/nalexandre/anaconda3/envs/aws/lib/python3.8/site-packages/botocore/utils.py", line 296, in _fetch_metadata_token response = self._session.send(request.prepare()) File "/global/scratch/nalexandre/anaconda3/envs/aws/lib/python3.8/site-packages/botocore/httpsession.py", line 283, in send raise EndpointConnectionError(endpoint_url=request.url, error=e) botocore.exceptions.EndpointConnectionError: Could not connect to the endpoint URL: "http://169.254.169.254/latest/api/token" 2020-05-27 13:04:14,173 - MainThread - urllib3.connectionpool - DEBUG - Starting new HTTP connection (4): 169.254.169.254:80 2020-05-27 13:04:15,175 - MainThread - botocore.utils - DEBUG - Caught retryable HTTP exception while making metadata service request to http://169.254.169.254/latest/meta-data/iam/security-credentials/: Connect timeout on endpoint URL: "http://169.254.169.254/latest/meta-data/iam/security-credentials/" Traceback (most recent call last): File "/global/scratch/nalexandre/anaconda3/envs/aws/lib/python3.8/site-packages/urllib3/connection.py", line 159, in _new_conn conn = connection.create_connection( File "/global/scratch/nalexandre/anaconda3/envs/aws/lib/python3.8/site-packages/urllib3/util/connection.py", line 84, in create_connection raise err File "/global/scratch/nalexandre/anaconda3/envs/aws/lib/python3.8/site-packages/urllib3/util/connection.py", line 74, in create_connection sock.connect(sa) socket.timeout: timed out

During handling of the above exception, another exception occurred:

Traceback (most recent call last): File "/global/scratch/nalexandre/anaconda3/envs/aws/lib/python3.8/site-packages/botocore/httpsession.py", line 254, in send urllib_response = conn.urlopen( File "/global/scratch/nalexandre/anaconda3/envs/aws/lib/python3.8/site-packages/urllib3/connectionpool.py", line 724, in urlopen retries = retries.increment( File "/global/scratch/nalexandre/anaconda3/envs/aws/lib/python3.8/site-packages/urllib3/util/retry.py", line 379, in increment raise six.reraise(type(error), error, _stacktrace) File "/global/scratch/nalexandre/anaconda3/envs/aws/lib/python3.8/site-packages/urllib3/packages/six.py", line 735, in reraise raise value File "/global/scratch/nalexandre/anaconda3/envs/aws/lib/python3.8/site-packages/urllib3/connectionpool.py", line 670, in urlopen httplib_response = self._make_request( File "/global/scratch/nalexandre/anaconda3/envs/aws/lib/python3.8/site-packages/urllib3/connectionpool.py", line 392, in _make_request conn.request(method, url, **httplib_request_kw) File "/global/scratch/nalexandre/anaconda3/envs/aws/lib/python3.8/http/client.py", line 1230, in request self._send_request(method, url, body, headers, encode_chunked) File "/global/scratch/nalexandre/anaconda3/envs/aws/lib/python3.8/site-packages/botocore/awsrequest.py", line 91, in _send_request rval = super(AWSConnection, self)._send_request( File "/global/scratch/nalexandre/anaconda3/envs/aws/lib/python3.8/http/client.py", line 1276, in _send_request self.endheaders(body, encode_chunked=encode_chunked) File "/global/scratch/nalexandre/anaconda3/envs/aws/lib/python3.8/http/client.py", line 1225, in endheaders self._send_output(message_body, encode_chunked=encode_chunked) File "/global/scratch/nalexandre/anaconda3/envs/aws/lib/python3.8/site-packages/botocore/awsrequest.py", line 119, in _send_output self.send(msg) File "/global/scratch/nalexandre/anaconda3/envs/aws/lib/python3.8/site-packages/botocore/awsrequest.py", line 203, in send return super(AWSConnection, self).send(str) File "/global/scratch/nalexandre/anaconda3/envs/aws/lib/python3.8/http/client.py", line 944, in send self.connect() File "/global/scratch/nalexandre/anaconda3/envs/aws/lib/python3.8/site-packages/urllib3/connection.py", line 187, in connect conn = self._new_conn() File "/global/scratch/nalexandre/anaconda3/envs/aws/lib/python3.8/site-packages/urllib3/connection.py", line 164, in _new_conn raise ConnectTimeoutError( urllib3.exceptions.ConnectTimeoutError: (<botocore.awsrequest.AWSHTTPConnection object at 0x2b4daf619700>, 'Connection to 169.254.169.254 timed out. (connect timeout=1)')

During handling of the above exception, another exception occurred:

Traceback (most recent call last): File "/global/scratch/nalexandre/anaconda3/envs/aws/lib/python3.8/site-packages/botocore/utils.py", line 339, in _get_request response = self._session.send(request.prepare()) File "/global/scratch/nalexandre/anaconda3/envs/aws/lib/python3.8/site-packages/botocore/httpsession.py", line 287, in send raise ConnectTimeoutError(endpoint_url=request.url, error=e) botocore.exceptions.ConnectTimeoutError: Connect timeout on endpoint URL: "http://169.254.169.254/latest/meta-data/iam/security-credentials/" 2020-05-27 13:04:15,177 - MainThread - botocore.utils - DEBUG - Max number of attempts exceeded (1) when attempting to retrieve data from metadata service. 2020-05-27 13:04:15,177 - MainThread - botocore.hooks - DEBUG - Event choose-service-name: calling handler <function handle_service_name_alias at 0x2b4dae65bc10> 2020-05-27 13:04:15,180 - MainThread - botocore.hooks - DEBUG - Event creating-client-class.s3: calling handler <function add_generate_presigned_post at 0x2b4dae62ba60> 2020-05-27 13:04:15,181 - MainThread - botocore.hooks - DEBUG - Event creating-client-class.s3: calling handler <function add_generate_presigned_url at 0x2b4dae62b820> 2020-05-27 13:04:15,186 - MainThread - botocore.endpoint - DEBUG - Setting s3 timeout as (60, 60) 2020-05-27 13:04:15,188 - MainThread - botocore.client - DEBUG - Registering retry handlers for service: s3 2020-05-27 13:04:15,190 - MainThread - botocore.credentials - DEBUG - Looking for credentials via: env 2020-05-27 13:04:15,191 - MainThread - botocore.credentials - DEBUG - Looking for credentials via: assume-role 2020-05-27 13:04:15,191 - MainThread - botocore.credentials - DEBUG - Looking for credentials via: assume-role-with-web-identity 2020-05-27 13:04:15,191 - MainThread - botocore.credentials - DEBUG - Looking for credentials via: shared-credentials-file 2020-05-27 13:04:15,191 - MainThread - botocore.credentials - DEBUG - Looking for credentials via: custom-process 2020-05-27 13:04:15,191 - MainThread - botocore.credentials - DEBUG - Looking for credentials via: config-file 2020-05-27 13:04:15,192 - MainThread - botocore.credentials - DEBUG - Looking for credentials via: ec2-credentials-file 2020-05-27 13:04:15,192 - MainThread - botocore.credentials - DEBUG - Looking for credentials via: boto-config 2020-05-27 13:04:15,192 - MainThread - botocore.credentials - DEBUG - Looking for credentials via: container-role 2020-05-27 13:04:15,192 - MainThread - botocore.credentials - DEBUG - Looking for credentials via: iam-role 2020-05-27 13:04:15,193 - MainThread - urllib3.connectionpool - DEBUG - Starting new HTTP connection (5): 169.254.169.254:80 2020-05-27 13:04:16,194 - MainThread - botocore.utils - DEBUG - Caught retryable HTTP exception while making metadata service request to http://169.254.169.254/latest/api/token: Connect timeout on endpoint URL: "http://169.254.169.254/latest/api/token" Traceback (most recent call last): File "/global/scratch/nalexandre/anaconda3/envs/aws/lib/python3.8/site-packages/urllib3/connection.py", line 159, in _new_conn conn = connection.create_connection( File "/global/scratch/nalexandre/anaconda3/envs/aws/lib/python3.8/site-packages/urllib3/util/connection.py", line 84, in create_connection raise err File "/global/scratch/nalexandre/anaconda3/envs/aws/lib/python3.8/site-packages/urllib3/util/connection.py", line 74, in create_connection sock.connect(sa) socket.timeout: timed out

During handling of the above exception, another exception occurred:

Traceback (most recent call last): File "/global/scratch/nalexandre/anaconda3/envs/aws/lib/python3.8/site-packages/botocore/httpsession.py", line 254, in send urllib_response = conn.urlopen( File "/global/scratch/nalexandre/anaconda3/envs/aws/lib/python3.8/site-packages/urllib3/connectionpool.py", line 724, in urlopen retries = retries.increment( File "/global/scratch/nalexandre/anaconda3/envs/aws/lib/python3.8/site-packages/urllib3/util/retry.py", line 379, in increment raise six.reraise(type(error), error, _stacktrace) File "/global/scratch/nalexandre/anaconda3/envs/aws/lib/python3.8/site-packages/urllib3/packages/six.py", line 735, in reraise raise value File "/global/scratch/nalexandre/anaconda3/envs/aws/lib/python3.8/site-packages/urllib3/connectionpool.py", line 670, in urlopen httplib_response = self._make_request( File "/global/scratch/nalexandre/anaconda3/envs/aws/lib/python3.8/site-packages/urllib3/connectionpool.py", line 392, in _make_request conn.request(method, url, **httplib_request_kw) File "/global/scratch/nalexandre/anaconda3/envs/aws/lib/python3.8/http/client.py", line 1230, in request self._send_request(method, url, body, headers, encode_chunked) File "/global/scratch/nalexandre/anaconda3/envs/aws/lib/python3.8/site-packages/botocore/awsrequest.py", line 91, in _send_request rval = super(AWSConnection, self)._send_request( File "/global/scratch/nalexandre/anaconda3/envs/aws/lib/python3.8/http/client.py", line 1276, in _send_request self.endheaders(body, encode_chunked=encode_chunked) File "/global/scratch/nalexandre/anaconda3/envs/aws/lib/python3.8/http/client.py", line 1225, in endheaders self._send_output(message_body, encode_chunked=encode_chunked) File "/global/scratch/nalexandre/anaconda3/envs/aws/lib/python3.8/site-packages/botocore/awsrequest.py", line 119, in _send_output self.send(msg) File "/global/scratch/nalexandre/anaconda3/envs/aws/lib/python3.8/site-packages/botocore/awsrequest.py", line 203, in send return super(AWSConnection, self).send(str) File "/global/scratch/nalexandre/anaconda3/envs/aws/lib/python3.8/http/client.py", line 944, in send self.connect() File "/global/scratch/nalexandre/anaconda3/envs/aws/lib/python3.8/site-packages/urllib3/connection.py", line 187, in connect conn = self._new_conn() File "/global/scratch/nalexandre/anaconda3/envs/aws/lib/python3.8/site-packages/urllib3/connection.py", line 164, in _new_conn raise ConnectTimeoutError( urllib3.exceptions.ConnectTimeoutError: (<botocore.awsrequest.AWSHTTPConnection object at 0x2b4daf66f5e0>, 'Connection to 169.254.169.254 timed out. (connect timeout=1)')

During handling of the above exception, another exception occurred:

Traceback (most recent call last): File "/global/scratch/nalexandre/anaconda3/envs/aws/lib/python3.8/site-packages/botocore/utils.py", line 296, in _fetch_metadata_token response = self._session.send(request.prepare()) File "/global/scratch/nalexandre/anaconda3/envs/aws/lib/python3.8/site-packages/botocore/httpsession.py", line 287, in send raise ConnectTimeoutError(endpoint_url=request.url, error=e) botocore.exceptions.ConnectTimeoutError: Connect timeout on endpoint URL: "http://169.254.169.254/latest/api/token" 2020-05-27 13:04:16,197 - MainThread - urllib3.connectionpool - DEBUG - Starting new HTTP connection (6): 169.254.169.254:80 2020-05-27 13:04:17,180 - MainThread - botocore.utils - DEBUG - Caught retryable HTTP exception while making metadata service request to http://169.254.169.254/latest/meta-data/iam/security-credentials/: Could not connect to the endpoint URL: "http://169.254.169.254/latest/meta-data/iam/security-credentials/" Traceback (most recent call last): File "/global/scratch/nalexandre/anaconda3/envs/aws/lib/python3.8/site-packages/urllib3/connection.py", line 159, in _new_conn conn = connection.create_connection( File "/global/scratch/nalexandre/anaconda3/envs/aws/lib/python3.8/site-packages/urllib3/util/connection.py", line 84, in create_connection raise err File "/global/scratch/nalexandre/anaconda3/envs/aws/lib/python3.8/site-packages/urllib3/util/connection.py", line 74, in create_connection sock.connect(sa) OSError: [Errno 113] No route to host

During handling of the above exception, another exception occurred:

Traceback (most recent call last): File "/global/scratch/nalexandre/anaconda3/envs/aws/lib/python3.8/site-packages/botocore/httpsession.py", line 254, in send urllib_response = conn.urlopen( File "/global/scratch/nalexandre/anaconda3/envs/aws/lib/python3.8/site-packages/urllib3/connectionpool.py", line 724, in urlopen retries = retries.increment( File "/global/scratch/nalexandre/anaconda3/envs/aws/lib/python3.8/site-packages/urllib3/util/retry.py", line 379, in increment raise six.reraise(type(error), error, _stacktrace) File "/global/scratch/nalexandre/anaconda3/envs/aws/lib/python3.8/site-packages/urllib3/packages/six.py", line 735, in reraise raise value File "/global/scratch/nalexandre/anaconda3/envs/aws/lib/python3.8/site-packages/urllib3/connectionpool.py", line 670, in urlopen httplib_response = self._make_request( File "/global/scratch/nalexandre/anaconda3/envs/aws/lib/python3.8/site-packages/urllib3/connectionpool.py", line 392, in _make_request conn.request(method, url, **httplib_request_kw) File "/global/scratch/nalexandre/anaconda3/envs/aws/lib/python3.8/http/client.py", line 1230, in request self._send_request(method, url, body, headers, encode_chunked) File "/global/scratch/nalexandre/anaconda3/envs/aws/lib/python3.8/site-packages/botocore/awsrequest.py", line 91, in _send_request rval = super(AWSConnection, self)._send_request( File "/global/scratch/nalexandre/anaconda3/envs/aws/lib/python3.8/http/client.py", line 1276, in _send_request self.endheaders(body, encode_chunked=encode_chunked) File "/global/scratch/nalexandre/anaconda3/envs/aws/lib/python3.8/http/client.py", line 1225, in endheaders self._send_output(message_body, encode_chunked=encode_chunked) File "/global/scratch/nalexandre/anaconda3/envs/aws/lib/python3.8/site-packages/botocore/awsrequest.py", line 119, in _send_output self.send(msg) File "/global/scratch/nalexandre/anaconda3/envs/aws/lib/python3.8/site-packages/botocore/awsrequest.py", line 203, in send return super(AWSConnection, self).send(str) File "/global/scratch/nalexandre/anaconda3/envs/aws/lib/python3.8/http/client.py", line 944, in send self.connect() File "/global/scratch/nalexandre/anaconda3/envs/aws/lib/python3.8/site-packages/urllib3/connection.py", line 187, in connect conn = self._new_conn() File "/global/scratch/nalexandre/anaconda3/envs/aws/lib/python3.8/site-packages/urllib3/connection.py", line 171, in _new_conn raise NewConnectionError( urllib3.exceptions.NewConnectionError: <botocore.awsrequest.AWSHTTPConnection object at 0x2b4daf66f9d0>: Failed to establish a new connection: [Errno 113] No route to host

During handling of the above exception, another exception occurred:

Traceback (most recent call last): File "/global/scratch/nalexandre/anaconda3/envs/aws/lib/python3.8/site-packages/botocore/utils.py", line 339, in _get_request response = self._session.send(request.prepare()) File "/global/scratch/nalexandre/anaconda3/envs/aws/lib/python3.8/site-packages/botocore/httpsession.py", line 283, in send raise EndpointConnectionError(endpoint_url=request.url, error=e) botocore.exceptions.EndpointConnectionError: Could not connect to the endpoint URL: "http://169.254.169.254/latest/meta-data/iam/security-credentials/" 2020-05-27 13:04:17,182 - MainThread - botocore.utils - DEBUG - Max number of attempts exceeded (1) when attempting to retrieve data from metadata service. 2020-05-27 13:04:17,183 - MainThread - botocore.hooks - DEBUG - Event choose-service-name: calling handler <function handle_service_name_alias at 0x2b4dae65bc10> 2020-05-27 13:04:17,186 - MainThread - botocore.hooks - DEBUG - Event creating-client-class.s3: calling handler <function add_generate_presigned_post at 0x2b4dae62ba60> 2020-05-27 13:04:17,186 - MainThread - botocore.hooks - DEBUG - Event creating-client-class.s3: calling handler <function add_generate_presigned_url at 0x2b4dae62b820> 2020-05-27 13:04:17,193 - MainThread - botocore.endpoint - DEBUG - Setting s3 timeout as (60, 60) 2020-05-27 13:04:17,195 - MainThread - botocore.client - DEBUG - Registering retry handlers for service: s3 2020-05-27 13:04:17,198 - MainThread - awscli.customizations.s3.s3handler - DEBUG - Using a multipart threshold of 8388608 and a part size of 8388608 2020-05-27 13:04:17,199 - MainThread - botocore.hooks - DEBUG - Event choosing-s3-sync-strategy: calling handler <bound method BaseSync.use_sync_strategy of <awscli.customizations.s3.syncstrategy.sizeonly.SizeOnlySync object at 0x2b4daf15c940>> 2020-05-27 13:04:17,199 - MainThread - botocore.hooks - DEBUG - Event choosing-s3-sync-strategy: calling handler <bound method BaseSync.use_sync_strategy of <awscli.customizations.s3.syncstrategy.exacttimestamps.ExactTimestampsSync object at 0x2b4daf15c8b0>> 2020-05-27 13:04:17,199 - MainThread - botocore.hooks - DEBUG - Event choosing-s3-sync-strategy: calling handler <bound method BaseSync.use_sync_strategy of <awscli.customizations.s3.syncstrategy.delete.DeleteSync object at 0x2b4daf15cb20>> 2020-05-27 13:04:17,437 - MainThread - botocore.loaders - DEBUG - Loading JSON file: /global/scratch/nalexandre/anaconda3/envs/aws/lib/python3.8/site-packages/botocore/data/s3/2006-03-01/paginators-1.json 2020-05-27 13:04:17,438 - MainThread - botocore.hooks - DEBUG - Event before-parameter-build.s3.ListObjectsV2: calling handler <function set_list_objects_encoding_type_url at 0x2b4dae68b4c0> 2020-05-27 13:04:17,438 - MainThread - botocore.hooks - DEBUG - Event before-parameter-build.s3.ListObjectsV2: calling handler <function validate_bucket_name at 0x2b4dae6891f0> 2020-05-27 13:04:17,438 - MainThread - botocore.hooks - DEBUG - Event before-parameter-build.s3.ListObjectsV2: calling handler <bound method S3RegionRedirector.redirect_from_cache of <botocore.utils.S3RegionRedirector object at 0x2b4daf6c8220>> 2020-05-27 13:04:17,438 - MainThread - botocore.hooks - DEBUG - Event before-parameter-build.s3.ListObjectsV2: calling handler <bound method S3ArnParamHandler.handle_arn of <botocore.utils.S3ArnParamHandler object at 0x2b4daf6c82e0>> 2020-05-27 13:04:17,438 - MainThread - botocore.hooks - DEBUG - Event before-parameter-build.s3.ListObjectsV2: calling handler <function generate_idempotent_uuid at 0x2b4dae687dc0> 2020-05-27 
13:04:17,439 - MainThread - botocore.hooks - DEBUG - Event before-call.s3.ListObjectsV2: calling handler <function add_expect_header at 0x2b4dae689550> 2020-05-27 13:04:17,439 - MainThread - botocore.hooks - DEBUG - Event before-call.s3.ListObjectsV2: calling handler <bound method S3RegionRedirector.set_request_url of <botocore.utils.S3RegionRedirector object at 0x2b4daf6c8220>> 2020-05-27 13:04:17,439 - MainThread - botocore.hooks - DEBUG - Event before-call.s3.ListObjectsV2: calling handler <function inject_api_version_header_if_needed at 0x2b4dae68b8b0> 2020-05-27 13:04:17,439 - MainThread - botocore.endpoint - DEBUG - Making request for OperationModel(name=ListObjectsV2) with params: {'url_path': '/genomeark?list-type=2', 'query_string': {'prefix': 'species/Calypte_anna/bCalAnn1/genomic_data/10x/', 'encoding-type': 'url'}, 'method': 'GET', 'headers': {'User-Agent': 'aws-cli/1.18.67 Python/3.8.2 Linux/3.10.0-693.11.6.el7.x86_64 botocore/1.16.17'}, 'body': b'', 'url': 'https://s3.amazonaws.com/genomeark?list-type=2&prefix=species%2FCalypte_anna%2FbCalAnn1%2Fgenomic_data%2F10x%2F&encoding-type=url', 'context': {'client_region': 'us-east-1', 'client_config': <botocore.config.Config object at 0x2b4daf68b700>, 'has_streaming_input': False, 'auth_type': None, 'encoding_type_auto_set': True, 'signing': {'bucket': 'genomeark'}}} 2020-05-27 13:04:17,440 - MainThread - botocore.hooks - DEBUG - Event request-created.s3.ListObjectsV2: calling handler <bound method RequestSigner.handler of <botocore.signers.RequestSigner object at 0x2b4daf68b670>> 2020-05-27 13:04:17,440 - MainThread - botocore.hooks - DEBUG - Event choose-signer.s3.ListObjectsV2: calling handler <bound method ClientCreator._default_s3_presign_to_sigv2 of <botocore.client.ClientCreator object at 0x2b4daf66f0d0>> 2020-05-27 13:04:17,440 - MainThread - botocore.hooks - DEBUG - Event choose-signer.s3.ListObjectsV2: calling handler <function set_operation_specific_signer at 0x2b4dae687ca0> 2020-05-27 13:04:17,440 - MainThread - botocore.hooks - DEBUG - Event choose-signer.s3.ListObjectsV2: calling handler <function disable_signing at 0x2b4dae6894c0> 2020-05-27 13:04:17,440 - MainThread - botocore.hooks - DEBUG - Event before-sign.s3.ListObjectsV2: calling handler <bound method S3EndpointSetter.set_endpoint of <botocore.utils.S3EndpointSetter object at 0x2b4daf6c8370>> 2020-05-27 13:04:17,440 - MainThread - botocore.utils - DEBUG - Defaulting to S3 virtual host style addressing with path style addressing fallback. 
2020-05-27 13:04:17,441 - MainThread - botocore.utils - DEBUG - Checking for DNS compatible bucket for: https://s3.amazonaws.com/genomeark?list-type=2&prefix=species%2FCalypte_anna%2FbCalAnn1%2Fgenomic_data%2F10x%2F&encoding-type=url 2020-05-27 13:04:17,441 - MainThread - botocore.utils - DEBUG - URI updated to: https://genomeark.s3.amazonaws.com/?list-type=2&prefix=species%2FCalypte_anna%2FbCalAnn1%2Fgenomic_data%2F10x%2F&encoding-type=url 2020-05-27 13:04:17,441 - MainThread - botocore.endpoint - DEBUG - Sending http request: <AWSPreparedRequest stream_output=False, method=GET, url=https://genomeark.s3.amazonaws.com/?list-type=2&prefix=species%2FCalypte_anna%2FbCalAnn1%2Fgenomic_data%2F10x%2F&encoding-type=url, headers={'User-Agent': b'aws-cli/1.18.67 Python/3.8.2 Linux/3.10.0-693.11.6.el7.x86_64 botocore/1.16.17'}> 2020-05-27 13:04:17,442 - MainThread - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): genomeark.s3.amazonaws.com:443 2020-05-27 13:04:17,813 - MainThread - urllib3.connectionpool - DEBUG - https://genomeark.s3.amazonaws.com:443 "GET /?list-type=2&prefix=species%2FCalypte_anna%2FbCalAnn1%2Fgenomic_data%2F10x%2F&encoding-type=url HTTP/1.1" 200 None 2020-05-27 13:04:17,814 - MainThread - botocore.parsers - DEBUG - Response headers: {'x-amz-id-2': '0EALvW6oD2AodN+gySAE4Y8EMvUierp4ffpelG8hIEgbgHPfgEVzccIRpHspXy+rZPnE4Ey/EEA=', 'x-amz-request-id': 'B5E3C3907243A2A0', 'Date': 'Wed, 27 May 2020 20:04:18 GMT', 'x-amz-bucket-region': 'us-east-1', 'Content-Type': 'application/xml', 'Transfer-Encoding': 'chunked', 'Server': 'AmazonS3'} 2020-05-27 13:04:17,814 - MainThread - botocore.parsers - DEBUG - Response body: b'<?xml version="1.0" encoding="UTF-8"?>\ngenomearkspecies/Calypte_anna/bCalAnn1/genomic_data/10x/31000urlfalsespecies/Calypte_anna/bCalAnn1/genomic_data/10x/bCalAnn1_S1_L001_I1_001.fastq.gz2018-04-18T13:39:10.000Z"062c3189c89b6a96db228aa058c0642d-94"1570364925STANDARDspecies/Calypte_anna/bCalAnn1/genomic_data/10x/bCalAnn1_S1_L001_R1_001.fastq.gz2018-04-12T14:36:54.000Z"73ddb4e2a725b14cca712fb9c799add2-1982"16620848900STANDARDspecies/Calypte_anna/bCalAnn1/genomic_data/10x/bCalAnn1_S1_L001_R2_001.fastq.gz2018-04-05T03:06:02.000Z"113ec98df2470aea4adbb57b8ab7338f-2023"16961818985STANDARD' 2020-05-27 13:04:17,952 - MainThread - botocore.hooks - DEBUG - Event needs-retry.s3.ListObjectsV2: calling handler <botocore.retryhandler.RetryHandler object at 0x2b4daf6c81c0> 2020-05-27 13:04:17,952 - MainThread - botocore.retryhandler - DEBUG - No retry needed. 2020-05-27 13:04:17,953 - MainThread - botocore.hooks - DEBUG - Event needs-retry.s3.ListObjectsV2: calling handler <bound method S3RegionRedirector.redirect_from_error of <botocore.utils.S3RegionRedirector object at 0x2b4daf6c8220>> 2020-05-27 13:04:17,953 - MainThread - botocore.hooks - DEBUG - Event after-call.s3.ListObjectsV2: calling handler <function decode_list_object_v2 at 0x2b4dae68b5e0> 2020-05-27 13:04:17,953 - MainThread - botocore.hooks - DEBUG - Event after-call.s3.ListObjectsV2: calling handler <function enhance_error_msg at 0x2b4daf0a54c0>

abcydybcy commented 4 years ago

I have the same problem.

Running on aws-cli/1.14.5 Python/3.8.2 Linux/4.14.173-137.229.amzn2.x86_64 botocore/1.8.9

Command: aws s3 sync --debug --delete crawl/ "s3://$BUCKET"

2020-06-03 10:59:35,084 - ThreadPoolExecutor-0_3 - botocore.vendored.requests.packages.urllib3.connectionpool - DEBUG - "PUT /dep2-s3bucketstaging-10dr98ixwg0tq/wp-json/oembed/1.0/embed%3Furl%3Dhttp%3A%252F%252F34.253.159.232%252Fwordpress-aws%252F HTTP/1.1" 200 0
2020-06-03 10:59:35,085 - ThreadPoolExecutor-0_3 - botocore.parsers - DEBUG - Response headers: {'x-amz-id-2': 'zefCeHOaf0z/dGOgj0BKjf9Wpw8ooEWqn9qG3sG2FY7mGWPmWkWD0AIuQuKybmsN5TyKi//DSv8=', 'x-amz-request-id': '64480FD3E3BE38E8', 'date': 'Wed, 03 Jun 2020 10:59:36 GMT', 'etag': '"023e02d6f4c48ec920cd97e29a1dcc40"', 'content-length': '0', 'server': 'AmazonS3'}
2020-06-03 10:59:35,085 - ThreadPoolExecutor-0_3 - botocore.parsers - DEBUG - Response body:
b''
2020-06-03 10:59:35,086 - ThreadPoolExecutor-0_3 - botocore.hooks - DEBUG - Event needs-retry.s3.PutObject: calling handler <botocore.retryhandler.RetryHandler object at 0x7faac402e0a0>
2020-06-03 10:59:35,086 - ThreadPoolExecutor-0_3 - botocore.retryhandler - DEBUG - No retry needed.
2020-06-03 10:59:35,086 - ThreadPoolExecutor-0_3 - botocore.hooks - DEBUG - Event needs-retry.s3.PutObject: calling handler <bound method S3RegionRedirector.redirect_from_error of <botocore.utils.S3RegionRedirector object at 0x7faac402e100>>
2020-06-03 10:59:35,086 - ThreadPoolExecutor-0_3 - botocore.hooks - DEBUG - Event after-call.s3.PutObject: calling handler <function enhance_error_msg at 0x7faac43ddc10>
2020-06-03 10:59:35,087 - ThreadPoolExecutor-0_3 - s3transfer.utils - DEBUG - Releasing acquire 434/None
vladignatyev commented 4 years ago

The problem is still here. It seems the TLS connection hangs for some reason. Looks like a Python socket-related thing.

Environment: aws-cli/2.0.0dev1 Python/3.7.4 Darwin/19.4.0 botocore/2.0.0dev1

Log output during aws2 sync <folder> <bucket_uri>:


2020-06-21 15:16:23,351 - ThreadPoolExecutor-0_4 - botocore.parsers - DEBUG - Response body:
b'<?xml version="1.0" encoding="UTF-8"?>\n<Error><Code>RequestTimeout</Code><Message>Your socket connection to the server was not read from or written to within the timeout period. Idle connections will be closed.</Message><RequestId>D3A1154182FC0A08</RequestId><HostId>7sCx/xJQsKkN89pYIzQeNdJM9o8vIe638WbteOMZ29/Seys2K4uOnq2oVdd315DhOLJh2RQH7pU=</HostId></Error>'
2020-06-21 15:16:23,352 - ThreadPoolExecutor-0_4 - botocore.hooks - DEBUG - Event needs-retry.s3.PutObject: calling handler <botocore.retryhandler.RetryHandler object at 0x111742f50>
2020-06-21 15:16:23,352 - ThreadPoolExecutor-0_4 - botocore.retryhandler - DEBUG - retry needed: matching HTTP status and error code seen: 400, RequestTimeout
2020-06-21 15:16:23,352 - ThreadPoolExecutor-0_4 - botocore.retryhandler - DEBUG - Reached the maximum number of retry attempts: 5
2020-06-21 15:16:23,352 - ThreadPoolExecutor-0_4 - botocore.retryhandler - DEBUG - No retry needed.
2020-06-21 15:16:23,352 - ThreadPoolExecutor-0_4 - botocore.hooks - DEBUG - Event needs-retry.s3.PutObject: calling handler <bound method S3RegionRedirector.redirect_from_error of <botocore.utils.S3RegionRedirector object at 0x111742e90>>
2020-06-21 15:16:23,353 - ThreadPoolExecutor-0_4 - botocore.hooks - DEBUG - Event after-call.s3.PutObject: calling handler <function enhance_error_msg at 0x110948cb0>
2020-06-21 15:16:23,353 - ThreadPoolExecutor-0_4 - s3transfer.tasks - DEBUG - Exception raised.
Traceback (most recent call last):
  File "site-packages/s3transfer/tasks.py", line 126, in __call__
  File "site-packages/s3transfer/tasks.py", line 150, in _execute_main
  File "site-packages/s3transfer/upload.py", line 692, in _main
  File "site-packages/botocore/client.py", line 357, in _api_call
  File "site-packages/botocore/client.py", line 661, in _make_api_call
botocore.exceptions.ClientError: An error occurred (RequestTimeout) when calling the PutObject operation (reached max retries: 4): Your socket connection to the server was not read from or written to within the timeout period. Idle connections will be closed.
2020-06-21 15:16:23,354 - ThreadPoolExecutor-0_4 - s3transfer.utils - DEBUG - Releasing acquire 26/None
upload failed: _site/uncaught-domexception-btoa-on-window/index.html to s3://base64tool.com/uncaught-domexception-btoa-on-window/index.html An error occurred (RequestTimeout) when calling the PutObject operation (reached max retries: 4): Your socket connection to the server was not read from or written to within the timeout period. Idle connections will be closed.
2020-06-21 15:16:26,928 - ThreadPoolExecutor-0_1 - urllib3.connectionpool - DEBUG - https://s3.us-west-1.amazonaws.com:443 "PUT /base64tool.com/typeerror-bytes-like-object-required-when-b64encode/index.html HTTP/1.1" 400 None
2020-06-21 15:16:26,929 - ThreadPoolExecutor-0_1 - botocore.parsers - DEBUG - Response headers: {'x-amz-request-id': '958AD230B3F563E1', 'x-amz-id-2': 'e2Dg85YFr8e5ISqggvOLbcrJ0fpVy5NU4n5rBSVALgQctBihrGLXA9/LAACBg7lxd4U2FnPa1ic=', 'Content-Type': 'application/xml', 'Transfer-Encoding': 'chunked', 'Date': 'Sun, 21 Jun 2020 11:16:05 GMT', 'Connection': 'close', 'Server': 'AmazonS3'}
2020-06-21 15:16:26,929 - ThreadPoolExecutor-0_1 - botocore.parsers - DEBUG - Response body:
b'<?xml version="1.0" encoding="UTF-8"?>\n<Error><Code>RequestTimeout</Code><Message>Your socket connection to the server was not read from or written to within the timeout period. Idle connections will be closed.</Message><RequestId>958AD230B3F563E1</RequestId><HostId>e2Dg85YFr8e5ISqggvOLbcrJ0fpVy5NU4n5rBSVALgQctBihrGLXA9/LAACBg7lxd4U2FnPa1ic=</HostId></Error>'
vladignatyev commented 4 years ago

Just tried version aws-cli/2.0.24 Python/3.7.4 Darwin/19.4.0 botocore/2.0.0dev28 and I still have the problem.
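One mitigation worth trying, assuming the RequestTimeout errors come from too many concurrent uploads starving individual sockets, is to lower the CLI's S3 transfer concurrency; the values below are arbitrary examples:

# reduce how many parallel S3 requests the transfer manager opens, and shrink the multipart chunk size
aws configure set default.s3.max_concurrent_requests 4
aws configure set default.s3.multipart_chunksize 16MB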

esetnik commented 4 years ago

I am also having problems with aws s3 sync hanging indefinitely, but I don't see any errors in the logs when I run it with --debug. Strangely enough, the sync only hangs when running on Fargate v1.4.0; the exact same command works fine on Fargate v1.3.0.

2502101454 commented 4 years ago

my command was: aws s3 sync . s3://wz-machine-learning/test

I am facing this issue too. With the --debug option I can see some list-bucket logs, and there is a very large number of files under 's3://wz-machine-learning/test'. My guess is that the sync command iterates over all of the keys under your S3 prefix, which is a huge task when there are that many files there. When I changed to another, cleaner prefix, the issue was fixed.

kdaily commented 4 years ago

This is quite an old issue with a number of threads and symptoms. In order to effectively follow up on these, please open new tickets using our issue templates (https://github.com/aws/aws-cli/issues/new/choose) to provide more detail on the scenario where this occurs, including the size and number of files that are involved.

@esetnik if this continues to occur in Fargate, please open a separate ticket to follow up on.

@nicolasalexandre21 is this still occurring for you? If so, please open a separate ticket with information about the size of the sync you are trying to perform.

@vladignatyev if this is occurring for you, please open a new ticket with more details.

@abcydybcy does this occur for you with a more recent version of the AWS CLI? Version 1.14.5 is almost three years old. Please open a new issue if this still occurs.

Thanks!

github-actions[bot] commented 4 years ago

Greetings! It looks like this issue hasn’t been active in longer than a week. We encourage you to check if this is still an issue in the latest release. Because it has been longer than a week since the last update on this, and in the absence of more information, we will be closing this issue soon. If you find that this is still a problem, please feel free to provide a comment or add an upvote to prevent automatic closure, or if the issue is already closed, please feel free to open a new one.

NicMAlexandre commented 4 years ago

Hello,

You can close this issue if I posted it. I found a way around the issue.

Thank you,

Nicolas

antonycc commented 2 years ago

I see this issue is closed, but it still occurs for me when using both aws s3 sync and aws s3 cp with --recursive.

% aws --version
aws-cli/2.3.2 Python/3.9.7 Darwin/21.1.0 source/x86_64 prompt/off

Where I would expect the sync to terminate, instead I am held at:

Completed 307.2 KiB/~307.2 KiB (270.3 KiB/s) with ~0 file(s) remaining (calculating...)

The workaround that I am using to make progress is to combine aws s3api list-objects-v2 with single-item copies using aws s3 cp. For a given bucket, prefix, and target directory:

s3_crawl_bucket='an-s3-source-bucket'
s3_crawl_prefix='the-source-data/prefix/on/s3'
test_data_dir='my-local-data'

Instead of this:

aws s3 cp "s3://${s3_crawl_bucket}" "${test_data_dir?}" --recursive --exclude "*" --include "${s3_crawl_prefix?}*"

The following script is functionally equivalent:

aws s3api list-objects-v2 --bucket "${s3_crawl_bucket}" --prefix "${s3_crawl_prefix}" --output json \
    | jq '.Contents[] | .Key' --raw-output \
    | parallel --jobs 32 aws s3 cp "s3://${s3_crawl_bucket}/{}" "${test_data_dir?}/{}" \
    ;

(parallel --jobs 32 offsets the extended execution time of the individual copy.)

felipelalli commented 1 year ago

It's 2023 and this issue is closed, but it still occurs for me when using both aws s3 sync and aws s3 cp with --recursive. :cry:

It shows this and hangs: Completed 25.0 MiB/~25.0 MiB (10.2 MiB/s) with ~0 file(s) remaining (calculating...)

and I have to press CTRL+C for my bash script to continue to the next command.
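As a stopgap, the call can be bounded with GNU coreutils timeout, sending SIGINT (the same signal as CTRL+C) so the rest of the script keeps going by itself; the 30-minute limit, directory, and bucket name here are just examples:

# abort the sync with SIGINT after 30 minutes so the script continues
timeout --signal=INT 30m aws s3 sync ./local-dir s3://bucket-name || true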

mkjawadi commented 1 year ago

I see this issue is closed, but it still occurs for me when using both aws s3 sync and aws s3 cp with --recursive. I have tried the --exclude flag for a directory with a huge number of files, but it still hangs. Any solution/workaround? Thanks.

saxonww commented 2 months ago

One thing that took me a while to realize is that while I was trying to copy just a few files, the way I set the copy up made aws s3 sync consider much more of the bucket content than it needed to.

So for example, you have a directory structure like this:

content/
content/identifier/
content/identifier/subdir1/
content/identifier/subdir1/file.txt
content/identifier/subdir2/
content/identifier/subdir2/file.txt
content/identifier/subdir3/
content/identifier/subdir3/file.txt

You have an S3 bucket where identifier above is the prefix you want. When developing this solution, you might run:

aws s3 sync content s3://bucket

It runs quickly, sync is nearly immediate, works great!

Later on, you notice that the copy is extremely slow. It takes a long time to get started, and it seems to hang at the end. I did, and it brought me here; for ~80 files totaling ~15MB, aws s3 sync was 'hanging' for several minutes after copying all the files. It's because of how I wrote the command: over time, I accumulated a lot of identifier prefixes in the bucket, and the simple construction of the aws s3 sync command above made the CLI consider all of them.

Instead, write the command like this:

aws s3 sync content/identifier s3://bucket/identifier

This reduces the scope of what sync has to deal with substantially. It changed an 8-10 minute process into a sub-10-second process for me.

namnguyenhai commented 2 weeks ago

It's 2023 and this issue is closed, but it still occurs for me when using both aws s3 sync and aws s3 cp with --recursive. 😢

It shows this and hangs: Completed 25.0 MiB/~25.0 MiB (10.2 MiB/s) with ~0 file(s) remaining (calculating...)

and I have to press CTRL+C for my bash script to continue to the next command.

I have the same error as you. It doesn't happen every time.

Any update, guys? I have the same issue. It doesn't happen every time, but I must press CTRL+C to continue.