aws / amazon-ecs-agent

Amazon Elastic Container Service Agent
http://aws.amazon.com/ecs/
Apache License 2.0

ECS Agent remains Connected=False #240

Closed sathiyas closed 8 years ago

sathiyas commented 8 years ago

The agent is able to register with the ECS cluster and its status shows as ACTIVE, but "Agent Connected" shows as false, so I can't run tasks. I was able to register a task definition.

I am behind a corporate proxy, and I am passing the extra variable https_proxy as follows. Registration failed until I passed the proxy; after setting https_proxy, registration succeeded. The agent is the latest from Docker Hub.

docker run --name ecs-agent \
  --detach=true \
  --restart=on-failure:10 \
  --volume=/var/run/docker.sock:/var/run/docker.sock \
  --volume=/var/log/ecs/:/log \
  --volume=/var/lib/ecs/data:/data \
  --volume=/sys/fs/cgroup:/sys/fs/cgroup:ro \
  --volume=/var/run/docker/execdriver/native:/var/lib/docker/execdriver/native:ro \
  --publish=127.0.0.1:51678:51678 \
  --env=ECS_LOGFILE=/log/ecs-agent.log \
  --env=ECS_LOGLEVEL=info \
  --env=ECS_DATADIR=/data \
  --env=ECS_CLUSTER=$1 \
  --env=https_proxy=http://x.x.x.x.:yyyy \
  private_repo/amazon-ecs-agent:101915

samuelkarp commented 8 years ago

Can you look at the Agent logs and post them here? You should find them in /var/log/ecs or by running docker logs ecs-agent. Note that you must also pass --env="NO_PROXY=169.254.169.254,/var/run/docker.sock".
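As a rough sketch of how the two proxy settings mentioned so far fit together, the helper below prints both --env flags so they can be spliced into a docker run command. The function name agent_proxy_flags and the proxy URL proxy.example.com:3128 are hypothetical placeholders; only the https_proxy variable name and the NO_PROXY value come from this thread.

```shell
# Hypothetical helper (not part of the ECS agent): print the proxy-related
# --env flags discussed in this thread, one per line.
agent_proxy_flags() {
  # $1 is the proxy URL; the caller supplies it (placeholder in the example).
  printf '%s\n' \
    "--env=https_proxy=$1" \
    "--env=NO_PROXY=169.254.169.254,/var/run/docker.sock"
}

# Example: show the flags that would be added to docker run.
agent_proxy_flags "http://proxy.example.com:3128"
```

Used unquoted, as in docker run ... $(agent_proxy_flags http://proxy.example.com:3128) amazon/amazon-ecs-agent:latest, the shell splits the output into two separate arguments. This assumes the proxy URL contains no whitespace.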

sathiyas commented 8 years ago

/var/ecs/log

2015-10-29T21:00:01Z [ERROR] Error connecting to TCS: dial tcp 54.239.21.73:443: i/o timeout module="tcs handler"
2015-10-29T21:00:01Z [INFO] Error from tcs; backing off module="tcs handler" err="dial tcp 54.239.21.73:443: i/o timeout"
2015-10-29T21:01:04Z [INFO] Creating poll dialer module="ws client" host="ecs-t-2.us-east-1.amazonaws.com"
2015-10-29T21:01:07Z [ERROR] Error connecting to TCS: dial tcp 54.239.21.73:443: i/o timeout module="tcs handler"
2015-10-29T21:01:07Z [INFO] Error from tcs; backing off module="tcs handler" err="dial tcp 54.239.21.73:443: i/o timeout"
2015-10-29T21:02:02Z [INFO] Creating poll dialer module="ws client" host="ecs-a-2.us-east-1.amazonaws.com"
2015-10-29T21:02:05Z [ERROR] Error connecting to ACS: dial tcp 54.239.19.128:443: i/o timeout module="acs handler"
2015-10-29T21:02:05Z [INFO] Error from acs; backing off module="acs handler" err="dial tcp 54.239.19.128:443: i/o timeout"
2015-10-29T21:02:16Z [INFO] Creating poll dialer module="ws client" host="ecs-t-2.us-east-1.amazonaws.com"
2015-10-29T21:02:20Z [ERROR] Error connecting to TCS: dial tcp 54.239.21.88:443: i/o timeout module="tcs handler"
2015-10-29T21:02:20Z [INFO] Error from tcs; backing off module="tcs handler" err="dial tcp 54.239.21.88:443: i/o timeout"
2015-10-29T21:03:26Z [INFO] Creating poll dialer module="ws client" host="ecs-t-2.us-east-1.amazonaws.com"
2015-10-29T21:03:29Z [ERROR] Error connecting to TCS: dial tcp 54.239.21.73:443: i/o timeout module="tcs handler"
2015-10-29T21:03:29Z [INFO] Error from tcs; backing off module="tcs handler" err="dial tcp 54.239.21.73:443: i/o timeout"
2015-10-29T21:04:24Z [INFO] Creating poll dialer module="ws client" host="ecs-a-2.us-east-1.amazonaws.com"
2015-10-29T21:04:27Z [ERROR] Error connecting to ACS: dial tcp 54.239.20.16:443: i/o timeout module="acs handler"
2015-10-29T21:04:27Z [INFO] Error from acs; backing off module="acs handler" err="dial tcp 54.239.20.16:443: i/o timeout"
2015-10-29T21:04:29Z [INFO] Creating poll dialer module="ws client" host="ecs-t-2.us-east-1.amazonaws.com"
2015-10-29T21:04:33Z [ERROR] Error connecting to TCS: dial tcp 54.239.21.73:443: i/o timeout module="tcs handler"
2015-10-29T21:04:33Z [INFO] Error from tcs; backing off module="tcs handler" err="dial tcp 54.239.21.73:443: i/o timeout"

samuelkarp commented 8 years ago

Can you include the complete log? Can you also verify that you're running Agent v1.6.0 (which we just released today)?

sathiyas commented 8 years ago

amazon/amazon-ecs-agent latest 63559db13ecd 3 days ago 9.036 MB

sathiyas commented 8 years ago

The docker pull says the image is 3 days old; should I pull a specific version instead of latest?

sathiyas commented 8 years ago

2015-10-29T21:00:01Z [ERROR] Error connecting to TCS: dial tcp 54.239.21.73:443: i/o timeout module="tcs handler"
2015-10-29T21:00:01Z [INFO] Error from tcs; backing off module="tcs handler" err="dial tcp 54.239.21.73:443: i/o timeout"
2015-10-29T21:01:04Z [INFO] Creating poll dialer module="ws client" host="ecs-t-2.us-east-1.amazonaws.com"
2015-10-29T21:01:07Z [ERROR] Error connecting to TCS: dial tcp 54.239.21.73:443: i/o timeout module="tcs handler"
2015-10-29T21:01:07Z [INFO] Error from tcs; backing off module="tcs handler" err="dial tcp 54.239.21.73:443: i/o timeout"
2015-10-29T21:02:02Z [INFO] Creating poll dialer module="ws client" host="ecs-a-2.us-east-1.amazonaws.com"
2015-10-29T21:02:05Z [ERROR] Error connecting to ACS: dial tcp 54.239.19.128:443: i/o timeout module="acs handler"
2015-10-29T21:02:05Z [INFO] Error from acs; backing off module="acs handler" err="dial tcp 54.239.19.128:443: i/o timeout"
2015-10-29T21:02:16Z [INFO] Creating poll dialer module="ws client" host="ecs-t-2.us-east-1.amazonaws.com"
2015-10-29T21:02:20Z [ERROR] Error connecting to TCS: dial tcp 54.239.21.88:443: i/o timeout module="tcs handler"
2015-10-29T21:02:20Z [INFO] Error from tcs; backing off module="tcs handler" err="dial tcp 54.239.21.88:443: i/o timeout"
2015-10-29T21:03:26Z [INFO] Creating poll dialer module="ws client" host="ecs-t-2.us-east-1.amazonaws.com"
2015-10-29T21:03:29Z [ERROR] Error connecting to TCS: dial tcp 54.239.21.73:443: i/o timeout module="tcs handler"
2015-10-29T21:03:29Z [INFO] Error from tcs; backing off module="tcs handler" err="dial tcp 54.239.21.73:443: i/o timeout"
2015-10-29T21:04:24Z [INFO] Creating poll dialer module="ws client" host="ecs-a-2.us-east-1.amazonaws.com"
2015-10-29T21:04:27Z [ERROR] Error connecting to ACS: dial tcp 54.239.20.16:443: i/o timeout module="acs handler"
2015-10-29T21:04:27Z [INFO] Error from acs; backing off module="acs handler" err="dial tcp 54.239.20.16:443: i/o timeout"
2015-10-29T21:04:29Z [INFO] Creating poll dialer module="ws client" host="ecs-t-2.us-east-1.amazonaws.com"
2015-10-29T21:04:33Z [ERROR] Error connecting to TCS: dial tcp 54.239.21.73:443: i/o timeout module="tcs handler"
2015-10-29T21:04:33Z [INFO] Error from tcs; backing off module="tcs handler" err="dial tcp 54.239.21.73:443: i/o timeout"
2015-10-29T21:05:42Z [INFO] Creating poll dialer module="ws client" host="ecs-t-2.us-east-1.amazonaws.com"
2015-10-29T21:05:45Z [ERROR] Error connecting to TCS: dial tcp 54.239.21.88:443: i/o timeout module="tcs handler"
2015-10-29T21:05:45Z [INFO] Error from tcs; backing off module="tcs handler" err="dial tcp 54.239.21.88:443: i/o timeout"
2015-10-29T21:06:39Z [INFO] Creating poll dialer module="ws client" host="ecs-a-2.us-east-1.amazonaws.com"
2015-10-29T21:06:42Z [ERROR] Error connecting to ACS: dial tcp 54.239.19.128:443: i/o timeout module="acs handler"
2015-10-29T21:06:42Z [INFO] Error from acs; backing off module="acs handler" err="dial tcp 54.239.19.128:443: i/o timeout"
2015-10-29T21:06:52Z [INFO] Creating poll dialer module="ws client" host="ecs-t-2.us-east-1.amazonaws.com"
2015-10-29T21:06:55Z [ERROR] Error connecting to TCS: dial tcp 54.239.21.73:443: i/o timeout module="tcs handler"
2015-10-29T21:06:55Z [INFO] Error from tcs; backing off module="tcs handler" err="dial tcp 54.239.21.73:443: i/o timeout"
2015-10-29T21:07:38Z [INFO] Saving state! module="statemanager"
2015-10-29T21:08:27Z [INFO] Starting Agent: Amazon ECS Agent - v1.5.0 (b197edd)
2015-10-29T21:08:27Z [INFO] Loading configuration
2015-10-29T21:08:27Z [INFO] Checkpointing is enabled. Attempting to load state
2015-10-29T21:08:27Z [INFO] Loading state! module="statemanager"
2015-10-29T21:08:27Z [INFO] Restored cluster 'chassis_ecs_poc'
2015-10-29T21:08:27Z [INFO] Detected Docker versions [1.17 1.18 1.19 1.20]
2015-10-29T21:08:27Z [INFO] Restored from checkpoint file. I am running as 'arn:aws:ecs:us-east-1:625275122486:container-instance/74420201-7708-4b9d-97ec-0c06bf5dbe53' in cluster 'chassis_ecs_poc'
2015-10-29T21:08:28Z [INFO] Registered! module="api client"
2015-10-29T21:08:28Z [INFO] Saving state! module="statemanager"
2015-10-29T21:08:28Z [INFO] Beginning Polling for updates
2015-10-29T21:08:28Z [INFO] Initializing stats engine module="stats"
2015-10-29T21:08:28Z [INFO] Creating poll dialer module="ws client" host="ecs-a-2.us-east-1.amazonaws.com"
2015-10-29T21:08:28Z [INFO] Creating poll dialer module="ws client" host="ecs-t-2.us-east-1.amazonaws.com"
2015-10-29T21:08:31Z [ERROR] Error connecting to ACS: dial tcp 54.239.19.128:443: i/o timeout module="acs handler"
2015-10-29T21:08:31Z [INFO] Error from acs; backing off module="acs handler" err="dial tcp 54.239.19.128:443: i/o timeout"
2015-10-29T21:08:31Z [ERROR] Error connecting to TCS: dial tcp 54.239.21.88:443: i/o timeout module="tcs handler"
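The log above includes a "Starting Agent" line that names the running agent version. As a minimal sketch, that version could be pulled out of a log file as follows. The function name agent_versions_in_log is hypothetical, and the pattern assumes the startup line keeps the "Starting Agent: Amazon ECS Agent - vX.Y.Z" shape seen in this log.

```shell
# Hypothetical helper: print every agent version announced in an ECS agent log.
# Matches lines shaped like the one in this thread:
#   ... [INFO] Starting Agent: Amazon ECS Agent - v1.5.0 (b197edd)
agent_versions_in_log() {
  # $1 is the path to the log file; only matching lines are printed.
  sed -n 's/.*Starting Agent: Amazon ECS Agent - \(v[0-9][0-9.]*\).*/\1/p' "$1"
}
```

With the volume mapping from the original docker run command, the log would presumably be at /var/log/ecs/ecs-agent.log on the host, so a call might look like agent_versions_in_log /var/log/ecs/ecs-agent.log.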

sathiyas commented 8 years ago

I pulled v1.6.0 and ran it, and it shows as connected. Could you please make latest and 1.6.0 point to the same image? I will test and close the issue.

euank commented 8 years ago

I verified that when I use the latest image (e.g. docker pull amazon/amazon-ecs-agent:latest && docker run -v /var/run/docker.sock:/var/run/docker.sock amazon/amazon-ecs-agent:latest -version) it prints out 1.6.0 as the version. The image ID you pasted above (63559db13ecd) is also what I see for both latest and v1.6.0. Although we published it today, the "3 days ago" Docker provides is when it was built and is correct.
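One way to repeat the image ID comparison locally is to read the IMAGE ID column from docker images output for each tag. This is only a sketch: image_id_for_tag is a hypothetical helper, and it assumes the default docker images column order (REPOSITORY, TAG, IMAGE ID, ...), as in the line pasted earlier in the thread.

```shell
# Hypothetical helper: read `docker images` output on stdin and print the
# IMAGE ID (third column) for the given repository ($1) and tag ($2).
image_id_for_tag() {
  awk -v repo="$1" -v tag="$2" '$1 == repo && $2 == tag { print $3 }'
}
```

For example, docker images | image_id_for_tag amazon/amazon-ecs-agent latest and the same call with v1.6.0 should print the same ID if both tags point at one image.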

The log message you show does appear to either be from an older agent or doesn't have a proxy successfully configured (it should have a proxy= message in Creating poll dialer in a new properly-configured agent).
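A quick way to check for this in a log file is to list the "Creating poll dialer" lines that carry no proxy= field. The sketch below assumes only what is said above, namely that a correctly configured newer agent includes proxy= somewhere on that line; the function name dialer_lines_without_proxy is hypothetical.

```shell
# Hypothetical helper: print "Creating poll dialer" log lines that do not
# mention proxy=, which would suggest the proxy setting was not picked up.
dialer_lines_without_proxy() {
  # $1 is the path to the agent log file.
  grep 'Creating poll dialer' "$1" | grep -v 'proxy='
}
```

If this prints nothing for a fresh log, every dialer line mentioned a proxy. Any output points at an older agent or at proxy environment variables that did not reach the container.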

Are you sure there's not any other difference between the successful and unsuccessful runs? Can you verify the version via the -version flag and that the environment variables are the same?

Thanks, Euan

sathiyas commented 8 years ago

When I ran latest, the version showed as 1.5. However, I will check again in case I overlooked something.

Thx

samuelkarp commented 8 years ago

Closing for now, please let us know if you continue to have problems.