aws / amazon-ssm-agent

An agent to enable remote management of your EC2 instances, on-premises servers, or virtual machines (VMs).
https://aws.amazon.com/systems-manager/
Apache License 2.0

aws ecs execute-command fails with TargetNotConnectedException error on EC2 #565

Closed RomanIzvozchikov closed 7 months ago

RomanIzvozchikov commented 7 months ago

Hello!

I am running an ECS cluster on EC2 instances. I started an ECS service, which created an ECS task. I am trying to connect to the task with the aws ecs execute-command command, but it fails with this exception:

An error occurred (TargetNotConnectedException) when calling the ExecuteCommand operation: The execute command failed due to an internal error. Try again later.

EC2 instance type: t4g.small
EC2 instance AMI: ami-0c6ec2a0a1beaee8c
EC2 instance SSM agent version: 3.2.2303.0

My ECS task is configured with a task role that contains the required permissions:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Action": [
                "ssmmessages:OpenDataChannel",
                "ssmmessages:OpenControlChannel",
                "ssmmessages:CreateDataChannel",
                "ssmmessages:CreateControlChannel"
            ],
            "Effect": "Allow",
            "Resource": "*",
            "Sid": "ECSExec"
        }
    ]
}

ECS Exec is enabled in my service:

aws ecs describe-tasks \
    --cluster <cluster_name> \
    --tasks <task_name>

...
"enableExecuteCommand": true
...
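The same check can be scripted. This is a minimal sketch, assuming the describe-tasks JSON is piped in on stdin; `exec_enabled` is a hypothetical helper name, and the match is a crude text search rather than real JSON parsing.

```shell
#!/bin/sh
# Hypothetical helper (not part of the AWS CLI): succeeds if the JSON on
# stdin contains "enableExecuteCommand": true. This is a plain text match,
# so it cannot tell which task the flag belongs to when several are listed.
exec_enabled() {
    grep -Eq '"enableExecuteCommand"[[:space:]]*:[[:space:]]*true'
}

# Usage (placeholders as in the command above, not executed here):
#   aws ecs describe-tasks --cluster <cluster_name> --tasks <task_name> | exec_enabled \
#       && echo "ECS Exec enabled"
```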

The Session Manager plugin is installed:

session-manager-plugin

The Session Manager plugin was installed successfully. Use the AWS CLI to start a session.

I don't get any red lines when I run the check-ecs-exec.sh script:

./check-ecs-exec.sh dev <task_id>
-------------------------------------------------------------
Prerequisites for check-ecs-exec.sh v0.7
-------------------------------------------------------------
  jq      | OK (/usr/local/bin/jq)
  AWS CLI | OK (/usr/local/bin/aws)

-------------------------------------------------------------
Prerequisites for the AWS CLI to use ECS Exec
-------------------------------------------------------------
  AWS CLI Version        | OK (aws-cli/2.13.19 Python/3.11.5 Darwin/23.4.0 exe/x86_64 prompt/off)
  Session Manager Plugin | OK (1.2.553.0)

-------------------------------------------------------------
Checks on ECS task and other resources
-------------------------------------------------------------
Region : eu-west-1
Cluster: dev
Task   : <task_id>
-------------------------------------------------------------
  Cluster Configuration  |
     KMS Key       : Not Configured
     Audit Logging : DEFAULT
     S3 Bucket Name: Not Configured
     CW Log Group  : Not Configured
  Can I ExecuteCommand?  | <my_role_arn>
     ecs:ExecuteCommand: allowed
     ssm:StartSession denied?: allowed
  Task Status            | RUNNING
  Launch Type            | EC2
  ECS Agent Version      | 1.82.2
  Exec Enabled for Task  | OK
  Container-Level Checks | 
    ----------
      Managed Agent Status
    ----------
         1. RUNNING for "<container_name>"
    ----------
      Init Process Enabled (<container_name>:5)
    ----------
         1. Enabled - "<container_name>"
    ----------
      Read-Only Root Filesystem (<container_name>:5)
    ----------
         1. Disabled - "<container_name>"
  Task Role Permissions  | <task_role_arn>
     ssmmessages:CreateControlChannel: allowed
     ssmmessages:CreateDataChannel: allowed
     ssmmessages:OpenControlChannel: allowed
     ssmmessages:OpenDataChannel: allowed
  VPC Endpoints          | SKIPPED (<vpc_id> - No additional VPC endpoints required)
  Environment Variables  | (<container_name>:5)
       1. container "<container_name>"
       - AWS_ACCESS_KEY: not defined
       - AWS_ACCESS_KEY_ID: not defined
       - AWS_SECRET_ACCESS_KEY: not defined

I can connect to the instance that hosts my ECS task using the aws ssm start-session command:

aws ssm start-session --target <instance_id>

I cannot connect to the container in my task using the aws ssm start-session command:

aws ssm start-session --target ecs:dev_<task_id>_<container_id> --parameters '{"command":["/bin/bash"]}'

An error occurred (TargetNotConnected) when calling the StartSession operation: ecs:dev_<task_id>_<container_id> is not connected.
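For reference, the target string in the command above has the form ecs:<cluster>_<task_id>_<container_id>. A minimal sketch of building it follows. `ecs_exec_target` is a hypothetical helper, and the container ID is assumed to be the runtimeId value that aws ecs describe-tasks reports for the container.

```shell
#!/bin/sh
# Hypothetical helper: joins the three parts into the target string
# passed to `aws ssm start-session --target`.
ecs_exec_target() {
    # $1 = cluster name, $2 = task ID, $3 = container ID (assumed: runtimeId)
    printf 'ecs:%s_%s_%s' "$1" "$2" "$3"
}

# Usage (placeholders, not executed here):
#   aws ssm start-session --target "$(ecs_exec_target dev <task_id> <container_id>)"
```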

I noticed that the task role created for the ECS task has never been used (the 'Last activity' field in the AWS console shows '-').

Please help me solve this problem. Any help is welcome!

ziwangj commented 7 months ago

Thanks for reaching out. Could you please try running the ECS task with --enable-execute-command to see if that solves the issue? If not, can you please provide the following information for further investigation?

  1. Your task id and container id
  2. The agent and SSMCLI version:
     https://docs.aws.amazon.com/systems-manager/latest/userguide/plugin-version-history.html
     https://docs.aws.amazon.com/systems-manager/latest/userguide/ssm-agent-get-version.html
  3. The agent log: https://docs.aws.amazon.com/systems-manager/latest/userguide/sysman-agent-logs.html

RomanIzvozchikov commented 7 months ago

@ziwangj thanks a lot for your support! I found the root cause of this issue: I hadn't added a security group rule that permits outbound traffic from the ECS service. This security group is applied to the ECS service.
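
For anyone hitting the same error, a missing outbound rule like the one described above could be added roughly as follows. This is a sketch under assumptions: the comment does not say which rule was added, so outbound HTTPS (TCP 443) to anywhere is a guess based on ECS Exec reaching the ssmmessages endpoint over HTTPS. `https_egress_permission` is a hypothetical helper and <security_group_id> is a placeholder.

```shell
#!/bin/sh
# Hypothetical helper: prints the --ip-permissions shorthand for an
# outbound HTTPS rule. The 0.0.0.0/0 default is an assumption; pass a
# narrower CIDR as $1 if your setup allows it.
https_egress_permission() {
    cidr="${1:-0.0.0.0/0}"
    printf 'IpProtocol=tcp,FromPort=443,ToPort=443,IpRanges=[{CidrIp=%s}]' "$cidr"
}

# Usage (not executed here):
#   aws ec2 authorize-security-group-egress \
#       --group-id <security_group_id> \
#       --ip-permissions "$(https_egress_permission)"
```

After the rule is in place, the managed agent in the container may need a new task launch before aws ecs execute-command connects.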