aws-samples / lambda-ecs-worker-pattern

This example code illustrates how to extend AWS Lambda functionality using Amazon SQS and the Amazon EC2 Container Service (ECS).
Apache License 2.0

Capturing Errors within Container #3

Closed: bollig closed this issue 8 years ago

bollig commented 8 years ago

How do you propose we capture errors within containers/tasks? Perhaps it is worth posting to an SNS topic to get email alerts, but the alert itself is not as useful as the stderr/stdout logs. Is there any way to access "docker logs" via the CLI after a task stops?

I would also expect the ability to see the output from "docker logs" when I inspect the tasks on the AWS console.

glez-aws commented 8 years ago

At the moment, there are multiple options for capturing errors within a container. For example, this post shows how logs can be sent to CloudWatch Logs for centralized processing: https://blogs.aws.amazon.com/application-management/post/TxFRDMTMILAA8X/Send-ECS-Container-Logs-to-CloudWatch-Logs-for-Centralized-Monitoring

However, this pattern code example assumes that the ECS cluster (including its logs) is managed independently from this particular Lambda integration, so any real-world implementation would need to integrate with whatever the centralized logging solution is.

If you’re looking for something to use right now, and assuming that your ECS cluster is within your control, then I would propose following the solution outlined in the blog post above.

Hope this helps, Constantin

glez-aws commented 8 years ago

Closing issue, hoping the above answer is good enough. Otherwise, feel free to reopen.

bollig commented 8 years ago

To follow up: the link above is sufficient for getting syslog info from inside a Docker container, but stdout and stderr from the ENTRYPOINT and CMD are not routed to syslog internally. My solution adds this to each container definition in my task definition:

    "logConfiguration": {
        "logDriver": "syslog",
        "options": {
            "tag": "{{.ImageID}}/{{.ID}}"
        }
    },
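For context, here is a hedged sketch of a complete task definition wrapping that fragment. The family, container name, image, and command below are placeholders, not values from this thread; only the logConfiguration block comes from the comment above.

```shell
# Hypothetical minimal task definition showing where the logConfiguration
# fragment sits; "worker", busybox, and the command are placeholder values.
cat > /tmp/task-def.json <<'EOF'
{
  "family": "worker",
  "containerDefinitions": [
    {
      "name": "worker",
      "image": "busybox",
      "memory": 128,
      "command": ["echo", "hello"],
      "logConfiguration": {
        "logDriver": "syslog",
        "options": {
          "tag": "{{.ImageID}}/{{.ID}}"
        }
      }
    }
  ]
}
EOF

# Sanity-check the JSON before registering (registration needs AWS credentials):
python3 -m json.tool /tmp/task-def.json > /dev/null && echo "valid JSON"
# aws ecs register-task-definition --cli-input-json file:///tmp/task-def.json
```

Depending on agent version, the instance may also need the syslog driver whitelisted via `ECS_AVAILABLE_LOGGING_DRIVERS` in /etc/ecs/ecs.config before ECS will accept a task that uses it.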

This way, stdout and stderr from all containers go to the ECS instance's syslog. Then I use the following user data script (in my ECS cluster's launch configuration) to route Docker syslog traffic on each instance to /var/log/docker_run.log and keep it out of /var/log/messages. Both docker_run.log and messages are on the list of files pushed to CloudWatch by the awslogs agent.

Content-Type: multipart/mixed; boundary="==BOUNDARY=="
MIME-Version: 1.0

--==BOUNDARY==
MIME-Version: 1.0
Content-Type: text/x-shellscript; charset="us-ascii"

#!/bin/bash

# Trigger inclusion into ECS Cluster (default)
echo ECS_CLUSTER=default >> /etc/ecs/ecs.config
echo ECS_LOGLEVEL=debug >> /etc/ecs/ecs.config

# Install the awslogs agent
yum install -y awslogs jq

# Inject the CloudWatch Logs configuration file contents
cat > /etc/awslogs/awslogs.conf <<- EOF
[general]
state_file = /var/lib/awslogs/agent-state

[/var/log/dmesg]
file = /var/log/dmesg
log_group_name = {cluster}-cluster-admin
log_stream_name = instance--{container_instance_id}--{instance_id}--{ip_address}-dmesg

[/var/log/messages]
file = /var/log/messages
log_group_name = {cluster}-cluster-admin
log_stream_name = instance--{container_instance_id}--{instance_id}--{ip_address}-messages
datetime_format = %b %d %H:%M:%S

[/var/log/docker]
file = /var/log/docker
log_group_name = {cluster}-cluster-admin
log_stream_name = instance--{container_instance_id}--{instance_id}--{ip_address}-docker
datetime_format = %Y-%m-%dT%H:%M:%S.%f

[/var/log/ecs/ecs-init.log]
file = /var/log/ecs/ecs-init.log.*
log_group_name = {cluster}-cluster-admin
log_stream_name = instance--{container_instance_id}--{instance_id}--{ip_address}-ecs-init.log
datetime_format = %Y-%m-%dT%H:%M:%SZ

[/var/log/ecs/ecs-agent.log]
file = /var/log/ecs/ecs-agent.log.*
log_group_name = {cluster}-cluster-admin
log_stream_name = instance--{container_instance_id}--{instance_id}--{ip_address}-ecs-agent.log
datetime_format = %Y-%m-%dT%H:%M:%SZ

[/var/log/docker_run.log]
file = /var/log/docker_run.log
log_group_name = job-logs
log_stream_name = instance--{container_instance_id}--{instance_id}--{ip_address}
datetime_format = %b %d %H:%M:%S
EOF

# Tell rsyslog to log docker command syslogs to /var/log/docker_run.log
cat > /etc/rsyslog.d/10-docker.conf <<- EOF
# /etc/rsyslog.d/10-docker.conf
\$template DockerLogs, "/var/log/docker_run.log"
if \$programname == 'docker' then -?DockerLogs
& stop
EOF

--==BOUNDARY==
MIME-Version: 1.0
Content-Type: text/upstart-job; charset="us-ascii"

#upstart-job
description "Configure and start CloudWatch Logs agent on Amazon ECS container instance"
author "Amazon Web Services"
start on started ecs

script
    exec 2>>/var/log/ecs/cloudwatch-logs-start.log
    set -x

    until curl -s http://localhost:51678/v1/metadata
    do
        sleep 1
    done

    # Grab the cluster and container instance ARN from instance metadata
    cluster=$(curl -s http://localhost:51678/v1/metadata | jq -r '. | .Cluster')
    container_instance_id=$(curl -s http://localhost:51678/v1/metadata | jq -r '. | .ContainerInstanceArn' | awk -F/ '{print $2}' )

    # Replace the cluster name and container instance ID placeholders with the actual values
    sed -i -e "s/{cluster}/$cluster/g" /etc/awslogs/awslogs.conf
    sed -i -e "s/{container_instance_id}/$container_instance_id/g" /etc/awslogs/awslogs.conf

    # Ensure that docker exec logs ONLY go to /var/log/docker_run.log and not /var/log/messages
    sed -i -e "s/authpriv.none/authpriv.none,docker.none/" /etc/rsyslog.conf

    service rsyslog restart
    service awslogs start
    chkconfig awslogs on
end script
--==BOUNDARY==--
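To see what the substitution step in the upstart job produces, here is a small sketch that replays the same jq/awk/sed logic against a stubbed metadata document instead of the live agent endpoint on port 51678. The cluster name and container instance ARN are made-up example values, and it assumes jq is installed locally (the user data script installs it on the instance):

```shell
# Stub standing in for the agent's http://localhost:51678/v1/metadata response.
metadata='{"Cluster":"default","ContainerInstanceArn":"arn:aws:ecs:us-east-1:123456789012:container-instance/f9cc75bb-0c94-46b9-bf6d-49d320bc1551"}'

# Same extraction as the upstart job: cluster name and the ID after the last "/".
cluster=$(echo "$metadata" | jq -r '.Cluster')
container_instance_id=$(echo "$metadata" | jq -r '.ContainerInstanceArn' | awk -F/ '{print $2}')

# Apply the same sed substitutions to a sample of the awslogs.conf placeholders.
printf '%s\n' 'log_group_name = {cluster}-cluster-admin' \
              'log_stream_name = instance--{container_instance_id}' > /tmp/awslogs-sample.conf
sed -i -e "s/{cluster}/$cluster/g" \
       -e "s/{container_instance_id}/$container_instance_id/g" /tmp/awslogs-sample.conf
cat /tmp/awslogs-sample.conf
# → log_group_name = default-cluster-admin
# → log_stream_name = instance--f9cc75bb-0c94-46b9-bf6d-49d320bc1551
```

Note that newer container instance ARNs include the cluster name as an extra path segment, in which case the `awk -F/ '{print $2}'` field index would pick up the cluster name rather than the instance ID.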
glez-aws commented 8 years ago

Thanks for sharing!