aws / amazon-cloudwatch-agent

CloudWatch Agent enables you to collect and export host-level metrics and logs on instances running Linux or Windows server.
MIT License

amazon/cloudwatch-agent:1.0.3 doesn't start on Fargate #1055

Closed tijmenb closed 7 months ago

tijmenb commented 8 months ago

Describe the bug

The image amazon/cloudwatch-agent:latest (amazon/cloudwatch-agent:1.0.3) doesn't start on ECS/Fargate.

Steps to reproduce

We're using this task definition (main container definitions omitted):

{
    "taskDefinitionArn": "arn:aws:ecs:eu-west-1:REDACTED:task-definition/scribe-api:560",
    "containerDefinitions": [
        // redacted
        {
            "name": "cloudwatch-agent",
            "image": "amazon/cloudwatch-agent:latest",
            "cpu": 0,
            "portMappings": [],
            "essential": true,
            "environment": [
                {
                    "name": "CW_CONFIG_CONTENT",
                    "value": "{\"metrics\":{\"metrics_collected\":{\"statsd\":{\"metrics_aggregation_interval\":300,\"metrics_collection_interval\":15,\"service_address\":\":8125\"}},\"namespace\":\"YYY\"}}"
                }
            ],
            "mountPoints": [],
            "volumesFrom": [],
            "logConfiguration": {
                "logDriver": "awslogs",
                "options": {
                    "awslogs-group": "/YYY",
                    "awslogs-region": "eu-west-1",
                    "awslogs-stream-prefix": "ecs"
                }
            }
        }
    ],
    "family": "scribe-api",
    "taskRoleArn": "arn:aws:iam::REDACTED:role/YYYTaskRole",
    "executionRoleArn": "arn:aws:iam::REDACTED:role/YYYECSExecutionRole",
    "networkMode": "awsvpc",
    "revision": 560,
    "volumes": [
        {
            "name": "project-files",
            "efsVolumeConfiguration": {
                "fileSystemId": "fs-REDACTED",
                "rootDirectory": "/"
            }
        }
    ],
    "status": "ACTIVE",
    "requiresAttributes": [
        {
            "name": "ecs.capability.execution-role-awslogs"
        },
        {
            "name": "com.amazonaws.ecs.capability.ecr-auth"
        },
        {
            "name": "com.amazonaws.ecs.capability.docker-remote-api.1.17"
        },
        {
            "name": "com.amazonaws.ecs.capability.logging-driver.awsfirelens"
        },
        {
            "name": "com.amazonaws.ecs.capability.task-iam-role"
        },
        {
            "name": "ecs.capability.execution-role-ecr-pull"
        },
        {
            "name": "com.amazonaws.ecs.capability.docker-remote-api.1.18"
        },
        {
            "name": "ecs.capability.task-eni"
        },
        {
            "name": "com.amazonaws.ecs.capability.logging-driver.awslogs"
        },
        {
            "name": "ecs.capability.efsAuth"
        },
        {
            "name": "com.amazonaws.ecs.capability.docker-remote-api.1.19"
        },
        {
            "name": "ecs.capability.firelens.fluentbit"
        },
        {
            "name": "ecs.capability.secrets.asm.environment-variables"
        },
        {
            "name": "ecs.capability.efs"
        },
        {
            "name": "com.amazonaws.ecs.capability.docker-remote-api.1.25"
        },
        {
            "name": "ecs.capability.secrets.asm.bootstrap.log-driver"
        }
    ],
    "placementConstraints": [],
    "compatibilities": [
        "EC2",
        "FARGATE"
    ],
    "requiresCompatibilities": [
        "FARGATE"
    ],
    "cpu": "2048",
    "memory": "4096",
    "registeredAt": "2024-02-23T10:51:44.870Z",
    "registeredBy": "REDACTED"
}
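The escaped CW_CONFIG_CONTENT value in the task definition above is hard to read inline. As a minimal sketch (plain Python standard library, nothing specific to the agent), it can be decoded and pretty-printed like this; the variable names are illustrative only:

```python
import json

# Value of CW_CONFIG_CONTENT from the task definition above, i.e. the string
# obtained after the outer task-definition JSON has been parsed once.
cw_config_content = (
    '{"metrics":{"metrics_collected":{"statsd":{'
    '"metrics_aggregation_interval":300,'
    '"metrics_collection_interval":15,'
    '"service_address":":8125"}},"namespace":"YYY"}}'
)

# Parse the embedded JSON document and print it with indentation.
config = json.loads(cw_config_content)
print(json.dumps(config, indent=2))
```

This shows a single StatsD listener on port 8125 publishing to the (redacted) namespace "YYY".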

What did you expect to see?

The container to start.

What did you see instead?

The container did not start:

{"level":"info","ts":"2024-02-23T10:10:34Z","msg":"Starting the Amazon CloudWatch Agent Operator","amazon-cloudwatch-agent-operator":"","cloudwatch-agent":"public.ecr.aws/cloudwatch-agent/cloudwatch-agent:0.0.0","auto-instrumentation-java":"public.ecr.aws/aws-observability/adot-autoinstrumentation-java:0.0.0","auto-instrumentation-python":"ghcr.io/open-telemetry/opentelemetry-operator/autoinstrumentation-python:0.0.0","build-date":"","go-version":"go1.20.14","go-arch":"amd64","go-os":"linux"}

{"level":"error","ts":"2024-02-23T10:10:34Z","logger":"controller-runtime.client.config","msg":"unable to load in-cluster config","error":"unable to load in-cluster configuration, KUBERNETES_SERVICE_HOST and KUBERNETES_SERVICE_PORT must be defined","stacktrace":"sigs.k8s.io/controller-runtime/pkg/client/config.loadConfig.func1\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.16.3/pkg/client/config/config.go:133\nsigs.k8s.io/controller-runtime/pkg/client/config.loadConfig\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.16.3/pkg/client/config/config.go:155\nsigs.k8s.io/controller-runtime/pkg/client/config.GetConfigWithContext\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.16.3/pkg/client/config/config.go:97\nsigs.k8s.io/controller-runtime/pkg/client/config.GetConfig\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.16.3/pkg/client/config/config.go:77\nsigs.k8s.io/controller-runtime/pkg/client/config.GetConfigOrDie\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.16.3/pkg/client/config/config.go:175\nmain.main\n\t/workspace/main.go:173\nruntime.main\n\t/usr/local/go/src/runtime/proc.go:250"}

{"level":"error","ts":"2024-02-23T10:10:34Z","logger":"controller-runtime.client.config","msg":"unable to get kubeconfig","error":"invalid configuration: no configuration has been provided, try setting KUBERNETES_MASTER environment variable","errorCauses":[{"error":"no configuration has been provided, try setting KUBERNETES_MASTER environment variable"}],"stacktrace":"sigs.k8s.io/controller-runtime/pkg/client/config.GetConfigOrDie\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.16.3/pkg/client/config/config.go:177\nmain.main\n\t/workspace/main.go:173\nruntime.main\n\t/usr/local/go/src/runtime/proc.go:250"}
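Note that the first log line announces "Starting the Amazon CloudWatch Agent Operator" and the errors come from controller-runtime looking for a Kubernetes cluster. A rough sketch of checking for that startup message in JSON-formatted log lines follows; the helper names are hypothetical and the substring match is an assumption based only on the output above:

```python
import json


def startup_message(log_line: str) -> str:
    """Return the 'msg' field of a JSON log line, or '' if it cannot be read."""
    try:
        record = json.loads(log_line)
    except json.JSONDecodeError:
        return ""
    if not isinstance(record, dict):
        return ""
    return str(record.get("msg", ""))


def looks_like_operator(log_line: str) -> bool:
    """Heuristic: True if the line is the operator's startup message."""
    return "Agent Operator" in startup_message(log_line)


# Shortened copy of the first log line from the output above.
first_line = (
    '{"level":"info","ts":"2024-02-23T10:10:34Z",'
    '"msg":"Starting the Amazon CloudWatch Agent Operator"}'
)
print(looks_like_operator(first_line))
```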

What version did you use?

amazon/cloudwatch-agent:1.0.3

What I find concerning is that this repository contains no reference to the Docker Hub release.

What config did you use?

See above.

Environment

Fargate on ECS.

Additional context

We've resolved this by pinning the version to amazon/cloudwatch-agent:1.300033.0b462
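The task definition above references the :latest tag, so the image can change between task launches. A small sketch for flagging such references in container definitions is below; the function name and the simplified parsing of image references are assumptions, not part of any AWS tooling:

```python
def uses_floating_tag(image: str) -> bool:
    """True if the image reference has no tag or the 'latest' tag."""
    if "@" in image:
        # Pinned by digest, e.g. repo@sha256:...
        return False
    # Look only at the final path segment so a registry host:port is ignored.
    last_segment = image.rsplit("/", 1)[-1]
    if ":" not in last_segment:
        # No explicit tag: 'latest' is implied.
        return True
    return last_segment.rsplit(":", 1)[1] == "latest"


print(uses_floating_tag("amazon/cloudwatch-agent:latest"))
print(uses_floating_tag("amazon/cloudwatch-agent:1.300033.0b462"))
```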

maazamalik commented 8 months ago

We've also resolved this by pinning the version to amazon/cloudwatch-agent:1.300033.0b462. Thanks @tijmenb

okankoAMZ commented 7 months ago

Hi, is this an ongoing issue? Have you tried newer versions of the agent, and does the issue persist?

tijmenb commented 7 months ago

@okankoAMZ I've just tried latest/1.300034.1b536 and the error doesn't occur there. I'll close this issue, but I'm keen to understand what the underlying cause of this was!

jefchien commented 7 months ago

The Amazon CloudWatch Agent is available in Docker Hub as a container image that can be configured for side-car or daemonset deployments. During a period on February 23, 2024, a container image containing an EKS operator was uploaded in its place. The CloudWatch agent container image was subsequently updated to latest. To ensure that you receive the latest features and patches, we recommend customers not pin to a specific tag or version.