aws / amazon-cloudwatch-agent

CloudWatch Agent enables you to collect and export host-level metrics and logs on instances running Linux or Windows server.
MIT License

Metadata error logs in Cloudwatch agent - 404 EC2MetadataError: failed to make EC2Metadata request #1435

Open lesterxy opened 1 week ago

lesterxy commented 1 week ago

Describe the bug

The CloudWatch agent logs the following error repeatedly: status code: 404, request id: D! should retry true for imds error : EC2MetadataError: failed to make EC2Metadata request

This only happens on v1.300049. On v1.300048.1b904, the agent does not log these errors.

Steps to reproduce

  1. Create an Ubuntu 22.04.5 LTS EC2 instance.

  2. Create an IAM role with the "CloudWatchAgentServerPolicy" managed policy and attach it to the EC2 instance.

  3. Install the CloudWatch agent using the documentation (as the root user):

     3.1. wget https://amazoncloudwatch-agent.s3.amazonaws.com/ubuntu/amd64/latest/amazon-cloudwatch-agent.deb
     3.3. dpkg -i -E ./amazon-cloudwatch-agent.deb
     3.4. /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-config-wizard
     3.5. /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl -a fetch-config -m ec2 -s -c file:/opt/aws/amazon-cloudwatch-agent/bin/config.json

  4. Check the agent status: systemctl status amazon-cloudwatch-agent

What did you expect to see? No errors in the status logs.

What did you see instead?

● amazon-cloudwatch-agent.service - Amazon CloudWatch Agent
     Loaded: loaded (/etc/systemd/system/amazon-cloudwatch-agent.service; enabled; vendor preset: enabled)
     Active: active (running) since Wed 2024-11-20 09:34:55 UTC; 55s ago
   Main PID: 1456 (amazon-cloudwat)
      Tasks: 7 (limit: 1078)
     Memory: 21.0M
        CPU: 389ms
     CGroup: /system.slice/amazon-cloudwatch-agent.service
             └─1456 /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent -config /opt/aws/amazon-cloudwatch-agent/etc/amazon>

Nov 20 09:34:55 ip-172-31-20-11 start-amazon-cloudwatch-agent[1456]: <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
Nov 20 09:34:55 ip-172-31-20-11 start-amazon-cloudwatch-agent[1456]:                  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transi>
Nov 20 09:34:55 ip-172-31-20-11 start-amazon-cloudwatch-agent[1456]: <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang=">
Nov 20 09:34:55 ip-172-31-20-11 start-amazon-cloudwatch-agent[1456]:  <head>
Nov 20 09:34:55 ip-172-31-20-11 start-amazon-cloudwatch-agent[1456]:   <title>404 - Not Found</title>
Nov 20 09:34:55 ip-172-31-20-11 start-amazon-cloudwatch-agent[1456]:  </head>
Nov 20 09:34:55 ip-172-31-20-11 start-amazon-cloudwatch-agent[1456]:  <body>
Nov 20 09:34:55 ip-172-31-20-11 start-amazon-cloudwatch-agent[1456]:   <h1>404 - Not Found</h1>
Nov 20 09:34:55 ip-172-31-20-11 start-amazon-cloudwatch-agent[1456]:  </body>
Nov 20 09:34:55 ip-172-31-20-11 start-amazon-cloudwatch-agent[1456]: </html>

From journalctl, this log keeps repeating:

Nov 20 09:39:55 ip-172-31-20-11 start-amazon-cloudwatch-agent[1456]:         status code: 404, request id: D! should retry true for imds error : EC2MetadataError: failed to make EC2Metadata request
Nov 20 09:39:55 ip-172-31-20-11 start-amazon-cloudwatch-agent[1456]: <?xml version="1.0" encoding="iso-8859-1"?>
Nov 20 09:39:55 ip-172-31-20-11 start-amazon-cloudwatch-agent[1456]: <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
Nov 20 09:39:55 ip-172-31-20-11 start-amazon-cloudwatch-agent[1456]:                  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
Nov 20 09:39:55 ip-172-31-20-11 start-amazon-cloudwatch-agent[1456]: <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
Nov 20 09:39:55 ip-172-31-20-11 start-amazon-cloudwatch-agent[1456]:  <head>
Nov 20 09:39:55 ip-172-31-20-11 start-amazon-cloudwatch-agent[1456]:   <title>404 - Not Found</title>
Nov 20 09:39:55 ip-172-31-20-11 start-amazon-cloudwatch-agent[1456]:  </head>
Nov 20 09:39:55 ip-172-31-20-11 start-amazon-cloudwatch-agent[1456]:  <body>
Nov 20 09:39:55 ip-172-31-20-11 start-amazon-cloudwatch-agent[1456]:   <h1>404 - Not Found</h1>
Nov 20 09:39:55 ip-172-31-20-11 start-amazon-cloudwatch-agent[1456]:  </body>
Nov 20 09:39:55 ip-172-31-20-11 start-amazon-cloudwatch-agent[1456]: </html>
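
To get a rough idea of how often the error repeats, the journal output can be filtered for the error string. This is a minimal sketch, assuming the default journalctl line format shown above (month, day, time as the first three fields); the function names are hypothetical helpers, not part of the agent.

```shell
# Count log lines that mention the IMDS error. Reads log text on stdin.
# grep -c prints 0 and exits non-zero when nothing matches, so "|| true"
# keeps the function usable under "set -e".
count_imds_errors() {
  grep -c 'EC2MetadataError' || true
}

# Print each distinct timestamp at which the error was logged,
# assuming the timestamp occupies the first three whitespace-separated fields.
imds_error_times() {
  grep 'EC2MetadataError' | awk '{print $1, $2, $3}' | sort -u
}
```

Possible usage on the affected instance: `journalctl -u amazon-cloudwatch-agent --no-pager | count_imds_errors`.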

What version did you use? v1.300049.1b929

What config did you use? Reference agent config output:

{
        "agent": {
                "metrics_collection_interval": 60,
                "run_as_user": "cwagent"
        },
        "metrics": {
                "aggregation_dimensions": [
                        [
                                "InstanceId"
                        ]
                ],
                "append_dimensions": {
                        "AutoScalingGroupName": "${aws:AutoScalingGroupName}",
                        "ImageId": "${aws:ImageId}",
                        "InstanceId": "${aws:InstanceId}",
                        "InstanceType": "${aws:InstanceType}"
                },
                "metrics_collected": {
                        "disk": {
                                "measurement": [
                                        "used_percent"
                                ],
                                "metrics_collection_interval": 60,
                                "resources": [
                                        "*"
                                ]
                        },
                        "mem": {
                                "measurement": [
                                        "mem_used_percent"
                                ],
                                "metrics_collection_interval": 60
                        }
                }
        }
}
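
As a quick diagnostic, the `${aws:...}` placeholders that a config asks the agent to resolve can be listed with grep. This is only a sketch for inspecting the file; it does not establish that any particular placeholder triggers the 404, and `list_aws_placeholders` is a hypothetical helper name.

```shell
# Print the distinct ${aws:...} placeholders found in an agent config file.
# $1: path to the JSON config file.
list_aws_placeholders() {
  grep -o '\${aws:[A-Za-z]*}' "$1" | sort -u
}
```

For example, with the path used in the reproduction steps: `list_aws_placeholders /opt/aws/amazon-cloudwatch-agent/bin/config.json`.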

Environment OS version: Ubuntu 22.04.5 LTS

Additional context This only happens on v1.300049. On v1.300048.1b904, the agent does not log these errors.

debu99 commented 6 days ago

How can I download the v1.300048.1b904 binary?

tdtm commented 1 day ago

I'm also encountering this in v1.300049.1b929.

EDIT: additional details from https://github.com/aws/amazon-cloudwatch-agent/pull/1440 indicate:

In version 1.300049.0 and above, the agent will log the above message regardless of log levels on a defined interval. This is because we recently enabled instance tags by default to retrieve ASG name and instance tag name for entity service names. This becomes an issue when instance metadata tags is not enabled which can be majority case since instance metadata tags is an opt-in feature. The issue is especially apparent in EKS since EKS does not support instance metadata tags

Instance metadata tags instructions: https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/work-with-tags-in-IMDS.html#allow-access-to-tags-in-IMDS
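
Based on the explanation quoted above, one way to confirm the cause is to query the IMDS instance-tags endpoint directly and, if it returns 404, enable instance metadata tags. The following is a sketch under stated assumptions: it uses IMDSv2 at the standard address 169.254.169.254, the function names are hypothetical, and enabling tags may or may not silence the log message in a given setup (the quote notes that EKS does not support this feature).

```shell
# Base address of the instance metadata service; overridable for testing.
IMDS_BASE="${IMDS_BASE:-http://169.254.169.254}"

# Translate an HTTP status from the tags endpoint into a short hint.
explain_tags_status() {
  case "$1" in
    200) echo "instance metadata tags appear to be enabled" ;;
    404) echo "instance metadata tags appear not to be enabled" ;;
    401) echo "missing or invalid IMDSv2 token" ;;
    *)   echo "unexpected status: $1" ;;
  esac
}

# Request an IMDSv2 token, query the instance-tags path, and print the hint.
# Only meaningful when run on an EC2 instance.
check_imds_tags() {
  token=$(curl -s -m 2 -X PUT "$IMDS_BASE/latest/api/token" \
    -H "X-aws-ec2-metadata-token-ttl-seconds: 60") || return 1
  status=$(curl -s -m 2 -o /dev/null -w '%{http_code}' \
    -H "X-aws-ec2-metadata-token: $token" \
    "$IMDS_BASE/latest/meta-data/tags/instance") || return 1
  explain_tags_status "$status"
}

# Print (but do not run) the AWS CLI command that enables instance
# metadata tags for the given instance ID.
print_enable_tags_command() {
  echo "aws ec2 modify-instance-metadata-options --instance-id $1 --instance-metadata-tags enabled"
}
```

Running `check_imds_tags` on the instance shows whether the tags path is served. `print_enable_tags_command` only prints the command, so it can be reviewed before being run with credentials that are allowed to modify the instance.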