aws / amazon-cloudwatch-agent

CloudWatch Agent enables you to collect and export host-level metrics and logs on instances running Linux or Windows server.
MIT License

Agent fails to detect EKS cluster created in Access Entries only authentication mode #1249

Open ArthurMelin opened 3 months ago

ArthurMelin commented 3 months ago

Describe the bug The agent fails to detect that it is running in an EKS cluster if the cluster has been configured with accessConfig.authenticationMode = "API" (without a kube-system/aws-auth ConfigMap).

Please note that the aws-auth authentication mode has been marked as deprecated in the EKS documentation (see https://docs.aws.amazon.com/eks/latest/userguide/auth-configmap.html).
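To confirm which mode a cluster is in, the JSON returned by `aws eks describe-cluster` can be inspected. The following is a minimal sketch, not part of the agent: the helper name `cluster_uses_aws_auth` is hypothetical, and treating a missing `accessConfig` as `CONFIG_MAP` is an assumption about older clusters.

```python
import json

# Authentication modes in which the cluster still consults the aws-auth ConfigMap.
CONFIG_MAP_MODES = {"CONFIG_MAP", "API_AND_CONFIG_MAP"}


def cluster_uses_aws_auth(describe_cluster_json: str) -> bool:
    """Return True if the cluster's authentication mode still relies on aws-auth.

    `describe_cluster_json` is the raw JSON printed by
    `aws eks describe-cluster --name <cluster>`.
    """
    cluster = json.loads(describe_cluster_json)["cluster"]
    # Assumption: clusters without an accessConfig block behave as CONFIG_MAP.
    mode = cluster.get("accessConfig", {}).get("authenticationMode", "CONFIG_MAP")
    return mode in CONFIG_MAP_MODES


# Trimmed, illustrative describe-cluster output for the cluster in this report.
sample = '{"cluster": {"name": "main", "accessConfig": {"authenticationMode": "API"}}}'
print(cluster_uses_aws_auth(sample))
```

For a cluster created as described in this issue, the helper returns False, which is the case where a detection based on the aws-auth ConfigMap cannot work.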

Steps to reproduce

  1. Create a new EKS cluster as usual, but with Cluster access > Authentication mode set to EKS API
  2. Add a managed node group
  3. Install the CloudWatch Observability EKS addon

What did you expect to see? The agent should begin sending metrics to CloudWatch.

What did you see instead? The agent fails to send metrics because it doesn't detect that it's running on EKS (because the aws-auth ConfigMap doesn't exist) and is misconfigured (it fails with an error after attempting to read credentials from /root/.aws/credentials, which doesn't exist).

What version did you use?
EKS version: 1.30
EKS addon version: v1.8.0-eksbuild.1
Agent version: 1.300041.0b681

What config did you use? Config: (default addon config)

 {"agent":{"region":"eu-west-3"},"logs":{"metrics_collected":{"application_signals":{"hosted_in":"main"},"kubernetes":{"cluster_name":"main","enhanced_container_insights":true}}},"traces":{"traces_collected":{"application_signals":{}}}}

Environment OS: Amazon Linux 2023 EKS optimized 1.30.0-20240703


okankoAMZ commented 3 months ago

Hello,

Thank you for reaching out. To assist with troubleshooting this issue, could you please follow these steps:

  1. Update your CloudWatch Agent configuration to enable debug mode by modifying the JSON as follows:
{
    "agent": {
        "debug": true,
        "region": "eu-west-3"
    },
    "logs": {
        "metrics_collected": {
            "application_signals": {
                "hosted_in": "main"
            },
            "kubernetes": {
                "cluster_name": "main",
                "enhanced_container_insights": true
            }
        }
    },
    "traces": {
        "traces_collected": {
            "application_signals": {}
        }
    }
}
  2. After updating the configuration, please provide the CloudWatch Agent error logs. This will help me better understand the issue you're experiencing.
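The change requested in step 1 amounts to adding one key, `agent.debug`, to the default addon config. As a rough sketch (the helper name `enable_debug` is hypothetical, and how the resulting JSON is handed back to the addon is left out), it could be applied programmatically like this:

```python
import json

# Default addon config as posted in the original report.
DEFAULT_CONFIG = (
    '{"agent":{"region":"eu-west-3"},'
    '"logs":{"metrics_collected":{"application_signals":{"hosted_in":"main"},'
    '"kubernetes":{"cluster_name":"main","enhanced_container_insights":true}}},'
    '"traces":{"traces_collected":{"application_signals":{}}}}'
)


def enable_debug(config_json: str) -> str:
    """Return a copy of the agent config with agent.debug set to true."""
    config = json.loads(config_json)
    # Create the "agent" section if it is missing, then switch debug on.
    config.setdefault("agent", {})["debug"] = True
    return json.dumps(config, indent=4)


print(enable_debug(DEFAULT_CONFIG))
```

All other sections (`logs`, `traces`) pass through unchanged, so the printed result carries the same settings as the JSON shown in step 1.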

If you need any assistance gathering the agent logs, please let me know, and I'll be happy to guide you through the process.

In the meantime, I'll try to reproduce the issue on my end as well. Please let me know if you have any other questions or concerns.

ArthurMelin commented 3 months ago

Hi, here are the logs, hope they can help: cloudwatch-agent.log

Since opening the issue, I attempted to force the EKS detection by creating an empty aws-auth ConfigMap, but it doesn't seem to have changed much. The agent still fails to push metrics because it attempts to get credentials the wrong way.

More details for reproducing the issue: the agent was installed with the EKS CloudWatch Observability addon, using a dedicated service account role, following this guide: https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/install-CloudWatch-Observability-EKS-addon.html#install-CloudWatch-Observability-EKS-addon-serviceaccountrole

sky333999 commented 3 months ago

Hi @ArthurMelin, thanks for providing the logs. To rule out any issue with your setup itself, could you please run kubectl describe on your service account and provide us the output? I'd like to check the annotations on it. Knowing the exact steps/commands you ran would also help us reproduce this issue on our end.

ArthurMelin commented 3 months ago

Here is the service account, role and policy attachment info:

$ kubectl describe serviceaccount -n amazon-cloudwatch cloudwatch-agent
Name:                cloudwatch-agent
Namespace:           amazon-cloudwatch
Labels:              <none>
Annotations:         eks.amazonaws.com/role-arn: arn:aws:iam::637423337567:role/AmazonEKSCloudwatchAgent
Image pull secrets:  <none>
Mountable secrets:   <none>
Tokens:              <none>
Events:              <none>

$ aws iam get-role --role-name AmazonEKSCloudwatchAgent
{
    "Role": {
        "Path": "/",
        "RoleName": "AmazonEKSCloudwatchAgent",
        "RoleId": "AROAZI2LEIBPWQXOTYLZW",
        "Arn": "arn:aws:iam::637423337567:role/AmazonEKSCloudwatchAgent",
        "CreateDate": "2024-07-12T14:55:33+00:00",
        "AssumeRolePolicyDocument": {
            "Version": "2012-10-17",
            "Statement": [
                {
                    "Effect": "Allow",
                    "Principal": {
                        "Federated": "arn:aws:iam::637423337567:oidc-provider/oidc.eks.eu-west-3.amazonaws.com/id/39E7C30FCAA1514DAA747C966D8A710F"
                    },
                    "Action": "sts:AssumeRoleWithWebIdentity",
                    "Condition": {
                        "StringEquals": {
                            "oidc.eks.eu-west-3.amazonaws.com/id/39E7C30FCAA1514DAA747C966D8A710F:sub": "system:serviceaccount:amazon-cloudwatch:cloudwatch-agent",
                            "oidc.eks.eu-west-3.amazonaws.com/id/39E7C30FCAA1514DAA747C966D8A710F:aud": "sts.amazonaws.com"
                        }
                    }
                }
            ]
        },
        "MaxSessionDuration": 3600,
        "RoleLastUsed": {
            "LastUsedDate": "2024-07-19T07:10:43+00:00",
            "Region": "eu-west-3"
        }
    }
}

$ aws iam list-attached-role-policies --role-name AmazonEKSCloudwatchAgent
{
    "AttachedPolicies": [
        {
            "PolicyName": "CloudWatchAgentServerPolicy",
            "PolicyArn": "arn:aws:iam::aws:policy/CloudWatchAgentServerPolicy"
        }
    ]
}

Regarding your question about commands, I used Terraform to deploy the EKS cluster and the CloudWatch role and addon, but it should match the setup steps described in this documentation: https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/install-CloudWatch-Observability-EKS-addon.html#install-CloudWatch-Observability-EKS-addon-serviceaccountrole
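The service account and the role's trust policy shown above can be cross-checked mechanically. The sketch below is illustrative only: the function name `trust_policy_allows` is hypothetical, it looks solely at `StringEquals` conditions on the `:sub` claim, and it does not cover wildcard conditions such as `StringLike`.

```python
def trust_policy_allows(policy: dict, namespace: str, service_account: str) -> bool:
    """Check whether a trust policy lets the given service account assume the role."""
    expected_sub = f"system:serviceaccount:{namespace}:{service_account}"
    for statement in policy.get("Statement", []):
        if statement.get("Effect") != "Allow":
            continue
        # "Action" may be a single string or a list of strings.
        actions = statement.get("Action", [])
        if isinstance(actions, str):
            actions = [actions]
        if "sts:AssumeRoleWithWebIdentity" not in actions:
            continue
        conditions = statement.get("Condition", {}).get("StringEquals", {})
        for key, value in conditions.items():
            if key.endswith(":sub") and value == expected_sub:
                return True
    return False


# AssumeRolePolicyDocument copied from the `aws iam get-role` output above.
OIDC = "oidc.eks.eu-west-3.amazonaws.com/id/39E7C30FCAA1514DAA747C966D8A710F"
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"Federated": f"arn:aws:iam::637423337567:oidc-provider/{OIDC}"},
            "Action": "sts:AssumeRoleWithWebIdentity",
            "Condition": {
                "StringEquals": {
                    f"{OIDC}:sub": "system:serviceaccount:amazon-cloudwatch:cloudwatch-agent",
                    f"{OIDC}:aud": "sts.amazonaws.com",
                }
            },
        }
    ],
}

print(trust_policy_allows(policy, "amazon-cloudwatch", "cloudwatch-agent"))
```

With the values posted in this thread the check passes, which suggests the role itself is wired to the right service account and the problem lies in how the agent picks its credential source.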

github-actions[bot] commented 1 week ago

This issue was marked stale due to lack of activity.