aws-observability / helm-charts

The AWS Observability Helm Charts repository contains Helm charts that make it easy to set up the CloudWatch Agent and other collection agents, which collect telemetry data such as metrics, logs, and traces and send it to AWS monitoring services.
Apache License 2.0

CloudWatch Agent fails to authenticate: IMDS Issues #75

Open kwangjong opened 3 months ago

kwangjong commented 3 months ago

This is the same issue as https://github.com/aws/amazon-cloudwatch-agent/issues/1101

I solved it by modifying /cloudwatch-agent-daemonset.yaml like this:

apiVersion: cloudwatch.aws.amazon.com/v1alpha1
kind: AmazonCloudWatchAgent
metadata:
  name: {{ template "cloudwatch-agent.name" . }}
  namespace: {{ .Release.Namespace }}
spec:
+ hostNetwork: true
  image: {{ template "cloudwatch-agent.image" . }}
  mode: daemonset
  ...
  env:
+ - name: RUN_WITH_IRSA
+   value: "True"
  - name: K8S_NODE_NAME
    valueFrom:
      fieldRef:
        fieldPath: spec.nodeName
  ...

I also configured Gatekeeper to restrict hostNetwork access exclusively to the CloudWatch agent, since this doc recommends blocking IMDS access from unwanted pods: https://docs.aws.amazon.com/whitepapers/latest/security-practices-multi-tenant-saas-applications-eks/restrict-the-use-of-host-networking-and-block-access-to-instance-metadata-service.html

However, there needs to be a more robust and permanent solution to this issue.
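
For readers who want to reproduce the Gatekeeper restriction mentioned above, here is a minimal sketch of one way it could look. It is not the exact policy used in this thread. It assumes the K8sPSPHostNetworkingPorts ConstraintTemplate from the Gatekeeper policy library is already installed, and that the agent runs in a namespace called amazon-cloudwatch. The constraint name and both excluded namespaces are illustrative assumptions.

```yaml
# Sketch only: requires the K8sPSPHostNetworkingPorts ConstraintTemplate
# from the Gatekeeper policy library to be installed in the cluster.
apiVersion: constraints.gatekeeper.sh/v1beta1
kind: K8sPSPHostNetworkingPorts
metadata:
  name: deny-host-network-except-cloudwatch   # hypothetical name
spec:
  match:
    kinds:
      - apiGroups: [""]
        kinds: ["Pod"]
    excludedNamespaces:
      - amazon-cloudwatch   # assumed namespace of the CloudWatch agent
      - kube-system         # assumed: system pods that legitimately use host networking
  parameters:
    hostNetwork: false      # reject pods that set hostNetwork: true
```

Note that this exempts whole namespaces rather than a single workload, so it is only as strict as the controls on who can deploy into those namespaces.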

lisguo commented 3 months ago

Hello, we are aware of the issue. We are evaluating a solution where we run the cloudwatch agent pod with hostNetwork: true to resolve the hop limit restriction.

Just to clarify, you need both hostNetwork: true AND RUN_WITH_IRSA set to true as an environment variable?

kwangjong commented 3 months ago

Yes. Without setting RUN_WITH_IRSA to True, the pod attempted to authenticate using /root/.aws/credentials in my case.

lisguo commented 3 months ago

Can you clarify what your cluster setup looks like? Are you using EKS? Native K8s on EC2?

dbcelm commented 3 months ago

I am using EKS 1.29 with Bottlerocket OS AMI nodes [IMDSv2 with hop-limit: 2] and hitting a "credentials not found" issue in the cloudwatch-agent pods. Fluent Bit works fine, though, after annotating the "cloudwatch-agent" SA with IRSA. The fluentbit and cloudwatch-agent daemonsets share that SA.

Adding the env values mentioned by @kwangjong made the permissions work. Also, as of now there is no way to add annotations to the "cloudwatch-agent" SA from the Helm values file. Can that be added?

Also, the "hostNetwork" parameter will be required for custom CNI use cases. In my case, I am using Cilium CNI, so "hostNetwork: true" was required for the agent to work.
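
As an illustration of the IRSA annotation discussed above, a ServiceAccount carrying the annotation might look like the sketch below. The thread does not say how the annotation was actually applied, so treat this as one possible workaround outside the chart, not the chart's supported mechanism. The namespace, account ID, and role name are placeholders.

```yaml
# Sketch only: the namespace and role ARN are placeholders, not values from this thread.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: cloudwatch-agent
  namespace: amazon-cloudwatch   # assumed release namespace
  annotations:
    # Standard IRSA annotation: the IAM role this service account should assume
    eks.amazonaws.com/role-arn: arn:aws:iam::111122223333:role/example-cloudwatch-agent-role
```

Pods generally need to be restarted after the annotation is added so that the web identity token and role settings are injected.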

kwangjong commented 3 months ago

Can you clarify what your cluster setup looks like? Are you using EKS? Native K8s on EC2?

I am using EKS 1.3

jamesking-github commented 4 weeks ago

I am using EKS 1.30 and am also seeing this issue.

mtavaresmedeiros commented 1 day ago

Any update on this? @lisguo @dbcelm, how did you pass the annotations to the service account?