aws-samples / amazon-cloudwatch-container-insights

CloudWatch Agent Dockerfile and K8s YAML templates for CloudWatch Container Insights.
MIT No Attribution

Enable EMF by default for the K8s CloudWatch Agent Operator #173

Open lucabrunox opened 3 months ago


Description of the issue

Previously, without the operator, a user had to expose the EMF ports on the node, which could conflict with other agent installations. Now that the operator creates a Service, it is safe to enable the EMF port by default, because a Service does not conflict with other agent installations.

The benefit is that a user can install Container Insights and use EMF out of the box, without additional configuration changes.
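
As a rough illustration of what this enables for an application pod, the sketch below builds one EMF record and sends it to the agent over TCP. The host name is an assumption: it is derived from the `cloudwatch-agent` Service in the `amazon-cloudwatch` namespace and the default `cluster.local` cluster domain. The helper names (`build_emf_record`, `send_emf`) and the metric, namespace and dimension values are hypothetical, and the newline-delimited JSON framing is how the EMF client libraries talk to the agent as far as I know, not something verified in this PR.

```python
import json
import socket
import time

# Assumed in-cluster DNS name of the agent Service (default cluster domain).
EMF_HOST = "cloudwatch-agent.amazon-cloudwatch.svc.cluster.local"
EMF_PORT = 25888  # EMF port exposed by the Service


def build_emf_record(namespace, name, value, unit="Count", dimensions=None):
    """Return one EMF record as a dict (hypothetical helper)."""
    dimensions = dimensions or {}
    record = {
        "_aws": {
            "Timestamp": int(time.time() * 1000),  # milliseconds since epoch
            "CloudWatchMetrics": [
                {
                    "Namespace": namespace,
                    # One dimension set made of all provided dimension keys.
                    "Dimensions": [list(dimensions.keys())],
                    "Metrics": [{"Name": name, "Unit": unit}],
                }
            ],
        },
        # The metric value lives at the top level, keyed by the metric name.
        name: value,
    }
    # Dimension values are top-level members as well.
    record.update(dimensions)
    return record


def send_emf(record, host=EMF_HOST, port=EMF_PORT, timeout=2.0):
    """Send one newline-terminated JSON record to the agent over TCP."""
    payload = (json.dumps(record) + "\n").encode("utf-8")
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(payload)
```

From a pod in the same cluster, calling `send_emf(build_emf_record("MyApp", "RequestCount", 1, dimensions={"Service": "checkout"}))` would emit a single data point, with no node port or pod IP configured by the user.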

For this to be useful, this PR needs to be released first: https://github.com/aws/amazon-cloudwatch-agent-operator/pull/182

Description of changes

This change enables EMF by default only for the operator, where it is useful because the operator exposes a Service. The other YAML configurations, which have no Service, would benefit less: the user would still need to specify an IP or create a Service manually.

License

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

Tests

I've tested this manually and locally. The pods are Running and emitting EMF metrics:

NAMESPACE           NAME                                                                  READY   STATUS    RESTARTS   AGE
amazon-cloudwatch   pod/amazon-cloudwatch-observability-controller-manager-7bc76882bdzc   1/1     Running   0          129m
amazon-cloudwatch   pod/cloudwatch-agent-46mrx                                            1/1     Running   0          129m
amazon-cloudwatch   pod/cloudwatch-agent-tn78q                                            1/1     Running   0          129m

NAMESPACE           NAME                                                      TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)                                          AGE
amazon-cloudwatch   service/amazon-cloudwatch-observability-webhook-service   ClusterIP   10.100.199.104   <none>        443/TCP                                          129m
amazon-cloudwatch   service/cloudwatch-agent                                  ClusterIP   10.100.93.186    <none>        4315/TCP,4316/TCP,2000/TCP,25888/TCP,25888/UDP   129m
amazon-cloudwatch   service/cloudwatch-agent-headless                         ClusterIP   None             <none>        25888/TCP,25888/UDP,4315/TCP,4316/TCP,2000/TCP   129m

This is the JSON parsed from the new configmap:

$ echo '"{\"agent\":{\"region\":\"{{region_name}}\"},\"logs\":{\"metrics_collected\":{\"emf\":{},\"kubernetes\":{\"cluster_name\":\"{{cluster_name}}\",\"enhanced_container_insights\":true}}}}"'|jq -r .|jq
{
  "agent": {
    "region": "{{region_name}}"
  },
  "logs": {
    "metrics_collected": {
      "emf": {},
      "kubernetes": {
        "cluster_name": "{{cluster_name}}",
        "enhanced_container_insights": true
      }
    }
  }
}
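
The configmap value is a JSON document wrapped in a JSON string, which is why the command above pipes through `jq` twice. A minimal Python sketch of the same check, using the exact value shown above (the name `decode_agent_config` is hypothetical):

```python
import json

# The configmap value as shown above: a JSON string containing a JSON document.
RAW = r'"{\"agent\":{\"region\":\"{{region_name}}\"},\"logs\":{\"metrics_collected\":{\"emf\":{},\"kubernetes\":{\"cluster_name\":\"{{cluster_name}}\",\"enhanced_container_insights\":true}}}}"'


def decode_agent_config(raw):
    """Decode twice: the outer JSON string first, then the inner document."""
    inner = json.loads(raw)   # -> str holding the agent configuration
    return json.loads(inner)  # -> dict with the agent configuration


config = decode_agent_config(RAW)
# An empty "emf" object under logs.metrics_collected is what enables EMF.
emf_enabled = "emf" in config["logs"]["metrics_collected"]
print(emf_enabled)  # prints True
```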

I also tested installing another CWA operator in another namespace, and it runs without port conflicts:

NAMESPACE            NAME                                                      TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)                                          AGE
amazon-cloudwatch    service/amazon-cloudwatch-observability-webhook-service   ClusterIP   10.100.199.104   <none>        443/TCP                                          157m
amazon-cloudwatch    service/cloudwatch-agent                                  ClusterIP   10.100.93.186    <none>        4315/TCP,4316/TCP,2000/TCP,25888/TCP,25888/UDP   157m
amazon-cloudwatch    service/cloudwatch-agent-headless                         ClusterIP   None             <none>        4316/TCP,2000/TCP,25888/TCP,25888/UDP,4315/TCP   157m
amazon-cloudwatch    service/cloudwatch-agent-monitoring                       ClusterIP   10.100.172.96    <none>        8888/TCP                                         157m
amazon-cloudwatch    service/cloudwatch-agent-windows                          ClusterIP   10.100.136.143   <none>        4315/TCP,4316/TCP,2000/TCP,25888/TCP,25888/UDP   157m
amazon-cloudwatch    service/cloudwatch-agent-windows-headless                 ClusterIP   None             <none>        4315/TCP,4316/TCP,2000/TCP,25888/TCP,25888/UDP   157m
amazon-cloudwatch    service/cloudwatch-agent-windows-monitoring               ClusterIP   10.100.199.144   <none>        8888/TCP                                         157m
amazon-cloudwatch    service/dcgm-exporter-service                             ClusterIP   10.100.235.97    <none>        9400/TCP                                         157m
amazon-cloudwatch    service/neuron-monitor-service                            ClusterIP   10.100.161.74    <none>        8000/TCP                                         157m
amazon-cloudwatch2   service/amazon-cloudwatch-observability-webhook-service   ClusterIP   10.100.99.21     <none>        443/TCP                                          30s
amazon-cloudwatch2   service/cloudwatch-agent                                  ClusterIP   10.100.225.115   <none>        4315/TCP,4316/TCP,2000/TCP,25888/TCP,25888/UDP   22s
amazon-cloudwatch2   service/cloudwatch-agent-headless                         ClusterIP   None             <none>        4315/TCP,4316/TCP,2000/TCP,25888/TCP,25888/UDP   22s
amazon-cloudwatch2   service/cloudwatch-agent-monitoring                       ClusterIP   10.100.115.32    <none>        8888/TCP                                         22s
amazon-cloudwatch2   service/cloudwatch-agent-windows                          ClusterIP   10.100.235.74    <none>        4315/TCP,4316/TCP,2000/TCP,25888/TCP,25888/UDP   22s
amazon-cloudwatch2   service/cloudwatch-agent-windows-headless                 ClusterIP   None             <none>        4315/TCP,4316/TCP,2000/TCP,25888/TCP,25888/UDP   22s
amazon-cloudwatch2   service/cloudwatch-agent-windows-monitoring               ClusterIP   10.100.116.13    <none>        8888/TCP                                         22s
amazon-cloudwatch2   service/dcgm-exporter-service                             ClusterIP   10.100.220.65    <none>        9400/TCP                                         22s
amazon-cloudwatch2   service/neuron-monitor-service                            ClusterIP   10.100.127.201   <none>        8000/TCP                                         22s

Requirements

Before committing the code, please verify the following:

Yes, this changes existing sample configurations; however, it does not impact existing customer behavior.