DataDog / datadog-agent

Main repository for Datadog Agent
https://docs.datadoghq.com/
Apache License 2.0

Metrics without ECS tags #8551

Open boh-dan opened 3 years ago

boh-dan commented 3 years ago

The issue happens during a redeploy, when old infrastructure is replaced with new. We see a spike in metrics that don’t have ECS tags attached, which makes them basically useless, although they are only a small portion of the metrics we send overall.

This issue coincides with the following error in our log system: (pkg/tagger/collectors/ecs_extract.go:49 in parseTasks) | container handler func failed: Unable to get resource tags for container : unable to initialize client for metadata v3 API: "docker container " not found. Our infrastructure uses the datadog/agent-dev:clamoriniere-7-29-x-high-cpu-py3-jmx agent image. We suspect that during the redeploy AWS introspection returns a docker container with an empty ID.

danbf commented 3 years ago

@boh-dan this might not be your problem, but we found that during deploys any tags gained via auto-discovery would drop out. basically the metrics would come in, but without the normal tagging, because auto-discovery was still ramping up. our solution was to set up the tags in the datadog config on startup rather than use auto-discovery. we were running the datadog agent locally on the machines at that point, not via a daemonset, but it probably applies to a daemonset datadog agent with auto-discovery too.

on our ecs ec2 nodes we set it via the user-data scripts with a line like this:

echo "tags: ecs_cluster:${cluster}, role:ecs" >>/etc/dd-agent/datadog.conf
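A plain `>>` append adds a second `tags:` line if the user-data script ever runs twice. Below is a minimal sketch of a re-runnable variant, assuming the same `datadog.conf` style and tag names as the line above; the helper name `add_static_tags` is hypothetical and not part of any Datadog tooling.

```shell
#!/bin/sh
# Hypothetical helper (not a Datadog-provided command): append a static
# "tags:" line to the given config file unless one is already present.
# Arguments: $1 = path to the config file, $2 = ECS cluster name.
add_static_tags() {
  conf_file="$1"
  cluster="$2"

  # Skip the append if a tags line already exists, so reruns are harmless.
  if grep -q '^tags:' "$conf_file" 2>/dev/null; then
    return 0
  fi

  # Same line as in the comment above, with the cluster name filled in.
  echo "tags: ecs_cluster:${cluster}, role:ecs" >>"$conf_file"
}
```

In a user-data script this would be called as `add_static_tags /etc/dd-agent/datadog.conf "${cluster}"` before the agent starts.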

this shows our tagged requests before and after hard coding the tags:

[Screenshot: Screen Shot 2021-07-12 at 5 00 30 PM]

kaitlavs commented 3 years ago

Hey @boh-dan. Can you please open a ticket with our support team so we can investigate this further? We would need a flare from the agent captured while this behavior is occurring. Feel free to reference this GitHub issue in the ticket.