spacez320 opened this issue 4 years ago
Hello @spacez320, I faced a similar issue: I wanted to discard some tags to avoid duplication of metrics that represent the same logical value (I wanted to get rid of the `host` tag, because the metric was service-level but was reported by every replica). The Distributions metric type helped me, because it lets you select which tags to retain. You can read more at https://docs.datadoghq.com/metrics/distributions/. But it's still not an ideal solution for everything; for instance, how do you have a Gauge metric that ignores some tags?
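For anyone who lands here, this is roughly what the Distributions workaround looks like on the submission side. A minimal sketch, assuming the `datadog` (datadogpy) Python client and a DogStatsD endpoint on localhost; the metric and tag names are made up, and which tags are actually retained is configured per metric in the Datadog app, per the docs linked above:

```python
# Sketch only: submit a value as a distribution so that, on the Datadog side,
# tag retention can be configured per metric (e.g. keep "service" and "env",
# drop "host"). Assumes the datadogpy package and DogStatsD on 127.0.0.1:8125;
# "myapp.request.latency" and the tag values are placeholders.
from datadog import initialize, statsd

initialize(statsd_host="127.0.0.1", statsd_port=8125)

statsd.distribution(
    "myapp.request.latency",
    0.042,
    tags=["service:myapp", "env:prod", "host:replica-3"],
)
```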
We have the same problem: `kubernetes.memory.request` is doubled for each new value of `container_id` (for example, when a container is restarted). It takes way too long for the Agent to remove the old `container_id` tag, and during that period the summed metric is doubled. What we need is the ability to remove a tag from consideration completely.
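To make the doubling concrete, here is a small self-contained sketch (plain Python, not Datadog code; the tag values are invented) of why a lingering series with the old `container_id` doubles a sum, and why dropping that tag from consideration entirely would avoid it:

```python
# Illustration only: a stale series with the old container_id doubles the sum.
# Tag values are invented for the example.
series = {
    frozenset({"pod:web-1", "container_id:aaa"}): 512e6,  # stale, not yet expired
    frozenset({"pod:web-1", "container_id:bbb"}): 512e6,  # live, after the restart
}

print(sum(series.values()))  # 1024000000.0 -- the request appears doubled

# "Remove the tag from consideration completely": project each tag set onto
# the tags we care about, collapse duplicates, then sum.
def sum_ignoring(series, ignored_prefix="container_id:"):
    deduped = {}
    for tags, value in series.items():
        key = frozenset(t for t in tags if not t.startswith(ignored_prefix))
        deduped[key] = value  # old and new container collapse onto one key
    return sum(deduped.values())

print(sum_ignoring(series))  # 512000000.0 -- the value we actually want
```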
Describe what happened:
We've seen that when we adjust tag cardinality (by adding a new tag, for example), two metric series which logically represent the same data can coexist for a short period of time (probably the length of time it takes to collect new data) and cause counting issues in Monitors or Dashboards.
For example, I have some metric from the Kubernetes integration, `kubernetes.node.whatever`, and I want to add a new `nodeLabelAsTag` setting, `foo=bar`. I push that configuration change to the Agent, and the following happens:

Before the push, there is a single series tagged `fizz=buzz`.

After the push, for a few minutes there are two series: the old one tagged only `fizz=buzz`, and a new one tagged `fizz=buzz, foo=bar`.

Then eventually only the new series, tagged `fizz=buzz, foo=bar`, remains.

The problem with this is that anything that queries `sum:kubernetes.node.whatever{fizz=buzz}` will temporarily see "2" when really it should be "1".

Describe what you expected:
I'm not sure what's possible, but it would be nice if we could teach Monitors or Datadog metrics to deduplicate accessory cardinality. Currently, changes like this can cause our Monitors to go haywire, and we would like to be able to edit tags without that happening.
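To illustrate what I mean by "deduplicate accessory cardinality" (a rough sketch in plain Python, not a proposal for how Datadog should actually implement it; tag names follow the example above): during the overlap window, series that differ only by the newly added tag would be collapsed to a single logical series before aggregation.

```python
# Illustration only: collapse series whose tag sets differ only by an
# "accessory" tag (here the freshly added foo=bar) before summing, so the
# old and new series for the same node don't both count.
def dedup_sum(series, accessory_keys=("foo",)):
    def identity(tags):
        # What remains after stripping accessory tags identifies the series.
        return frozenset(t for t in tags if t.split("=", 1)[0] not in accessory_keys)

    chosen = {}
    for tags, value in series.items():
        key = identity(tags)
        # Prefer the series carrying the accessory tag (the newer one).
        if key not in chosen or len(tags) > len(chosen[key][0]):
            chosen[key] = (tags, value)
    return sum(value for _, value in chosen.values())

# During the overlap window, both the old and new series report for the node:
overlap = {
    frozenset({"node=node-1", "fizz=buzz"}): 1,             # old series
    frozenset({"node=node-1", "fizz=buzz", "foo=bar"}): 1,  # new series
}
print(sum(overlap.values()))  # 2 -- what Monitors see today
print(dedup_sum(overlap))     # 1 -- what they'd ideally see
```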
Steps to reproduce the issue:
Additional environment details (Operating System, Cloud provider, etc):
Seems like a pretty general problem.