Closed: diranged closed this 4 days ago
This is a duplicate of https://github.com/open-telemetry/opentelemetry-collector-contrib/issues/32828 - but opened in the operator project in case this is an operator/allocator issue.
@diranged either I or @swiatekm-sumo hope to take a look at this next week; we have other priorities currently with getting the releases out.
We've both had difficult weeks and haven't gotten to this yet; it's still on our list!
Thanks @jaronoff97, no worries... I appreciate you looking at it when you have time!
Finally got around to investigating this; thanks a lot for your patience. In essence, this is working as intended. This only happens for the opentelemetry_allocator_targets metric, because job_name is simply a label on this metric: each time series shows how many targets were allocated for a given job. Here is what the raw output from the Prometheus endpoint looks like:
# HELP opentelemetry_allocator_targets Number of targets discovered.
# TYPE opentelemetry_allocator_targets gauge
opentelemetry_allocator_targets{job_name="serviceMonitor/otel/otel-collector-collectors/0"} 2
opentelemetry_allocator_targets{job_name="serviceMonitor/otel/otel-collector-target-allocators/0"} 2
In this case, job_name refers to the job the metric is about. If you want the job that scraped it, you should look at the service.name resource attribute.
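The label semantics above can be checked directly against the quoted endpoint output. Here is a minimal Python sketch (stdlib only; the sample lines are copied from the output quoted above, and the parsing regex is mine, not anything the allocator ships):

```python
import re

# Sample lines copied verbatim from the /metrics output quoted above.
raw = """\
opentelemetry_allocator_targets{job_name="serviceMonitor/otel/otel-collector-collectors/0"} 2
opentelemetry_allocator_targets{job_name="serviceMonitor/otel/otel-collector-target-allocators/0"} 2
"""

# Parse each exposition line into (metric name, job_name label, value).
series = {}
for line in raw.splitlines():
    m = re.match(r'(\w+)\{job_name="([^"]+)"\}\s+(\d+)', line)
    if m:
        metric, job, value = m.group(1), m.group(2), int(m.group(3))
        series[job] = value

# Each entry counts targets allocated *for* that job; job_name here says
# nothing about which scrape job collected the metric itself.
for job, count in series.items():
    print(f"{job}: {count} targets")
```

So there is one time series per discovered job, which is why multiple job_name values show up even though a single allocator pod exposed the metric.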
Thank you for digging in. I will go back, re-test my setup, and see if this aligns with what we're seeing. It'll be a few weeks, as I'm out on vacation right now. Thanks again though!
@diranged I'm going to close this for now; feel free to reopen if this is still an issue!
Component(s)
collector
What happened?
Component(s)
receiver/prometheus
What happened?
Description
We're setting up a single collector pod in a StatefulSet configured to monitor "all the rest" of our OTEL components. This collector is called the otel-collector-cluster-agent, and it uses the TargetAllocator to monitor ServiceMonitor jobs that have specific labels. We currently have two different ServiceMonitors: one for collecting otelcol.* metrics from the collectors, and one for collecting opentelemetry_.* metrics from the Target Allocators.

We are seeing metrics reported from the TargetAllocator pods duplicated into DataPoints that refer to both the right and the wrong ServiceMonitor. When we delete the otel-collector-collectors ServiceMonitor, the behavior does not change, which is wild. However, if we delete the entire stack and namespace and then re-create it without the second ServiceMonitor, the data is correct... until we create the second ServiceMonitor, at which point it goes bad again.

Steps to Reproduce
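For reference, the two ServiceMonitors described above can be sketched roughly as follows. The names and namespace come from the job_name labels quoted in this issue (serviceMonitor/otel/...); the label selectors and port names are illustrative assumptions, not the actual manifests:

```yaml
# Illustrative sketch only, not the reporter's actual manifests.
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: otel-collector-collectors          # collects otelcol.* metrics from the collectors
  namespace: otel
spec:
  selector:
    matchLabels:
      app.kubernetes.io/component: opentelemetry-collector   # assumed label
  endpoints:
    - port: metrics                        # assumed port name
---
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: otel-collector-target-allocators   # collects opentelemetry_.* metrics from the allocators
  namespace: otel
spec:
  selector:
    matchLabels:
      app.kubernetes io/component: opentelemetry-targetallocator   # assumed label
  endpoints:
    - port: targetallocation               # assumed port name
```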
This creates a /scrape_configs that looks like this:

Expected Result
We should see datapoints for opentelemetry_.* metrics that only come from the target allocator pods and are attributed once, meaning one DataPoint per target pod, and that's it:

Collector version
0.98.0
Environment information
Environment
OS: BottleRocket 1.19.2
OpenTelemetry Collector configuration
No response
Log output
No response
Additional context
No response
Kubernetes Version
1.28
Operator version
0.98.0
Collector version
0.98.0
Environment information
Environment
OS: (e.g., "Ubuntu 20.04")
Compiler (if manually compiled): (e.g., "go 14.2")
Log output
No response
Additional context
No response