schewara opened this issue 5 months ago
Pinging code owners:
receiver/prometheus: @Aneurysm9 @dashpole
See Adding Labels via Comments if you do not have permissions to add labels yourself.
I'm not sure I fully understand your issue, but it seemingly has nothing to do with the docker stats receiver. It just seems like the prometheus exporter isn't exporting what you expect, or there is some misunderstanding about what it makes available.
I think it would be more helpful if you identified one component that isn't operating as expected. The docker stats receiver and the prometheus exporter have nothing to do with each other.
If your problem is that the docker stats receiver isn't reporting a metric that it should be, then it's a problem with the docker stats receiver. If the prom exporter isn't doing what you think it should, that's a problem with the prom exporter (or a misconfiguration).
From what I can see, the docker stats receiver is producing all of the information it should, and the prom exporter is then stripping some of the information that you expect. You can verify this by replacing the prom exporter with the debug exporter and looking at the output straight in stdout. If it's what you expect, then you can narrow the issue down to the prom exporter.
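For reference, a rough sketch of what that swap could look like (the receiver and pipeline names here are placeholders, adjust them to your actual config):

exporters:
  debug:
    verbosity: detailed   # dump full resource, scope and datapoint details to stdout

service:
  pipelines:
    metrics:
      receivers: [docker_stats]
      exporters: [debug]   # temporarily replaces the prometheus exporter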
I'm experiencing the same issue, but my exporter is awsemf.
Using the debug exporter, I can see the labels in the Resource attributes, but it seems awsemf is just using the Data point attributes when sending the metrics, and it ignores what's in the Resource attributes?
ResourceMetrics #4
Resource SchemaURL: https://opentelemetry.io/schemas/1.6.1
Resource attributes:
-> container.runtime: Str(docker)
-> container.hostname: Str(c3e61d730cb6)
-> container.id: Str(c3e61d730cb6c5936b5862844d6e4acf60a880821610a7af9f9a689cffb966db)
-> container.image.name: Str(couchdb:2.3.1@sha256:5c83dab4f1994ee4bb9529e9b1d282406054a1f4ad957d80df9e1624bdfb35d7)
-> container.name: Str(swarmpit_db.1.usj3zlnoxmwjhjc27tc3g5he0)
-> swarm_service: Str(swarmpit_db)
-> swarm_container_id: Str(usj3zlnoxmwjhjc27tc3g5he0)
-> swarm_namespace: Str(swarmpit)
ScopeMetrics #0
ScopeMetrics SchemaURL:
InstrumentationScope otelcol/dockerstatsreceiver 1.0.0
Metric #0
Descriptor:
-> Name: container.blockio.io_service_bytes_recursive
-> Description: Number of bytes transferred to/from the disk by the group and descendant groups.
-> Unit: By
-> DataType: Sum
-> IsMonotonic: true
-> AggregationTemporality: Cumulative
NumberDataPoints #0
Data point attributes:
-> device_major: Str(259)
-> device_minor: Str(0)
-> operation: Str(read)
StartTimestamp: 2024-06-20 19:10:25.725911895 +0000 UTC
Timestamp: 2024-06-20 19:19:28.761889055 +0000 UTC
Value: 4366336
NumberDataPoints #1
Data point attributes:
-> device_major: Str(259)
-> device_minor: Str(0)
-> operation: Str(write)
StartTimestamp: 2024-06-20 19:10:25.725911895 +0000 UTC
Timestamp: 2024-06-20 19:19:28.761889055 +0000 UTC
Value: 4096
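One thing I still have to try is the exporter-side resource_to_telemetry_conversion option, which (if I understand it correctly) copies resource attributes onto the data points before export and which the awsemf exporter also supports. A rough sketch, with placeholder log group/stream names:

exporters:
  awsemf:
    log_group_name: /metrics/example      # placeholder
    log_stream_name: otel-collector       # placeholder
    resource_to_telemetry_conversion:
      enabled: true   # copy resource attributes onto each data point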
This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.
Component(s)
exporter/prometheus, receiver/dockerstats, receiver/prometheus
What happened?
Description
We have a collector running (in Docker), which is supposed to collect metrics via
receiver/dockerstats
receiver/prometheus with docker_sd_config
and expose them via exporter/prometheusexporter.
A similar issue was already reported but was closed without any real solution -> #21247
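A simplified sketch of the relevant parts of the pipeline (endpoints and scrape settings here are illustrative, not the exact production values):

receivers:
  docker_stats:
    endpoint: unix:///var/run/docker.sock
  prometheus:
    config:
      scrape_configs:
        - job_name: containers            # illustrative job name
          docker_sd_configs:
            - host: unix:///var/run/docker.sock

exporters:
  prometheus:
    endpoint: "0.0.0.0:8889"              # illustrative endpoint

service:
  pipelines:
    metrics:
      receivers: [docker_stats, prometheus]
      exporters: [prometheus]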
Steps to Reproduce
run prometheus/node-exporter
run the otel/opentelemetry-collector-contrib container
check the /metrics endpoint of the prometheus exporter
Expected Result
Individual metrics for each container running on the same host.
Actual Result
Only metrics which have Data point attributes are shown, like the following, plus the metrics coming from the prometheus receiver.
Test scenarios and observations
exporter/prometheus with resource_to_telemetry_conversion enabled
When enabling this config option, the following was observed:
receiver/dockerstats metrics are available as expected
receiver/prometheus metrics are gone
I don't really know how the prometheus receiver converts the scraped metrics into an OTel object, but it looks like it creates the individual metrics plus a target_info metric containing only Data point attributes but no Resource attributes. This would explain why the metrics disappear, as from what it seems, all existing metric labels are wiped and replaced with nothing.
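For clarity, the change behind this scenario was essentially the following on the exporter (endpoint is illustrative):

exporters:
  prometheus:
    endpoint: "0.0.0.0:8889"
    resource_to_telemetry_conversion:
      enabled: true   # convert resource attributes into metric labels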
manually setting attribute labels
Trying to set manual static attributes through the attributes processor only added a new label to the single metrics, but did not produce individual container metrics.
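What I tried there was along these lines (the label name and value are made up for illustration):

processors:
  attributes:
    actions:
      - key: static_test_label     # made-up label name
        value: some-value
        action: insert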
After going through all the logs and searching through all the documentation, I discovered the Setting resource attributes as metric labels section of the prometheus exporter documentation. When implemented (see the commented-out sections of the config), metrics from the dockerstats receiver showed up on the exporter's /metrics endpoint, but they are still missing some crucial labels, which might need to be added manually as well.
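That documentation section boils down to copying selected resource attributes onto the data points with the transform processor; a sketch using attribute names taken from the debug output above (my actual statements differ slightly):

processors:
  transform:
    metric_statements:
      - context: datapoint
        statements:
          - set(attributes["container_name"], resource.attributes["container.name"])
          - set(attributes["swarm_service"], resource.attributes["swarm_service"])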
Findings
Based on all the observations during testing and trying things out, these are my takeaways on the current shortcomings of the 3 selected components and how they are not very well integrated with each other.
receiver/dockerstats
The description of the labels being added as a resource or datapoint attribute by the container_labels_to_metric_labels and env_vars_to_metric_labels settings is incorrect, as they are not added as a datapoint attribute and therefore never show up in any metric labels.
It would also help if the receiver provided a job and an instance label, by using the service.namespace, service.name and service.instance.id resource attributes, which then hopefully get picked up correctly by the exporter to convert it into the right label.
receiver/prometheus
I thought I had read that the labels coming from docker_sd_configs are added as resource attributes to the scraped metrics. But as I can't find the link to the source right now, I am either mistaken or it just is not the case, looking at the log outputs and the target_info metrics.
exporter/prometheusexporter
In the target_info metric, I am missing the resource attributes from the dockerstats metrics. Maybe this is due to the missing service attributes or some other reason, but I was unable to see any errors or warnings in the standard log.
The resource_to_telemetry_conversion functionality left me a bit speechless, in that it wipes all datapoint attributes, especially when there are no resource attributes available. Also, activating it would mean that I would lose (as an example) the interface information from the container.network.io.usage.rx_bytes metric, without any idea where the actual value is taken or calculated from. A warning in the documentation would be really helpful, or a flag to adjust the behavior based on individual needs.
Right now I am torn between manually transforming all the labels of the dockerstats receiver, or creating duplicate pipelines with a duplicated exporter (see the sketch below), but either way there is some room for improvement to have everything working together smoothly.
Collector version
otel/opentelemetry-collector-contrib:0.101.0
Environment information
Environment
Docker
OpenTelemetry Collector configuration
Log output
receiver/dockerstats metric with a Data point attribute, but no Resource attribute
receiver/dockerstats metric with no Data point attribute, but Resource attributes