open-telemetry / opentelemetry-collector-contrib

Contrib repository for the OpenTelemetry Collector
https://opentelemetry.io
Apache License 2.0

[exporter/prometheusremotewrite] Duplicate attributes when using resource_to_telemetry_conversion #34909

Open robertcoltheart opened 2 weeks ago

robertcoltheart commented 2 weeks ago

Component(s)

exporter/prometheusremotewrite

What happened?

Description

When scraping a metrics target that exposes target_info, with resource_to_telemetry_conversion enabled on the Prometheus remote write exporter, some attributes are duplicated and their values are concatenated with ';'.

Steps to Reproduce

Scrape a target that produces the following:

# TYPE target info
# HELP target Target metadata
target_info{telemetry_sdk_name="opentelemetry",telemetry_sdk_language="dotnet",telemetry_sdk_version="1.9.0",deployment_environment="qa",host_name="my-service-7586fcbf74-stwjn",service_name="my-service",service_namespace="my-namespace",service_version="1.10.0",service_instance_id="b61916c5-f387-4f82-a963-9a2220b7121c"} 1

Expected Result

No duplicate values in the final output in Grafana / Loki.

Actual Result

The metric attributes deployment_environment, service_instance_id and service_name contain duplicate values concatenated with ';' when they appear in Loki/Grafana. For example, deployment_environment="qa;qa".

myservice_load_info {
  container="my-service",
  deployment_environment="qa;qa",
  endpoint="default",
  host_name="my-service-7586fcbf74-stwjn",
  http_scheme="http",
  instance="100.110.191.137:8080",
  job="my-service",
  namespace="ns",
  net_host_name="100.110.191.137",
  net_host_port="8080",
  pod="my-service-569c75b9c4-98gbq",
  server_address="100.110.191.137",
  server_port="8080",
  service="my-service",
  service_instance_id="100.110.191.137:8080;b61916c5-f387-4f82-a963-9a2220b7121c",
  service_name="my-service;my-service",
  service_namespace="my-namespace",
  service_version="1.10.0",
  status="Current",
  telemetry_sdk_language="dotnet",
  telemetry_sdk_name="opentelemetry",
  telemetry_sdk_version="1.9.0",
  url_scheme="http"
}
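
One plausible reading of the output above, as a minimal sketch rather than the exporter's actual code: assume attribute names are turned into Prometheus label names by replacing every character outside [A-Za-z0-9_] with '_', and that values whose names collapse to the same label are joined with ';'. The function names and the input dictionary are hypothetical. The dotted keys are assumed to come from the receiver and the resource processor, the underscore keys from target_info.

```python
import re


def sanitize(name):
    """Replace every character outside [A-Za-z0-9_] with an underscore."""
    return re.sub(r"[^A-Za-z0-9_]", "_", name)


def to_labels(attributes):
    """Sanitize keys and join values with ';' when two keys map to one label."""
    labels = {}
    for key, value in attributes.items():
        label = sanitize(key)
        if label in labels:
            # Two different attribute names collapsed to the same label name.
            labels[label] = labels[label] + ";" + value
        else:
            labels[label] = value
    return labels


# Hypothetical attribute set: dotted and underscore variants of the same name.
attributes = {
    "service.name": "my-service",
    "service.instance.id": "100.110.191.137:8080",
    "deployment.environment": "qa",
    "service_name": "my-service",
    "service_instance_id": "b61916c5-f387-4f82-a963-9a2220b7121c",
    "deployment_environment": "qa",
}

labels = to_labels(attributes)
print(labels["deployment_environment"])  # qa;qa
print(labels["service_name"])            # my-service;my-service
```

Under these assumptions, the three affected labels in the output are exactly the ones that exist in both a dotted and an underscore form.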

Collector version

0.104.0

Environment information

Environment

OS: Linux / Kubernetes

OpenTelemetry Collector configuration

targetAllocator:
  enabled: true
  serviceAccount: opentelemetry-collector
  prometheusCR:
    enabled: true
    serviceMonitorSelector: {}
    podMonitorSelector: {}
config:
  receivers:
    prometheus:
      config:
        scrape_configs:
          - job_name: 'opentelemetry-collector'
            scrape_interval: 5s
            static_configs:
              - targets: [ 'localhost:8888' ]
  processors:
    batch: {}
    resource/metrics:
      attributes:
        - action: insert
          key: deployment.environment
          value: qa
  exporters:
    prometheusremotewrite:
      endpoint: https://our-prometheus
      resource_to_telemetry_conversion:
        enabled: true
  service:
    pipelines:
      metrics:
        receivers:
          - prometheus
        processors:
          - batch
          - resource/metrics
        exporters:
          - prometheusremotewrite

Log output

No response

Additional context

No response

dashpole commented 2 weeks ago

Notes from triage:

What behavior did you expect? Did you expect the values from target_info to remain unchanged? Or did you expect the names from your prometheus job/instance to be kept?

robertcoltheart commented 2 weeks ago

OK, you're right: dropping the duplicate attributes removes the duplicated values. I guess the expectation when using this feature is that the attribute names are all sanitized the same way, which would eliminate these duplicates, especially since Prometheus seems to always output '_' regardless of the internal name, and Grafana always displays underscores too.
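
To decide which attributes to drop, it can help to list the names that collapse to the same label. The following is a rough sketch under the assumption that sanitization simply replaces characters outside [A-Za-z0-9_] with '_'. The helper names and the key list are hypothetical and only mirror the attributes seen in this issue.

```python
import re
from collections import defaultdict


def sanitize(name):
    """Assumed sanitization: anything outside [A-Za-z0-9_] becomes '_'."""
    return re.sub(r"[^A-Za-z0-9_]", "_", name)


def colliding_keys(keys):
    """Group attribute names by sanitized label; keep groups with 2+ names."""
    groups = defaultdict(list)
    for key in keys:
        groups[sanitize(key)].append(key)
    return {label: names for label, names in groups.items() if len(names) > 1}


# Hypothetical attribute names, modelled on the ones in this issue.
keys = [
    "service.name",
    "service_name",
    "service.instance.id",
    "service_instance_id",
    "deployment.environment",
    "deployment_environment",
    "service_namespace",
    "host_name",
]

collisions = colliding_keys(keys)
for label, names in collisions.items():
    print(label, "<-", names)
```

Each reported group names one label that would receive a ';'-joined value; dropping all but one attribute in the group before export avoids the concatenation.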