open-telemetry / opentelemetry-collector-contrib

Contrib repository for the OpenTelemetry Collector
https://opentelemetry.io
Apache License 2.0

The trace_id/span_id of exemplars does not change when using the prometheusremotewrite exporter #30830

Open toughnoah opened 6 months ago

toughnoah commented 6 months ago

Component(s)

exporter/prometheusremotewrite

What happened?

Description

When I use the spanmetrics connector, the trace_id/span_id of the exemplars does not change if I use the prometheusremotewrite exporter to write directly into the backend Prometheus/Mimir.

Steps to Reproduce

Simply start a demo that produces traces and sends them to the otel collector; an example command is sketched below.
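For example, a synthetic load can be generated with telemetrygen (only an illustration; the endpoint assumes the OTLP gRPC port from the configuration further down, so adjust it to your setup):

    telemetrygen traces --otlp-insecure --otlp-endpoint localhost:4318 --traces 10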

Expected Result

The trace_id and span_id of the exemplar should change when I start a new trace.

Actual Result

The trace_id/span_id is always the same, and the only way to fix it is to restart the otel collector (see the attached screenshots).

Collector version

v0.91.0

Environment information

Kubernetes 1.25

OpenTelemetry Collector configuration

connectors:
  spanmetrics:
    histogram:
      explicit:
        buckets: [100us, 1ms, 2ms, 6ms, 10ms, 100ms, 250ms]
    exemplars:
      enabled: true
      max_per_data_point: 1000
    metrics_flush_interval: 30s
receivers:
  otlp:
    protocols:
      http:
        endpoint: :4317
      grpc:
        endpoint: :4318
service:
  telemetry:
    logs:
      level: "debug"
  extensions: []
  pipelines:
    metrics:
      receivers: [otlp, spanmetrics]
      exporters: [prometheusremotewrite]
    traces:
      receivers: [otlp]
      exporters: [spanmetrics]
exporters:
  prometheusremotewrite: # the PRW exporter, to ingest metrics to backend
    endpoint: http://mimir-distributed-nginx.grafana/api/v1/push
    remote_write_queue:
      enabled: false
    resource_to_telemetry_conversion:
      enabled: true

Log output

No response

Additional context

No response

github-actions[bot] commented 6 months ago

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

github-actions[bot] commented 6 months ago

Pinging code owners for connector/spanmetrics: @portertech. See Adding Labels via Comments if you do not have permissions to add labels yourself.

github-actions[bot] commented 4 months ago

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

ankitpatel96 commented 3 months ago

Hi, I have been unable to reproduce this issue locally. I've configured the collector similarly to yours, and I'm running Prometheus locally in Docker with the default configuration (plus flags to enable exemplar storage and remote write). To generate traces I'm running telemetrygen traces --otlp-insecure --duration 5m --rate 4.

When I look at Prometheus, each of the exemplars has different span and trace IDs. Do you have any more details on your setup, or can you try using something like telemetrygen to generate your traces, to rule out your data?
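If it helps, one way to inspect the exemplars Prometheus actually stored is the query_exemplars API (rough example below, assuming a local Prometheus started with --enable-feature=exemplar-storage; the metric name duration_bucket is only a guess at what spanmetrics emits in your setup, so substitute whatever name you actually see). If the bug is present, every returned exemplar should carry the same trace_id/span_id labels:

    curl -G 'http://localhost:9090/api/v1/query_exemplars' \
      --data-urlencode 'query=duration_bucket' \
      --data-urlencode 'start=2024-01-30T00:00:00Z' \
      --data-urlencode 'end=2024-01-30T01:00:00Z'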

toughnoah commented 3 months ago

@ankitpatel96 Hi, I am using the v0.91.0 otel-collector, and I have a service that generates traces and spans. I have fixed it locally: I debugged into pkg/translator/prometheusremotewrite/helper.go and found that the getPromExemplars method just appends new exemplars to the slice, but Prometheus will always pick up the first exemplar of the slice.
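Roughly, my local change amounts to something like the sketch below (not the real helper.go code; the helper name is made up for illustration): keep only the most recently appended exemplar so the backend stops re-reporting the first one.

    package prwfix

    import "github.com/prometheus/prometheus/prompb"

    // keepLatestExemplar is a hypothetical helper illustrating the local fix
    // described above: instead of sending the whole accumulated slice in the
    // remote-write request, keep only the last (most recent) exemplar.
    func keepLatestExemplar(exemplars []prompb.Exemplar) []prompb.Exemplar {
        if len(exemplars) == 0 {
            return exemplars
        }
        return exemplars[len(exemplars)-1:]
    }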

ankitpatel96 commented 3 months ago

Thanks for the response! Can you clarify the problem? Is spanmetrics working as expected? Is the problem with Prometheus or with the prometheusremotewrite exporter? I glanced through the Prometheus codebase and at first glance (I really could be wrong) Prometheus does read through all the exemplars: https://github.com/prometheus/prometheus/blob/d699dc3c7706944aafa56682ede765398f925ef0/storage/remote/write_handler.go#L140-L147

toughnoah commented 3 months ago

@ankitpatel96 Hi, it is the prometheusremotewrite exporter; the prometheus exporter works fine. Yes, you can see in my screenshots that span metrics works fine, but with the prometheusremotewrite exporter, Mimir/Prometheus only picks up the first element in the exemplars slice, no matter how many times I start new traces. I modified the source code to pick up only the last exemplar instead, and that solved it.

ankitpatel96 commented 3 months ago

Have you filed a ticket with Mimir? It sounds like a bug in their system rather than in the prometheusremotewrite exporter.

github-actions[bot] commented 1 month ago

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.