open-telemetry / opentelemetry-collector-contrib

Contrib repository for the OpenTelemetry Collector
https://opentelemetry.io
Apache License 2.0

[exporter/datadog] Missing histogram metrics in datadog payload #24377

Closed karmingc closed 1 year ago

karmingc commented 1 year ago

Component(s)

exporter/datadog

What happened?

Description

Hi, I am currently using Spring Boot v3.1.1, which uses Micrometer to send metrics to the OpenTelemetry Collector. However, I am having trouble debugging metrics of type histogram: they appear in the collector's logging output but seem to be missing from the Datadog payload step.

I'd like to get some pointers on the best way to debug this.

spring boot config:

# spring boot application.yml
management:
  endpoints:
    web:
      exposure:
        include: "*"
        exclude: env,beans
  server:
    port: 8081
  otlp:
    metrics:
      export:
        enabled: true
        url: http://localhost:4318/v1/metrics
        aggregationTemporality: "delta"

Steps to Reproduce

  1. Create a Spring Boot application that uses the Micrometer OTLP registry (https://micrometer.io/docs/registry/otlp)
  2. Run the OpenTelemetry Collector locally with Datadog as an exporter

Expected Result

All histogram metrics should be part of the Datadog payload step.

Actual Result

All histogram metrics are excluded from the Datadog payload step.

Collector version

v0.81

Environment information

Environment

using the Docker image otel/opentelemetry-collector-contrib

OpenTelemetry Collector configuration

receivers:
  otlp:
    protocols:
      grpc:
      http:

exporters:
  datadog:
    api:
      site: datadoghq.com
      key: ${env:DD_API_KEY}
    metrics:
      histograms:
        mode: distributions
  logging:
    verbosity: detailed

processors:
  batch:

extensions:
  health_check:
  pprof:
  zpages:

service:
  extensions: [pprof, zpages, health_check]
  pipelines:
    metrics:
      receivers: [otlp]
      processors: [batch]
      exporters: [logging, datadog]
  telemetry:
    logs:
      level: "debug"

Log output

otel-otel-collector-1  | Metric #91
otel-otel-collector-1  | Descriptor:
otel-otel-collector-1  |      -> Name: jvm.gc.pause
otel-otel-collector-1  |      -> Description: Time spent in GC pause
otel-otel-collector-1  |      -> Unit: milliseconds
otel-otel-collector-1  |      -> DataType: Histogram
otel-otel-collector-1  |      -> AggregationTemporality: Delta
otel-otel-collector-1  | HistogramDataPoints #0
otel-otel-collector-1  | Data point attributes:
otel-otel-collector-1  |      -> action: Str(end of minor GC)
otel-otel-collector-1  |      -> cause: Str(G1 Evacuation Pause)
otel-otel-collector-1  |      -> env: Str(local-kc-test)
otel-otel-collector-1  |      -> gc: Str(G1 Young Generation)
otel-otel-collector-1  | StartTimestamp: 2023-07-18 21:33:00 +0000 UTC
otel-otel-collector-1  | Timestamp: 2023-07-18 21:34:00 +0000 UTC
otel-otel-collector-1  | Count: 1
otel-otel-collector-1  | Sum: 2.000000
otel-otel-collector-1  | Max: 2.000000

A few minutes later:

otel-otel-collector-1  | Metric #91
otel-otel-collector-1  | Descriptor:
otel-otel-collector-1  |      -> Name: jvm.gc.pause
otel-otel-collector-1  |      -> Description: Time spent in GC pause
otel-otel-collector-1  |      -> Unit: milliseconds
otel-otel-collector-1  |      -> DataType: Histogram
otel-otel-collector-1  |      -> AggregationTemporality: Delta
otel-otel-collector-1  | HistogramDataPoints #0
otel-otel-collector-1  | Data point attributes:
otel-otel-collector-1  |      -> action: Str(end of minor GC)
otel-otel-collector-1  |      -> cause: Str(G1 Evacuation Pause)
otel-otel-collector-1  |      -> env: Str(local-kc-test)
otel-otel-collector-1  |      -> gc: Str(G1 Young Generation)
otel-otel-collector-1  | StartTimestamp: 2023-07-18 21:37:00 +0000 UTC
otel-otel-collector-1  | Timestamp: 2023-07-18 21:38:00 +0000 UTC
otel-otel-collector-1  | Count: 1
otel-otel-collector-1  | Sum: 2.000000
otel-otel-collector-1  | Max: 2.000000

Additional context

In these logs, the same metric shows up again a few minutes later, but always with HistogramDataPoints #0.

songy23 commented 1 year ago

Hi @karmingc, it looks like there are no buckets in your histogram metric. If that's the case, you might want to enable the send_aggregation_metrics option so that the sum, count, min and max metrics are exported. E.g.

exporters:
  datadog:
    api:
      site: datadoghq.com
      key: ${env:DD_API_KEY}
    metrics:
      histograms:
        mode: distributions
        send_aggregation_metrics: true

karmingc commented 1 year ago

Hi @songy23, I think there might be some misconfiguration in the instrumented histograms themselves that leads to the missing buckets, so I don't think it's an issue with the exporter. Thank you for the suggestion as well!
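
For anyone debugging a similar case, the check described above (does a histogram data point carry any buckets?) can be run directly against the logging exporter output. The sketch below is only an illustration: the function name histograms_without_buckets is hypothetical, and it assumes that bucketed data points are printed with "ExplicitBounds #…" and "Buckets #…" lines in the detailed output, which are absent from the jvm.gc.pause logs in this issue. Note also that the "#0" in "HistogramDataPoints #0" appears to be the index of the data point, not a bucket count.

```python
from typing import Dict, List

# Assumption: the logging exporter (verbosity: detailed) prints lines starting
# with these prefixes for every histogram data point that carries buckets.
BUCKET_PREFIXES = ("ExplicitBounds #", "Buckets #")


def histograms_without_buckets(log_text: str) -> List[str]:
    """Return the names of Histogram metrics that never show a bucket line."""
    has_buckets: Dict[str, bool] = {}
    name = None
    is_histogram = False
    in_descriptor = False

    for raw in log_text.splitlines():
        # Drop a docker-compose style "container  | " prefix if there is one.
        line = raw.split("|", 1)[-1].strip()

        if line == "Descriptor:":
            in_descriptor, name, is_histogram = True, None, False
            continue

        if in_descriptor and line.startswith("-> "):
            key, _, value = line[3:].partition(": ")
            if key == "Name":
                name = value
            elif key == "DataType":
                is_histogram = value == "Histogram"
                if is_histogram and name is not None:
                    has_buckets.setdefault(name, False)
            continue

        # Any other line ends the descriptor section of the current metric.
        in_descriptor = False
        if is_histogram and name is not None and line.startswith(BUCKET_PREFIXES):
            has_buckets[name] = True

    return sorted(n for n, seen in has_buckets.items() if not seen)


if __name__ == "__main__":
    sample = """\
otel-otel-collector-1  | Descriptor:
otel-otel-collector-1  |      -> Name: jvm.gc.pause
otel-otel-collector-1  |      -> DataType: Histogram
otel-otel-collector-1  |      -> AggregationTemporality: Delta
otel-otel-collector-1  | HistogramDataPoints #0
otel-otel-collector-1  | Count: 1
otel-otel-collector-1  | Sum: 2.000000
otel-otel-collector-1  | Max: 2.000000
"""
    print(histograms_without_buckets(sample))
```

If a metric name shows up in the result, the instrumentation is sending only count, sum and max for it, which matches the suggestion to either enable send_aggregation_metrics on the exporter or fix the histogram configuration on the application side.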