open-telemetry / opentelemetry-collector-contrib

Contrib repository for the OpenTelemetry Collector
https://opentelemetry.io
Apache License 2.0

error prometheusexporter@v0.85.0/accumulator.go:94 - failed to translate metric #26725

Closed martinrw closed 1 year ago

martinrw commented 1 year ago

Component(s)

exporter/prometheus

What happened?

Description

We are seeing the error "failed to translate metric" in the Prometheus exporter logs for some of our metrics. We collect metrics with the OpenTelemetry agent and libraries from a number of Spring Boot, Node.js and Python services. Most of them work perfectly fine, but this error shows up quite frequently in our logs.

I have been unable to tie these errors to a specific service, runtime or agent version, even after enabling debug logs.

It happens for several different metrics, such as http.server.duration, http.client.request.size and http.client.duration.

Steps to Reproduce

Expected Result

Metrics are written to the /metrics endpoint to be scraped by Prometheus.

Actual Result

An error message appears in the logs and the data point is dropped.

Collector version

0.85

Environment information

No response

OpenTelemetry Collector configuration

config:
  exporters:
    prometheus:
      endpoint: "0.0.0.0:9464"
      resource_to_telemetry_conversion:
        enabled: true
      enable_open_metrics: true
      metric_expiration: 3m

  extensions:
    # The health_check extension is mandatory for this chart.
    # Without the health_check extension the collector will fail the readiness and liveness probes.
    # The health_check extension can be modified, but should never be removed.
    health_check: {}
    zpages: {}
    pprof: {}
    memory_ballast:
      size_in_percentage: 30
  processors:
    memory_limiter:
      check_interval: 1s
      limit_percentage: 50
      spike_limit_percentage: 20
    batch:
      send_batch_size: 10000
      send_batch_max_size: 11000
      timeout: 2s
    # Delete unnecessary attributes from our metrics
    resource:
      attributes:
        - key: telemetry.sdk.name
          action: delete
        - key: telemetry.sdk.version
          action: delete
        - key: telemetry.sdk.language
          action: delete
        - key: telemetry.auto.version
          action: delete
        - key: container.id
          action: delete
        - key: process.command_args
          action: delete
        - key: process.command_line
          action: delete
        - key: process.command
          action: delete
        - key: process.executable.path
          action: delete
    transform:
      error_mode: ignore
      metric_statements:
        - context: metric
          statements:
            - set(description, "The duration of the inbound HTTP request") where name == "http.server.duration"
            - set(description, "The duration of the inbound HTTP request") where name == "http.client.duration"
            - set(description, "The current number of threads having NEW state") where name == "jvm.threads.states"
            - set(description, "The number of concurrent HTTP requests that are currently in-flight") where name == "http.server.active_requests" 
            - set(description, "") where name == "http.server.requests"
            - set(description, "") where name == "http.server.requests.max"
            - set(description, "Number of log events that were enabled by the effective log level") where name == "logback.events"
            - set(description, "") where name == "spring.data.repository.invocations.max"
            - set(description, "Duration of repository invocations") where name == "spring.data.repository.invocations"
            - set(description, "Time taken for the application to be ready to service requests") where name == "application.ready.time"
            - set(description, "Time taken (ms) to start the application") where name == "application.started.time"
            - set(description, "The size of HTTP request messages") where name == "http.client.request.size_bytes"
            - set(description, "The size of HTTP response messages") where name == "http.client.response.size_bytes"
    k8sattributes:
      extract:
        metadata:
          - k8s.namespace.name
          - k8s.deployment.name
          - k8s.statefulset.name
          - k8s.daemonset.name
          - k8s.cronjob.name
          - k8s.job.name
          - k8s.pod.name
          - k8s.namespace.name
          - k8s.node.name
  receivers:
    jaeger: null
    prometheus: null
    zipkin: null
    otlp:
      protocols:
        grpc:
          endpoint: 0.0.0.0:4317
        http:
          endpoint: 0.0.0.0:4318
  service:
    telemetry:
      # Logs for the collector itself
      logs:
        level: info
    extensions:
      - health_check
      - memory_ballast
    pipelines:
      traces:
        exporters:
          - spanmetrics
        processors:
          - memory_limiter
          - batch
          - k8sattributes
        receivers:
          - otlp
      logs: null
      metrics:
        exporters:
          - prometheus
        processors:
          - memory_limiter
          - batch
          - resource
          - transform
          - k8sattributes
        receivers:
          - otlp
          - spanmetrics
  connectors:
    spanmetrics:
      namespace: span.metrics

Log output

error   prometheusexporter@v0.85.0/accumulator.go:94    failed to translate metric  {"kind": "exporter", "data_type": "metrics", "name": "prometheus", "data_type": "\u0000", "metric_name": "http.client.request.size"}
github.com/open-telemetry/opentelemetry-collector-contrib/exporter/prometheusexporter.(*lastValueAccumulator).addMetric
    github.com/open-telemetry/opentelemetry-collector-contrib/exporter/prometheusexporter@v0.85.0/accumulator.go:94
github.com/open-telemetry/opentelemetry-collector-contrib/exporter/prometheusexporter.(*lastValueAccumulator).Accumulate
    github.com/open-telemetry/opentelemetry-collector-contrib/exporter/prometheusexporter@v0.85.0/accumulator.go:71
github.com/open-telemetry/opentelemetry-collector-contrib/exporter/prometheusexporter.(*collector).processMetrics
    github.com/open-telemetry/opentelemetry-collector-contrib/exporter/prometheusexporter@v0.85.0/collector.go:92
github.com/open-telemetry/opentelemetry-collector-contrib/exporter/prometheusexporter.(*prometheusExporter).ConsumeMetrics
    github.com/open-telemetry/opentelemetry-collector-contrib/exporter/prometheusexporter@v0.85.0/prometheus.go:85
go.opentelemetry.io/collector/exporter/exporterhelper.(*metricsRequest).Export
    go.opentelemetry.io/collector/exporter@v0.85.0/exporterhelper/metrics.go:60
go.opentelemetry.io/collector/exporter/exporterhelper.(*timeoutSender).send
    go.opentelemetry.io/collector/exporter@v0.85.0/exporterhelper/common.go:269
go.opentelemetry.io/collector/exporter/exporterhelper.(*baseRequestSender).send
    go.opentelemetry.io/collector/exporter@v0.85.0/exporterhelper/common.go:54
go.opentelemetry.io/collector/exporter/exporterhelper.(*baseRequestSender).send
    go.opentelemetry.io/collector/exporter@v0.85.0/exporterhelper/common.go:54
go.opentelemetry.io/collector/exporter/exporterhelper.(*metricsSenderWithObservability).send
    go.opentelemetry.io/collector/exporter@v0.85.0/exporterhelper/metrics.go:179
go.opentelemetry.io/collector/exporter/exporterhelper.(*baseRequestSender).send
    go.opentelemetry.io/collector/exporter@v0.85.0/exporterhelper/common.go:54
go.opentelemetry.io/collector/exporter/exporterhelper.(*baseExporter).send
    go.opentelemetry.io/collector/exporter@v0.85.0/exporterhelper/common.go:216
go.opentelemetry.io/collector/exporter/exporterhelper.NewMetricsExporter.func1
    go.opentelemetry.io/collector/exporter@v0.85.0/exporterhelper/metrics.go:100
go.opentelemetry.io/collector/consumer.ConsumeMetricsFunc.ConsumeMetrics
    go.opentelemetry.io/collector/consumer@v0.85.0/metrics.go:25
github.com/open-telemetry/opentelemetry-collector-contrib/pkg/resourcetotelemetry.(*wrapperMetricsExporter).ConsumeMetrics
    github.com/open-telemetry/opentelemetry-collector-contrib/pkg/resourcetotelemetry@v0.85.0/resource_to_telemetry.go:32
go.opentelemetry.io/collector/processor/processorhelper.NewMetricsProcessor.func1
    go.opentelemetry.io/collector/processor@v0.85.0/processorhelper/metrics.go:60
go.opentelemetry.io/collector/consumer.ConsumeMetricsFunc.ConsumeMetrics
    go.opentelemetry.io/collector/consumer@v0.85.0/metrics.go:25
go.opentelemetry.io/collector/processor/processorhelper.NewMetricsProcessor.func1
    go.opentelemetry.io/collector/processor@v0.85.0/processorhelper/metrics.go:60
go.opentelemetry.io/collector/consumer.ConsumeMetricsFunc.ConsumeMetrics
    go.opentelemetry.io/collector/consumer@v0.85.0/metrics.go:25
go.opentelemetry.io/collector/processor/processorhelper.NewMetricsProcessor.func1
    go.opentelemetry.io/collector/processor@v0.85.0/processorhelper/metrics.go:60
go.opentelemetry.io/collector/consumer.ConsumeMetricsFunc.ConsumeMetrics
    go.opentelemetry.io/collector/consumer@v0.85.0/metrics.go:25
go.opentelemetry.io/collector/processor/batchprocessor.(*batchMetrics).export
    go.opentelemetry.io/collector/processor/batchprocessor@v0.85.0/batch_processor.go:442
go.opentelemetry.io/collector/processor/batchprocessor.(*shard).sendItems
    go.opentelemetry.io/collector/processor/batchprocessor@v0.85.0/batch_processor.go:256
go.opentelemetry.io/collector/processor/batchprocessor.(*shard).start
    go.opentelemetry.io/collector/processor/batchprocessor@v0.85.0/batch_processor.go:218

Additional context

No response

vpmedia commented 1 year ago

I'm seeing the same error quite frequently. Is "data_type": "\u0000" a valid data type? Just guessing.
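
One possible reading of the "\u0000" value, offered as a guess rather than a confirmed diagnosis: if the log field is built by converting an integer-backed metric type enum straight to a string, Go treats the integer as a Unicode code point, so an unset type (value 0) becomes a NUL character, which a JSON encoder prints as "\u0000". The sketch below only reproduces that language behaviour. `metricType`, `asCodePoint` and `jsonField` are hypothetical names made up for illustration and are not taken from the exporter's source.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// metricType is a hypothetical stand-in for an integer-backed enum
// whose zero value means "no data type was set on the metric".
type metricType int32

const (
	metricTypeEmpty metricType = iota // 0: type never set
	metricTypeGauge                   // 1
	metricTypeSum                     // 2
)

// String returns a human-readable name for the enum value.
func (t metricType) String() string {
	switch t {
	case metricTypeGauge:
		return "Gauge"
	case metricTypeSum:
		return "Sum"
	default:
		return "Empty"
	}
}

// asCodePoint converts the enum to a string by treating the integer as a
// Unicode code point. This is what a plain string(...) conversion of an
// integer does in Go; it does not format the number or call String().
func asCodePoint(t metricType) string {
	return string(rune(t))
}

// jsonField renders a string the way a JSON log line would show it.
func jsonField(s string) string {
	b, err := json.Marshal(s)
	if err != nil {
		return ""
	}
	return string(b)
}

func main() {
	// Code-point conversion of the zero value yields a NUL character.
	fmt.Println(jsonField(asCodePoint(metricTypeEmpty)))
	// Using the String method yields a readable name instead.
	fmt.Println(jsonField(metricTypeEmpty.String()))
}
```

Under that assumption, "\u0000" would not be a data type in its own right. It would indicate that the metric reached the exporter with no data type set, which would also explain why it cannot be translated.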

crobert-1 commented 1 year ago

This is a duplicate of #13443; there's more discussion there.