open-telemetry / opentelemetry-java-instrumentation

OpenTelemetry auto-instrumentation and instrumentation libraries for Java
https://opentelemetry.io
Apache License 2.0

The type of kafka_producer_connection_count keeps changing between counter and gauge #7302

Open tuhao1020 opened 1 year ago

tuhao1020 commented 1 year ago

OpenTelemetry Java agent version: 1.20.2
Kafka version: 3.1.1
OpenTelemetry Collector version: 0.66.0

OpenTelemetry Collector config file:

receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:14317
  otlp/dummy: # Dummy receiver for the metrics pipeline
    protocols:
      grpc:
        endpoint: localhost:65535

processors:
  servicegraph:
    metrics_exporter: prometheus/servicegraph # Exporter to send metrics to
    dimensions: [cluster, namespace] # Additional dimensions (labels) to be added to the metrics extracted from the resource and span attributes
    store: # Configuration for the in-memory store
      ttl: 2s # Value to wait for an edge to be completed
      max_items: 200 # Amount of edges that will be stored in the storeMap      

exporters:
  prometheus/servicegraph:
    endpoint: 0.0.0.0:9091  # to prometheus
  otlp:
    endpoint: http://localhost:4317  # to jaeger
    tls:
      insecure: true 
  logging:
    logLevel: debug    

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [servicegraph]
      exporters: [logging, otlp]
    metrics/servicegraph:
      receivers: [otlp]
      processors: []
      exporters: [prometheus/servicegraph]

When I refresh http://localhost:9091/metrics in a browser, the type of kafka_producer_connection_count keeps changing between counter and gauge:

# HELP kafka_producer_connection_count The current number of active connections.
# TYPE kafka_producer_connection_count counter
kafka_producer_connection_count{client_id="producer-1",job="otel-demo-provider",kafka_version="3.1.1",spring_id="kafkaProducerFactory.producer-1"} 1
# HELP kafka_producer_connection_count The current number of active connections.
# TYPE kafka_producer_connection_count gauge
kafka_producer_connection_count{client_id="producer-1",job="otel-demo-provider"} 1
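
A quick way to confirm the flip without refreshing the browser by hand is to scan the exposition text for metric names that are declared with more than one type. The following is a minimal sketch, assuming the output of several scrapes has been saved and concatenated into one string; the function name `conflicting_types` and the `SAMPLE` text are illustrative and not part of any OpenTelemetry or Prometheus API.

```python
from collections import defaultdict


def conflicting_types(exposition_text):
    """Return {metric_name: set_of_types} for names declared with more than one type."""
    declared = defaultdict(set)
    for line in exposition_text.splitlines():
        parts = line.split()
        # A type declaration has the form: "# TYPE <metric_name> <type>"
        if len(parts) == 4 and parts[0] == "#" and parts[1] == "TYPE":
            declared[parts[2]].add(parts[3])
    return {name: types for name, types in declared.items() if len(types) > 1}


# Two scrapes from the report above, concatenated (label sets shortened for illustration).
SAMPLE = """\
# HELP kafka_producer_connection_count The current number of active connections.
# TYPE kafka_producer_connection_count counter
kafka_producer_connection_count{client_id="producer-1",job="otel-demo-provider",kafka_version="3.1.1"} 1
# HELP kafka_producer_connection_count The current number of active connections.
# TYPE kafka_producer_connection_count gauge
kafka_producer_connection_count{client_id="producer-1",job="otel-demo-provider"} 1
"""

if __name__ == "__main__":
    for name, types in conflicting_types(SAMPLE).items():
        print(name, sorted(types))
```

Running the same check against text scraped directly from the agent (if it is configured with its own Prometheus exporter) and against the collector endpoint may help narrow down which side introduces the second type.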

mateuszrzeszutek commented 1 year ago

Hey @tuhao1020, what kind of metrics does the javaagent export on its own, without the collector? Let's first make sure there's no interference on the collector side.

tuhao1020 commented 1 year ago

@mateuszrzeszutek The metrics exported by the Java agent always keep the gauge type. Do you mean the collector modified the type? In theory, the collector does not modify this type, right?

mateuszrzeszutek commented 1 year ago

Honestly, I've no idea if the collector modifies it or not - which is why we should first try to pinpoint which of these two (agent, collector) causes this to happen.

tuhao1020 commented 1 year ago

@mateuszrzeszutek Does #7271 have anything to do with this? I'm using Kafka 3.3.1, but the kafka_version of the metrics is 3.1.1.

mateuszrzeszutek commented 1 year ago

No, that PR is about Spring Kafka, which has a different versioning scheme from Kafka.

jojotong commented 1 year ago

Same problem.

fyuan1316 commented 3 months ago

Hi @mateuszrzeszutek, I can still reproduce this error in a simple Spring Kafka demo (https://github.com/MaheshIare/spring-boot-kafka-demo/tree/master?tab=readme-ov-file).

Any ideas on how to troubleshoot this?


Java agent version: 1.24.0

Using the configuration below:

Java Env config:

OTEL_EXPORTER_OTLP_METRICS_ENDPOINT=http://localhost:14318;OTEL_EXPORTER_PROMETHEUS_PORT=10000;OTEL_METRICS_EXPORTER=otlp;OTEL_SERVICE_NAME=test-kfk

CLI arguments:

-Dotel.instrumentation.runtime-metrics.experimental-metrics.enabled=true

VM:

-javaagent:/Users/yuan/Dev/IdeaProjects/otel-java-instrumentation/alauda-extension/build/libs/opentelemetry-javaagent-ext.jar

OTel Collector version: 0.100.0

Config:

extensions:
# The health_check extension is mandatory for this chart.
# Without the health_check extension the collector will fail the readiness and liveness probes.
# The health_check extension can be modified, but should never be removed.
  health_check: {}
  memory_ballast:
    size_in_percentage: 40
receivers:
  otlp/traces:
    protocols:
      grpc:
        endpoint: :14317
  otlp/metrics:
    protocols:
      grpc:
        endpoint: :14318
  zipkin:

exporters:
  logging:
    loglevel: info
  otlp/metrics:
    endpoint: :14318
    tls:
      insecure: true
  prometheus:
    endpoint: :8889
service:
  extensions:
    - health_check
    - memory_ballast
  telemetry:
    logs:
      level: info
    metrics:
      level: detailed
      address: :8888
  pipelines:
    metrics:
      receivers: [otlp/metrics]
      exporters: [prometheus]

The OTel Collector log is as follows:

2024-05-29T18:44:55.442+0800    error   prometheusexporter@v0.82.0/log.go:23    error gathering metrics: collected metric kafka_consumer_connection_count label:{name:"client_id"  value:"consumer-c92f3eab-2f4f-4e96-a394-1983d69e24ae-0"}  label:{name:"job"  value:"test-kfk"}  gauge:{value:2} should be a Counter
        {"kind": "exporter", "data_type": "metrics", "name": "prometheus"}
github.com/open-telemetry/opentelemetry-collector-contrib/exporter/prometheusexporter.(*promLogger).Println
        github.com/open-telemetry/opentelemetry-collector-contrib/exporter/prometheusexporter@v0.82.0/log.go:23
github.com/prometheus/client_golang/prometheus/promhttp.HandlerForTransactional.func1
        github.com/prometheus/client_golang@v1.16.0/prometheus/promhttp/http.go:144
net/http.HandlerFunc.ServeHTTP
        net/http/server.go:2122
net/http.(*ServeMux).ServeHTTP
        net/http/server.go:2500
go.opentelemetry.io/collector/config/confighttp.(*decompressor).ServeHTTP
        go.opentelemetry.io/collector/config/confighttp@v0.82.0/compression.go:147
go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp.(*Handler).ServeHTTP
        go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp@v0.42.0/handler.go:212
go.opentelemetry.io/collector/config/confighttp.(*clientInfoHandler).ServeHTTP
        go.opentelemetry.io/collector/config/confighttp@v0.82.0/clientinfohandler.go:28
net/http.serverHandler.ServeHTTP
        net/http/server.go:2936
net/http.(*conn).serve
        net/http/server.go:1995
2024-05-29T18:44:55.443+0800    error   prometheusexporter@v0.82.0/log.go:23    error gathering metrics: collected metric kafka_consumer_connection_count label:{name:"client_id"  value:"consumer-c92f3eab-2f4f-4e96-a394-1983d69e24ae-0"}  label:{name:"job"  value:"test-kfk"}  label:{name:"kafka_version"  value:"2.6.0"}  label:{name:"spring_id"  value:"kafkaConsumerFactory.consumer-c92f3eab-2f4f-4e96-a394-1983d69e24ae-0"}  counter:{value:2} should be a Gauge
        {"kind": "exporter", "data_type": "metrics", "name": "prometheus"}
github.com/open-telemetry/opentelemetry-collector-contrib/exporter/prometheusexporter.(*promLogger).Println
        github.com/open-telemetry/opentelemetry-collector-contrib/exporter/prometheusexporter@v0.82.0/log.go:23
github.com/prometheus/client_golang/prometheus/promhttp.HandlerForTransactional.func1
        github.com/prometheus/client_golang@v1.16.0/prometheus/promhttp/http.go:144
net/http.HandlerFunc.ServeHTTP
        net/http/server.go:2122
net/http.(*ServeMux).ServeHTTP
        net/http/server.go:2500
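
To compare the conflicting series side by side, the "should be a Counter" / "should be a Gauge" lines in the collector log can be reduced to metric name, label names, and the observed and expected types. The following is a rough sketch that assumes the error message layout shown in the log above; the regular expressions and the helper name `parse_type_conflicts` are made up for illustration and are not part of the collector or of client_golang.

```python
import re

# Matches the message part of the collector error lines shown above (assumed layout).
ERROR_RE = re.compile(
    r"collected metric (?P<name>\S+) (?P<labels>.*?)\s+"
    r"(?P<actual>gauge|counter):\{value:[^}]*\} should be a (?P<expected>\w+)"
)
LABEL_RE = re.compile(r'label:\{name:"([^"]*)"\s+value:"([^"]*)"\}')


def parse_type_conflicts(log_text):
    """Extract metric name, label names, and observed/expected type per error line."""
    conflicts = []
    for match in ERROR_RE.finditer(log_text):
        labels = dict(LABEL_RE.findall(match.group("labels")))
        conflicts.append({
            "metric": match.group("name"),
            "label_names": sorted(labels),
            "actual": match.group("actual"),
            "expected": match.group("expected").lower(),
        })
    return conflicts


# Message portions of the two error lines from the log above.
SAMPLE_LOG = """\
error gathering metrics: collected metric kafka_consumer_connection_count label:{name:"client_id"  value:"consumer-c92f3eab-2f4f-4e96-a394-1983d69e24ae-0"}  label:{name:"job"  value:"test-kfk"}  gauge:{value:2} should be a Counter
error gathering metrics: collected metric kafka_consumer_connection_count label:{name:"client_id"  value:"consumer-c92f3eab-2f4f-4e96-a394-1983d69e24ae-0"}  label:{name:"job"  value:"test-kfk"}  label:{name:"kafka_version"  value:"2.6.0"}  label:{name:"spring_id"  value:"kafkaConsumerFactory.consumer-c92f3eab-2f4f-4e96-a394-1983d69e24ae-0"}  counter:{value:2} should be a Gauge
"""

if __name__ == "__main__":
    for conflict in parse_type_conflicts(SAMPLE_LOG):
        print(conflict)
```

In the log above, the gauge series carries only the client_id and job labels, while the counter series also carries kafka_version and spring_id. That difference might indicate that two separate metric sources register the same name with different types, though nothing in this thread confirms it.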