open-telemetry / opentelemetry-java-instrumentation

OpenTelemetry auto-instrumentation and instrumentation libraries for Java
https://opentelemetry.io
Apache License 2.0

Explain difference between http.server.requests.max (gauge) vs http.server.requests (histogram's "max") #11845

Open artemik opened 1 month ago

artemik commented 1 month ago

Describe the bug

In a Spring Boot app with Micrometer instrumentation and all the defaults, OpenTelemetry outputs both a histogram named http.server.requests and a gauge named http.server.requests.max (log output below).

The gauge appears to come from here: https://github.com/open-telemetry/opentelemetry-java-instrumentation/blob/9ecf7965aa455d41ea8cc0761b6c6b6eeb106324/instrumentation/micrometer/micrometer-1.5/library/src/main/java/io/opentelemetry/instrumentation/micrometer/v1_5/OpenTelemetryTimer.java#L80

2024-07-17T16:42:06.275Z  INFO 11576 --- [backend] [cMetricReader-1] i.o.e.logging.LoggingMetricExporter      : metric: ImmutableMetricData{resource=Resource{schemaUrl=https://opentelemetry.io/schemas/1.24.0, attributes={host.arch="amd64", host.name="DESKTOP", os.description="Windows 10 10.0", os.type="windows", process.command_line="C:\Program Files\Eclipse Adoptium\jdk-17.0.10.7-hotspot\bin\java.exe -agentlib:jdwp=transport=dt_socket,address=127.0.0.1:11799,suspend=y,server=n -Dotel.instrumentation.micrometer.enabled=true -Dotel.instrumentation.jdbc-datasource.enabled=true -Dotel.instrumentation.spring-webmvc.enabled=true -Dotel.metrics.exporter=console -Dotel.jmx.target.system=tomcat -XX:TieredStopAtLevel=1 -Dspring.output.ansi.enabled=always -Dcom.sun.management.jmxremote -Dspring.jmx.enabled=true -Dspring.liveBeansView.mbeanDomain -Dspring.application.admin.enabled=true -Dmanagement.endpoints.jmx.exposure.include=* -javaagent:C:\Users\Artem\AppData\Local\JetBrains\IntelliJIdea2024.1\captureAgent\debugger-agent.jar=file:/C:/Users/Artem/AppData/Local/Temp/capture.props -Dfile.encoding=UTF-8 com.backend.BackendApplication", process.executable.path="C:\Program Files\Eclipse Adoptium\jdk-17.0.10.7-hotspot\bin\java.exe", process.pid=11576, process.runtime.description="Eclipse Adoptium OpenJDK 64-Bit Server VM 17.0.10+7", process.runtime.name="OpenJDK Runtime Environment", process.runtime.version="17.0.10+7", service.instance.id="e5198df6-aa26-469a-a58a-19c99409ac27", service.name="backend", telemetry.distro.name="opentelemetry-spring-boot-starter", telemetry.distro.version="2.5.0-alpha", telemetry.sdk.language="java", telemetry.sdk.name="opentelemetry", telemetry.sdk.version="1.39.0"}}, instrumentationScopeInfo=InstrumentationScopeInfo{name=io.opentelemetry.micrometer-1.5, version=null, schemaUrl=null, attributes={}}, name=http.server.requests, description=, unit=s, type=HISTOGRAM, data=ImmutableHistogramData{aggregationTemporality=CUMULATIVE, 
points=[ImmutableHistogramPointData{getStartEpochNanos=1721233445765907200, getEpochNanos=1721234526244083100, getAttributes={error="none", exception="none", method="GET", outcome="SUCCESS", status="200", uri="/rest/services/onboarding"}, getSum=0.12368740099999999, getCount=3, hasMin=true, getMin=0.009611, hasMax=true, getMax=0.100249601, getBoundaries=[], getCounts=[3], getExemplars=[]}, ...]}}

2024-07-17T16:42:06.260Z  INFO 11576 --- [backend] [cMetricReader-1] i.o.e.logging.LoggingMetricExporter      : metric: ImmutableMetricData{resource=Resource{schemaUrl=https://opentelemetry.io/schemas/1.24.0, attributes={host.arch="amd64", host.name="DESKTOP", os.description="Windows 10 10.0", os.type="windows", process.command_line="C:\Program Files\Eclipse Adoptium\jdk-17.0.10.7-hotspot\bin\java.exe -agentlib:jdwp=transport=dt_socket,address=127.0.0.1:11799,suspend=y,server=n -Dotel.instrumentation.micrometer.enabled=true -Dotel.instrumentation.jdbc-datasource.enabled=true -Dotel.instrumentation.spring-webmvc.enabled=true -Dotel.metrics.exporter=console -Dotel.jmx.target.system=tomcat -XX:TieredStopAtLevel=1 -Dspring.output.ansi.enabled=always -Dcom.sun.management.jmxremote -Dspring.jmx.enabled=true -Dspring.liveBeansView.mbeanDomain -Dspring.application.admin.enabled=true -Dmanagement.endpoints.jmx.exposure.include=* -javaagent:C:\Users\Artem\AppData\Local\JetBrains\IntelliJIdea2024.1\captureAgent\debugger-agent.jar=file:/C:/Users/Artem/AppData/Local/Temp/capture.props -Dfile.encoding=UTF-8 com.backend.BackendApplication", process.executable.path="C:\Program Files\Eclipse Adoptium\jdk-17.0.10.7-hotspot\bin\java.exe", process.pid=11576, process.runtime.description="Eclipse Adoptium OpenJDK 64-Bit Server VM 17.0.10+7", process.runtime.name="OpenJDK Runtime Environment", process.runtime.version="17.0.10+7", service.instance.id="e5198df6-aa26-469a-a58a-19c99409ac27", service.name="backend", telemetry.distro.name="opentelemetry-spring-boot-starter", telemetry.distro.version="2.5.0-alpha", telemetry.sdk.language="java", telemetry.sdk.name="opentelemetry", telemetry.sdk.version="1.39.0"}}, instrumentationScopeInfo=InstrumentationScopeInfo{name=io.opentelemetry.micrometer-1.5, version=null, schemaUrl=null, attributes={}}, name=http.server.requests.max, description=, unit=s, type=DOUBLE_GAUGE, 
data=ImmutableGaugeData{points=[ImmutableDoublePointData{startEpochNanos=1721233445765907200, epochNanos=1721234526244083100, attributes={error="none", exception="none", method="GET", outcome="SUCCESS", status="200", uri="/rest/services/onboarding"}, value=0.0, exemplars=[]}, ...]}}

The difference between the "max"es seems to be:

It seems that a time-window gauge of this kind is added alongside every other histogram metric as well.
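To make the suspected difference concrete, here is a minimal, self-contained sketch. It assumes that the histogram's max is cumulative since the start of the series and that the gauge only reflects samples from a recent time window. It is not the Micrometer or OpenTelemetry implementation: the class names `CumulativeMax` and `WindowedMax` and the one-minute window are hypothetical, chosen only for illustration. The two recorded durations are the getMin and getMax values from the log output above.

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class MaxComparison {

    /** Cumulative max: once a value is recorded, the max never decreases. */
    public static final class CumulativeMax {
        private double max = 0.0;

        public void record(double seconds) {
            max = Math.max(max, seconds);
        }

        public double value() {
            return max;
        }
    }

    /** Time-window max: only samples newer than the window are considered. */
    public static final class WindowedMax {
        private static final class Sample {
            final long atNanos;
            final double seconds;

            Sample(long atNanos, double seconds) {
                this.atNanos = atNanos;
                this.seconds = seconds;
            }
        }

        private final long windowNanos;
        private final Deque<Sample> samples = new ArrayDeque<>();

        public WindowedMax(long windowNanos) {
            this.windowNanos = windowNanos;
        }

        public void record(long nowNanos, double seconds) {
            samples.addLast(new Sample(nowNanos, seconds));
        }

        public double value(long nowNanos) {
            // Drop samples that have aged out of the window.
            while (!samples.isEmpty()
                    && nowNanos - samples.peekFirst().atNanos > windowNanos) {
                samples.removeFirst();
            }
            double max = 0.0;
            for (Sample s : samples) {
                max = Math.max(max, s.seconds);
            }
            return max;
        }
    }

    public static void main(String[] args) {
        long second = 1_000_000_000L;
        long window = 60 * second; // hypothetical window length

        CumulativeMax cumulative = new CumulativeMax();
        WindowedMax windowed = new WindowedMax(window);

        // Two request durations, taken from getMin and getMax in the log.
        cumulative.record(0.009611);
        windowed.record(0L, 0.009611);
        cumulative.record(0.100249601);
        windowed.record(second, 0.100249601);

        // Shortly after the requests, both report the same max.
        System.out.println("soon after: cumulative=" + cumulative.value()
                + " windowed=" + windowed.value(2 * second));

        // Ten idle minutes later, only the cumulative max remembers it.
        System.out.println("10 min later: cumulative=" + cumulative.value()
                + " windowed=" + windowed.value(600 * second));
    }
}
```

Under these assumptions, the windowed reading falls back to 0.0 once all samples have aged out, while the cumulative reading stays at the largest recorded duration. That would be consistent with the log above, where the gauge reports value=0.0 and the histogram reports getMax=0.100249601 for the same attributes.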

Steps to reproduce

-

Expected behavior

  1. Can anybody explain why the OpenTelemetry Micrometer instrumentation adds a time-window "max" gauge?
  2. The "http.server.requests.max" time-window gauge becomes "http_server_requests_max" in Prometheus and, as I understand it, overrides the histogram's max value because both would have the same name. So out of the box it's impossible to get the histogram's max value, should anybody need it. Is this on purpose? Is it going to stay like that in the future?
  3. This setup is very confusing. Can there be some documentation explaining what is happening and why?

I'm filing this as a bug because it's not clear whether it's only a documentation issue, and because it pertains to the Java instrumentation specifically.

Actual behavior

-

Javaagent or library instrumentation version

2.5.0

Environment

JDK: 17
OS: Windows 10

Additional context

No response

trask commented 1 month ago

Hi @artemik! I dug around and found where the feature was added. Can you check out #5303 and the discussion in #5292 to see if that helps at all?