micrometer-metrics / micrometer

An application observability facade for the most popular observability tools. Think SLF4J, but for observability.
https://micrometer.io
Apache License 2.0

ClassCastException io.prometheus.metrics.model.snapshots.HistogramSnapshot$HistogramDataPointSnapshot vs io.prometheus.metrics.model.snapshots.SummarySnapshot$SummaryDataPointSnapshot when scraping two PrometheusTimers, one with publishPercentile, the other one without after upgrade to Micrometer 1.13 #5194

Closed jensbaitingerbosch closed 2 weeks ago

jensbaitingerbosch commented 3 weeks ago

Describe the bug

When there are two Timers with the same name but different tag sets, one with the histogram enabled and the other without, and you scrape the metrics using the PrometheusTextFormatWriter (as the Spring Boot Actuator endpoint does), the following ClassCastException is thrown:

java.lang.ClassCastException: class io.prometheus.metrics.model.snapshots.HistogramSnapshot$HistogramDataPointSnapshot cannot be cast to class io.prometheus.metrics.model.snapshots.SummarySnapshot$SummaryDataPointSnapshot (io.prometheus.metrics.model.snapshots.HistogramSnapshot$HistogramDataPointSnapshot and io.prometheus.metrics.model.snapshots.SummarySnapshot$SummaryDataPointSnapshot are in unnamed module of loader 'app')
    at io.prometheus.metrics.expositionformats.PrometheusTextFormatWriter.writeSummary(PrometheusTextFormatWriter.java:208)
    at io.prometheus.metrics.expositionformats.PrometheusTextFormatWriter.write(PrometheusTextFormatWriter.java:70)
    at org.springframework.boot.actuate.metrics.export.prometheus.PrometheusOutputFormat$1.write(PrometheusOutputFormat.java:47)
    at org.springframework.boot.actuate.metrics.export.prometheus.PrometheusScrapeEndpoint.scrape(PrometheusScrapeEndpoint.java:60)
    at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:103)
    at java.base/java.lang.reflect.Method.invoke(Method.java:580)
    at org.springframework.util.ReflectionUtils.invokeMethod(ReflectionUtils.java:281)
    at org.springframework.boot.actuate.endpoint.invoke.reflect.ReflectiveOperationInvoker.invoke(ReflectiveOperationInvoker.java:74)
    at org.springframework.boot.actuate.endpoint.annotation.AbstractDiscoveredOperation.invoke(AbstractDiscoveredOperation.java:60)
    at org.springframework.boot.actuate.endpoint.web.servlet.AbstractWebMvcEndpointHandlerMapping$ServletWebOperationAdapter.handle(AbstractWebMvcEndpointHandlerMapping.java:327)
    at org.springframework.boot.actuate.endpoint.web.servlet.AbstractWebMvcEndpointHandlerMapping$OperationHandler.handle(AbstractWebMvcEndpointHandlerMapping.java:434)
    at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:103)
    at java.base/java.lang.reflect.Method.invoke(Method.java:580)
    at org.springframework.web.method.support.InvocableHandlerMethod.doInvoke(InvocableHandlerMethod.java:255)
    at org.springframework.web.method.support.InvocableHandlerMethod.invokeForRequest(InvocableHandlerMethod.java:188)
    at org.springframework.web.servlet.mvc.method.annotation.ServletInvocableHandlerMethod.invokeAndHandle(ServletInvocableHandlerMethod.java:118)
    at org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.invokeHandlerMethod(RequestMappingHandlerAdapter.java:926)
    at org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.handleInternal(RequestMappingHandlerAdapter.java:831)
    at org.springframework.web.servlet.mvc.method.AbstractHandlerMethodAdapter.handle(AbstractHandlerMethodAdapter.java:87)
    at org.springframework.web.servlet.DispatcherServlet.doDispatch(DispatcherServlet.java:1089)
    at org.springframework.web.servlet.DispatcherServlet.doService(DispatcherServlet.java:979)
    at org.springframework.web.servlet.FrameworkServlet.processRequest(FrameworkServlet.java:1014)
    at org.springframework.web.servlet.FrameworkServlet.doGet(FrameworkServlet.java:903)
    at jakarta.servlet.http.HttpServlet.service(HttpServlet.java:564)
    at org.springframework.web.servlet.FrameworkServlet.service(FrameworkServlet.java:885)
    at jakarta.servlet.http.HttpServlet.service(HttpServlet.java:658)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:195)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:140)
    at org.apache.tomcat.websocket.server.WsFilter.doFilter(WsFilter.java:51)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:164)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:140)
    at org.springframework.web.filter.RequestContextFilter.doFilterInternal(RequestContextFilter.java:100)
    at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:116)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:164)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:140)
    at org.springframework.web.filter.FormContentFilter.doFilterInternal(FormContentFilter.java:93)
    at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:116)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:164)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:140)
    at org.springframework.web.filter.ServerHttpObservationFilter.doFilterInternal(ServerHttpObservationFilter.java:109)
    at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:116)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:164)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:140)
    at org.springframework.web.filter.CharacterEncodingFilter.doFilterInternal(CharacterEncodingFilter.java:201)
    at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:116)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:164)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:140)
    at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:167)
    at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:90)
    at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:482)
    at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:115)
    at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:93)
    at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:74)
    at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:344)
    at org.apache.coyote.http11.Http11Processor.service(Http11Processor.java:389)
    at org.apache.coyote.AbstractProcessorLight.process(AbstractProcessorLight.java:63)
    at org.apache.coyote.AbstractProtocol$ConnectionHandler.process(AbstractProtocol.java:896)
    at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.doRun(NioEndpoint.java:1741)
    at org.apache.tomcat.util.net.SocketProcessorBase.run(SocketProcessorBase.java:52)
    at java.base/java.lang.VirtualThread.run(VirtualThread.java:309)

Environment

To Reproduce

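    import io.micrometer.core.instrument.Clock;
    import io.micrometer.core.instrument.Timer;
    import io.micrometer.prometheusmetrics.PrometheusConfig;
    import io.micrometer.prometheusmetrics.PrometheusMeterRegistry;
    import io.prometheus.metrics.expositionformats.PrometheusTextFormatWriter;
    import io.prometheus.metrics.model.registry.PrometheusRegistry;

    PrometheusRegistry prometheusRegistry = new PrometheusRegistry();
    PrometheusMeterRegistry registry = new PrometheusMeterRegistry(PrometheusConfig.DEFAULT, prometheusRegistry, Clock.SYSTEM);

    // First timer: no histogram configured
    registry.timer("timer", "tag", "one");

    // Second timer: same name, different tag value, percentile histogram enabled
    Timer.builder("timer")
        .tag("tag", "two")
        .publishPercentileHistogram(true)
        .register(registry);

    // Writing the scrape result in the text format throws the ClassCastException
    PrometheusTextFormatWriter writer = new PrometheusTextFormatWriter(false);
    writer.write(System.out, prometheusRegistry.scrape());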

Expected behavior

The scrape produces valid output that Prometheus can parse, for example:

# HELP timer_seconds  
# TYPE timer_seconds histogram
timer_seconds_count{tag="one"} 0
timer_seconds_sum{tag="one"} 0.0
timer_seconds_bucket{tag="two",le="0.001"} 0
timer_seconds_bucket{tag="two",le="0.001048576"} 0
timer_seconds_bucket{tag="two",le="0.001398101"} 0
timer_seconds_bucket{tag="two",le="0.001747626"} 0
timer_seconds_bucket{tag="two",le="0.002097151"} 0
timer_seconds_bucket{tag="two",le="0.002446676"} 0
timer_seconds_bucket{tag="two",le="0.002796201"} 0
timer_seconds_bucket{tag="two",le="0.003145726"} 0
timer_seconds_bucket{tag="two",le="0.003495251"} 0
timer_seconds_bucket{tag="two",le="0.003844776"} 0
timer_seconds_bucket{tag="two",le="0.004194304"} 0
timer_seconds_bucket{tag="two",le="0.005592405"} 0
timer_seconds_bucket{tag="two",le="0.006990506"} 0
timer_seconds_bucket{tag="two",le="0.008388607"} 0
timer_seconds_bucket{tag="two",le="0.009786708"} 0
timer_seconds_bucket{tag="two",le="0.011184809"} 0
timer_seconds_bucket{tag="two",le="0.01258291"} 0
timer_seconds_bucket{tag="two",le="0.013981011"} 0
timer_seconds_bucket{tag="two",le="0.015379112"} 0
timer_seconds_bucket{tag="two",le="0.016777216"} 0
timer_seconds_bucket{tag="two",le="0.022369621"} 0
timer_seconds_bucket{tag="two",le="0.027962026"} 0
timer_seconds_bucket{tag="two",le="0.033554431"} 0
timer_seconds_bucket{tag="two",le="0.039146836"} 0
timer_seconds_bucket{tag="two",le="0.044739241"} 0
timer_seconds_bucket{tag="two",le="0.050331646"} 0
timer_seconds_bucket{tag="two",le="0.055924051"} 0
timer_seconds_bucket{tag="two",le="0.061516456"} 0
timer_seconds_bucket{tag="two",le="0.067108864"} 0
timer_seconds_bucket{tag="two",le="0.089478485"} 0
timer_seconds_bucket{tag="two",le="0.111848106"} 0
timer_seconds_bucket{tag="two",le="0.134217727"} 0
timer_seconds_bucket{tag="two",le="0.156587348"} 0
timer_seconds_bucket{tag="two",le="0.178956969"} 0
timer_seconds_bucket{tag="two",le="0.20132659"} 0
timer_seconds_bucket{tag="two",le="0.223696211"} 0
timer_seconds_bucket{tag="two",le="0.246065832"} 0
timer_seconds_bucket{tag="two",le="0.268435456"} 0
timer_seconds_bucket{tag="two",le="0.357913941"} 0
timer_seconds_bucket{tag="two",le="0.447392426"} 0
timer_seconds_bucket{tag="two",le="0.536870911"} 0
timer_seconds_bucket{tag="two",le="0.626349396"} 0
timer_seconds_bucket{tag="two",le="0.715827881"} 0
timer_seconds_bucket{tag="two",le="0.805306366"} 0
timer_seconds_bucket{tag="two",le="0.894784851"} 0
timer_seconds_bucket{tag="two",le="0.984263336"} 0
timer_seconds_bucket{tag="two",le="1.073741824"} 0
timer_seconds_bucket{tag="two",le="1.431655765"} 0
timer_seconds_bucket{tag="two",le="1.789569706"} 0
timer_seconds_bucket{tag="two",le="2.147483647"} 0
timer_seconds_bucket{tag="two",le="2.505397588"} 0
timer_seconds_bucket{tag="two",le="2.863311529"} 0
timer_seconds_bucket{tag="two",le="3.22122547"} 0
timer_seconds_bucket{tag="two",le="3.579139411"} 0
timer_seconds_bucket{tag="two",le="3.937053352"} 0
timer_seconds_bucket{tag="two",le="4.294967296"} 0
timer_seconds_bucket{tag="two",le="5.726623061"} 0
timer_seconds_bucket{tag="two",le="7.158278826"} 0
timer_seconds_bucket{tag="two",le="8.589934591"} 0
timer_seconds_bucket{tag="two",le="10.021590356"} 0
timer_seconds_bucket{tag="two",le="11.453246121"} 0
timer_seconds_bucket{tag="two",le="12.884901886"} 0
timer_seconds_bucket{tag="two",le="14.316557651"} 0
timer_seconds_bucket{tag="two",le="15.748213416"} 0
timer_seconds_bucket{tag="two",le="17.179869184"} 0
timer_seconds_bucket{tag="two",le="22.906492245"} 0
timer_seconds_bucket{tag="two",le="28.633115306"} 0
timer_seconds_bucket{tag="two",le="30.0"} 0
timer_seconds_bucket{tag="two",le="+Inf"} 0
timer_seconds_count{tag="two"} 0
timer_seconds_sum{tag="two"} 0.0
# HELP timer_seconds_max  
# TYPE timer_seconds_max gauge
timer_seconds_max{tag="one"} 0.0
timer_seconds_max{tag="two"} 0.0


jensbaitingerbosch commented 3 weeks ago

This might be caused by the difference between https://github.com/micrometer-metrics/micrometer/blob/23b6c43d9ce7a758dd5aa4620c776358f3a86039/implementations/micrometer-registry-prometheus/src/main/java/io/micrometer/prometheusmetrics/PrometheusMeterRegistry.java#L252 and https://github.com/micrometer-metrics/micrometer/blob/23b6c43d9ce7a758dd5aa4620c776358f3a86039/implementations/micrometer-registry-prometheus/src/main/java/io/micrometer/prometheusmetrics/PrometheusMeterRegistry.java#L279, where either a HistogramSnapshot or a SummaryDataPointSnapshot is returned, depending on whether the histogram is enabled. Therefore, this bug might also occur for DistributionSummaries, not only for Timers.
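To illustrate the suspected mechanism, here is a minimal, self-contained sketch that does not use Micrometer or the Prometheus client at all. The names `MixedSnapshotDemo`, `DataPoint`, `SummaryPoint`, `HistogramPoint`, `writeAsSummary` and `hasMixedTypes` are hypothetical stand-ins chosen for this example. It assumes, based only on the stack trace above, that the writer picks one type for the whole metric family and casts every data point to it.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical model of the failure; these are NOT the real Prometheus client classes.
public class MixedSnapshotDemo {

    interface DataPoint {
        String tag();
    }

    // Stand-in for a summary-style data point (timer without histogram).
    static final class SummaryPoint implements DataPoint {
        private final String tag;
        SummaryPoint(String tag) { this.tag = tag; }
        public String tag() { return tag; }
    }

    // Stand-in for a histogram-style data point (timer with histogram enabled).
    static final class HistogramPoint implements DataPoint {
        private final String tag;
        HistogramPoint(String tag) { this.tag = tag; }
        public String tag() { return tag; }
    }

    // Models a writer that assumes every point in the family is a summary point.
    static String writeAsSummary(List<DataPoint> points) {
        StringBuilder out = new StringBuilder();
        for (DataPoint p : points) {
            SummaryPoint s = (SummaryPoint) p; // throws ClassCastException for a HistogramPoint
            out.append("timer_seconds_count{tag=\"").append(s.tag()).append("\"} 0\n");
        }
        return out.toString();
    }

    // Returns true when the points of one family have more than one concrete type.
    static boolean hasMixedTypes(List<DataPoint> points) {
        return points.stream().map(Object::getClass).distinct().count() > 1;
    }

    public static void main(String[] args) {
        List<DataPoint> family = Arrays.asList(new SummaryPoint("one"), new HistogramPoint("two"));
        System.out.println("mixed types: " + hasMixedTypes(family));
        try {
            writeAsSummary(family);
        } catch (ClassCastException e) {
            System.out.println("ClassCastException while writing the mixed family");
        }
    }
}
```

If this is indeed what happens, any two meters that share a name but differ in whether the histogram is enabled would end up in one family with mixed data point types, which matches the reproducer above.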

jonatan-ivanov commented 2 weeks ago

Thank you for the issue, and extra thanks for the minimal Java reproducer! I think this is a duplicate of https://github.com/micrometer-metrics/micrometer/issues/5150, so let me close this one and continue the discussion there. Please let us know if you disagree and we can reopen it.

jonatan-ivanov commented 2 weeks ago

Duplicate of https://github.com/micrometer-metrics/micrometer/issues/5150