microsoft / ApplicationInsights-Java

Application Insights for Java
http://aka.ms/application-insights
Other
296 stars 200 forks source link

valueMax reporting wrong number #2806

Open razum90 opened 1 year ago

razum90 commented 1 year ago

Expected behavior

I would expect valueMax to have a value less than valueSum.

Actual behavior

valueMax is greater than valueSum.

To Reproduce

This is my method that is called to gather metrics using micrometer:

@RequiredArgsConstructor
@Slf4j
public class TimeRecorder {
    private final static String METHOD = "method";
    private final static String CLASS = "class";
    private final MeterRegistry registry;

    public <T> T recorded(Supplier<T> supplier) {
        StackTraceElement[] stackTrace = Thread.currentThread().getStackTrace();

        if (stackTrace.length >= 3) {
            StackTraceElement stackTraceElement = stackTrace[2];

            return registry.timer("method.timing",
                            List.of(
                                    Tag.of(CLASS, stackTraceElement.getClassName()),
                                    Tag.of(METHOD, stackTraceElement.getMethodName())
                            ))
                    .record(supplier);
        } else {
            log.error("unable to determine callee, skipping metrics");
            return supplier.get();
        }
    }
}

System information

Please provide the following information:

Logs

Turn on SDK logs and attach/paste them to the issue. If using an application server, also attach any relevant server logs.

Be sure to remove any private information from the logs before posting!

Screenshots

If applicable, add screenshots to help explain your problem.

Metric

trask commented 1 year ago

hi @razum90, thanks for reporting this! I'm able to repro this locally, it looks like the valueMax is the max since JVM start instead of the max for the given interval. We will investigate further, thanks.

razum90 commented 1 year ago

Thanks @trask!

trask commented 1 year ago

hi @razum90, this looks like a micrometer bug: https://github.com/micrometer-metrics/micrometer/issues/3298

Application Insights Java also supports OpenTelemetry Metrics, which I confirmed doesn't have this issue:

    private static void micrometer() throws InterruptedException {
        Timer timer = Metrics.timer("test");
        timer.record(Duration.ofMillis(100));
        MINUTES.sleep(1);
        timer.record(Duration.ofMillis(10));
        MINUTES.sleep(1);
    }

    private static void opentelemetry() throws InterruptedException {
        LongHistogram timer = GlobalOpenTelemetry.getMeter("test").histogramBuilder("test").ofLongs().build();
        timer.record(100);
        MINUTES.sleep(1);
        timer.record(10);
        MINUTES.sleep(1);
    }

(unfortunately OpenTelemetry Metrics doesn't have a dedicated Timer yet)

ghost commented 1 year ago

This issue has been automatically marked as stale because it has been marked as requiring author feedback but has not had any activity for 7 days. It will be closed if no further activity occurs within 7 days of this comment.

razum90 commented 1 year ago

Hi @trask, oh okay. Will have to follow up that issue then. Thanks for looking into it.

jaschenk commented 1 year ago

Hello will this issue ever be fixed? As I have this same issue in the following environment:

JDK 17 Spring Boot 3.0.5 Micrometer Core: [INFO] +- io.micrometer:micrometer-core:jar:1.10.5:compile [INFO] | +- io.micrometer:micrometer-commons:jar:1.10.5:compile [INFO] | +- io.micrometer:micrometer-observation:jar:1.10.5:compile [INFO] | +- org.hdrhistogram:HdrHistogram:jar:2.1.12:runtime [INFO] | - org.latencyutils:LatencyUtils:jar:2.0.3:runtime

Please advise if there is a work around, as we rely heavily on these metric capabilities.

Efforts to squash most appreciated. Any additional information I can provide, please advise.

Sincerely, Jeff Schenk

trask commented 1 year ago

Hello will this issue ever be fixed?

hi @razum90! we are planning to replace our existing Micrometer instrumentation with the more recent upstream OpenTelemetry Micrometer instrumentation, which re-implements the Micrometer API using the OpenTelemetry Metrics SDK implementation, and therefore should avoid the above Micrometer bug.

(another option is to use the OpenTelemetry Metrics API directly, depending on how integrated your Micrometer usage is)

jaschenk commented 1 year ago

@trask That is good information. In interim I created a simple TimerWrapper object. Will review the OpenTelemetry Metrics SDK implementation for moving forward.

Many thanks, Jeff