micrometer-metrics / micrometer

An application observability facade for the most popular observability tools. Think SLF4J, but for observability.
https://micrometer.io
Apache License 2.0
4.39k stars 966 forks source link

Metric will be skipped when producing scrape output for PrometheusMeterRegistry #5192

Closed danielksb closed 2 weeks ago

danielksb commented 3 weeks ago

Describe the bug If two metrics have the same name but different tags, they will be registered in the PrometheusMeterRegistry as two distinct meters. After calling PrometheusMeterRegistry::scrape one of the two meters will not be part of the output. See below for an example.

Environment

To Reproduce How to reproduce the bug:

package org.example;

import io.micrometer.core.instrument.Counter;
import io.micrometer.core.instrument.Meter;
import io.micrometer.prometheusmetrics.PrometheusConfig;
import io.micrometer.prometheusmetrics.PrometheusMeterRegistry;

import java.util.stream.Collectors;

public class Main {
    public static void main(String[] args) {
        PrometheusMeterRegistry meterRegistry = new PrometheusMeterRegistry(PrometheusConfig.DEFAULT);
        Counter.builder("test")
               .tag("status", "aaa")
               .register(meterRegistry)
               .increment();

        Counter.builder("test")
               .tag("status", "bbb")
               .tag("tag_for_status_bbb", "xyz") // adding an additional tag seems to be the cause of the problem
               .register(meterRegistry)
               .increment();

        // We check the content of the registry by simply printing the IDs of the registered meters.
        System.out.println(
                meterRegistry.getMeters().stream()
                             .map(Meter::getId)
                             .map(Meter.Id::toString)
                             .collect(Collectors.joining("\n")));
        // Output will be:
        // MeterId{name='test', tags=[tag(status=aaa)]}
        // MeterId{name='test', tags=[tag(status=bbb),tag(tag_for_status_bbb=xyz)]}

        // The registry contains two meters, this is the expected behaviour.
        // Now we create the output for the Prometheus scrape API.

        System.out.println(meterRegistry.scrape());
        // Output will be:
        // # HELP test_total
        // # TYPE test_total counter
        // test_total{status="aaa"} 1.0

        // The entry for the second meter is missing.
    }
}

Expected behavior The function call to PrometheusMeterRegistry::scrape should contain entries for all registered meters:

# HELP test_total  
# TYPE test_total counter
test_total{status="aaa"} 1.0
test_total{status="bbb",tag_for_status_bbb="xyz" } 1.0
ppussar commented 3 weeks ago

I can confirm this bug in 1.13.0

ppussar commented 3 weeks ago

Downgrading the Prometheus Java client as mentioned here solves the problem: https://github.com/micrometer-metrics/micrometer/wiki/1.13-Migration-Guide

danielksb commented 3 weeks ago

I have tried downgrading to another version using the following pom.xml and the problem persists:

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>

  <groupId>org.example</groupId>
  <artifactId>micrometer-bug</artifactId>
  <version>1.0-SNAPSHOT</version>

  <properties>
    <maven.compiler.source>19</maven.compiler.source>
    <maven.compiler.target>19</maven.compiler.target>
    <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
  </properties>

  <dependencies>
    <dependency>
      <groupId>io.micrometer</groupId>
      <artifactId>micrometer-registry-prometheus</artifactId>
      <version>1.13.0</version>
    </dependency>
  </dependencies>

</project>

When I switch to version 1.6.0 I still get the same behaviour.

When I switch to version 1.5.0 I receive the following error:

Exception in thread "main" java.lang.IllegalArgumentException: Prometheus requires that all meters with the same name have the same set of tag keys. There is already an existing meter named 'test_total' containing tag keys [status]. The meter you are attempting to register has keys [status, tag_for_status_bbb].
    at io.micrometer.prometheus.PrometheusMeterRegistry.lambda$applyToCollector$17(PrometheusMeterRegistry.java:429)
    at java.base/java.util.concurrent.ConcurrentHashMap.compute(ConcurrentHashMap.java:1940)
    at io.micrometer.prometheus.PrometheusMeterRegistry.applyToCollector(PrometheusMeterRegistry.java:413)
    at io.micrometer.prometheus.PrometheusMeterRegistry.newCounter(PrometheusMeterRegistry.java:105)
    at io.micrometer.core.instrument.MeterRegistry.lambda$registerMeterIfNecessary$5(MeterRegistry.java:559)
    at io.micrometer.core.instrument.MeterRegistry.getOrCreateMeter(MeterRegistry.java:612)
    at io.micrometer.core.instrument.MeterRegistry.registerMeterIfNecessary(MeterRegistry.java:566)
    at io.micrometer.core.instrument.MeterRegistry.registerMeterIfNecessary(MeterRegistry.java:559)
    at io.micrometer.core.instrument.MeterRegistry.counter(MeterRegistry.java:282)
    at io.micrometer.core.instrument.Counter$Builder.register(Counter.java:128)
    at org.example.Main.main(Main.java:21)

Having a look at the release notes for 1.6.0 doesn't reveal a change in the behaviour of tag keys. So my guess is, that the error message is correct and should have been thrown in the versions 1.6.0+, but disappeared for some reason.

Can someone confirm the correct behaviour? Is Prometheus able to handle meters with the same name but different tag keys?

ppussar commented 3 weeks ago

After some debugging I figured out that the problem was the total suffix of one of my metrics. I have two metrics:

which leads with the new prometheus version to one metric abc(tags=[]) as the new behaviour of the Registry is to remove the total suffix. The second metric is not processed as the tags are not equal. @danielksb the tag keys must be equal if you register a metric with the same name. Otherwise it is ignored.

danielksb commented 3 weeks ago

I checked the Prometheus spec on writing client libraries: https://prometheus.io/docs/instrumenting/writing_clientlibs/#labels

They mention:

Client libraries MUST NOT allow users to have different label names for the same metric for Gauge/Counter/Summary/Histogram or any other Collector offered by the library.

This means my expectations were wrong. However prio to version 1.5 there was an error message when creating an invalid meter. Should I open a ticket for adding the error message again to warn users about their mistake or was the error message deliberately removed?

jonatan-ivanov commented 2 weeks ago

Duplicate of https://github.com/micrometer-metrics/micrometer/issues/877

jonatan-ivanov commented 2 weeks ago

Doing this is discouraged by the Prometheus server and validated by the Prometheus client. Instead of this:

test_total{status="aaa"} 1.0
test_total{status="bbb",tag_for_status_bbb="xyz" } 1.0

you should do this:

test_total{status="aaa",tag_for_status_bbb="none" } 1.0
test_total{status="bbb",tag_for_status_bbb="xyz" } 1.0

Please let me know if this is not the case (duplicate of https://github.com/micrometer-metrics/micrometer/issues/877) and we can reopen this issue.

jonatan-ivanov commented 2 weeks ago

Should I open a ticket for adding the error message again to warn users about their mistake or was the error message deliberately removed?

It was intentionally removed (users asked for it).

shakuzen commented 1 week ago

Should I open a ticket for adding the error message again to warn users about their mistake or was the error message deliberately removed?

It was intentional as @jonatan-ivanov mentioned, but see the corresponding documentation. You can configure a callback for meter registration failure that logs and/or throws an exception.