prometheus / common

Go libraries shared across Prometheus components and libraries.
Apache License 2.0
262 stars 315 forks source link

MetricFamilyToOpenMetrics produces output that cannot be parsed #319

Open fstab opened 3 years ago

fstab commented 3 years ago

I am debugging https://github.com/prometheus/jmx_exporter/issues/621, and I found I can reproduce the root cause of the error with the following test in github.com/prometheus/common/expfmt:

package expfmt

import (
    "bytes"
    "strings"
    "testing"
)

func TestJvmClassesLoaded(t *testing.T) {
    in := `
# HELP jvm_classes_loaded The number of classes that are currently loaded in the JVM
# TYPE jvm_classes_loaded gauge
jvm_classes_loaded 6129.0
# HELP jvm_classes_loaded_total The total number of classes that have been loaded since the JVM has started execution
# TYPE jvm_classes_loaded_total counter
jvm_classes_loaded_total 6143.0
`
    metricFamilies, err := parser.TextToMetricFamilies(strings.NewReader(in))
    if err != nil {
        t.Error(err)
    }
    openMetricsOutput := bytes.NewBuffer(make([]byte, 0))
    for _, metricFamily := range metricFamilies {
        if _, err := MetricFamilyToOpenMetrics(openMetricsOutput, metricFamily); err != nil {
            t.Fatal(err)
        }
    }
    if _, err := FinalizeOpenMetrics(openMetricsOutput); err != nil {
        t.Fatal(err)
    }
    _, err = parser.TextToMetricFamilies(strings.NewReader(openMetricsOutput.String()))
    if err != nil {
        t.Error(err)
    }
}

It fails with

=== RUN   TestJvmClassesLoaded
    jvm_classes_loaded_test.go:33: text format parsing error in line 4: second HELP line for metric name "jvm_classes_loaded"
--- FAIL: TestJvmClassesLoaded (0.00s)
FAIL
FAIL    github.com/prometheus/common/expfmt     0.002s
FAIL

As the error message indicates, the reason is that the _total suffix is truncated in HELP and TYPE (as defined in #214), and the resulting "short name" conflicts with the name of the gauge.

Two full serialization steps are needed to trigger the error (OpenMetrics format -> Go objects -> OpenMetrics format -> Go objects), so this might seem like an issue that doesn't occur in practice. However, in https://github.com/prometheus/jmx_exporter/issues/621 this happens because of the following setup:

jmx_exporter -> exporter_exporter -> Prometheus server

I'm not sure what to do about this. Should exporters prevent metrics from having the same "short name"?

beorn7 commented 3 years ago

There is a similar issue over in client_golang: https://github.com/prometheus/client_golang/issues/829

I have run into this use case often enough by now (seeing it in core Prometheus repos is just the tip of the iceberg) that I see it as a quite seriously breaking change introduced by OpenMetrics. Given that OpenMetrics tries to be "just what Prometheus has done so far and what has proven to work well", I can only encourage OpenMetrics to reconsider the behavior here.