payara / Payara

Payara Server is an open source middleware platform that supports reliable and secure deployments of Java EE (Jakarta EE) and MicroProfile applications in any environment: on premise, in the cloud or hybrid.
http://www.payara.fish

Bug Report: /metrics - invalid prometheus format (HELP duplicates)/FISH-8522 #6596

Closed rafi0101 closed 1 month ago

rafi0101 commented 7 months ago

Brief Summary

If this sounds familiar: yes, I based this report on the following issue, because it is almost the same error: https://github.com/payara/Payara/issues/4579

The default /metrics endpoint returns duplicated HELP entries, which do not comply with the Prometheus text-based exposition format. The specification at https://prometheus.io/docs/instrumenting/exposition_formats/#text-format-details says:

Only one HELP line may exist for any given metric name.

In my case Telegraf failed too:

2024-03-22T14:25:50Z E! [inputs.prometheus] Error in plugin: error reading metrics for "http://localhost:8080/metrics": decoding response failed: text format parsing error in line 176: second HELP line for metric name "gc_time_seconds_total"
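
To check a saved copy of the /metrics output for this violation without running Telegraf, a small script can count HELP lines per metric name. This is only a minimal sketch: the function name duplicate_help_metrics is hypothetical, and it assumes HELP lines have the form "# HELP <name> <docstring>" as described in the exposition format.

```python
from collections import Counter


def duplicate_help_metrics(exposition_text: str) -> dict:
    """Return {metric_name: count} for names with more than one HELP line."""
    counts = Counter()
    for line in exposition_text.splitlines():
        # Expected token layout: '#', 'HELP', metric name, docstring
        parts = line.split(None, 3)
        if len(parts) >= 3 and parts[0] == "#" and parts[1] == "HELP":
            counts[parts[2]] += 1
    return {name: n for name, n in counts.items() if n > 1}


if __name__ == "__main__":
    # Shortened sample in the shape of the Payara 6 output
    sample = (
        "# TYPE gc_total counter\n"
        "# HELP gc_total Displays the total number of collections.\n"
        'gc_total{mp_scope="base",name="G1 Young Generation"} 102.0\n'
        "# HELP gc_total Displays the total number of collections.\n"
        'gc_total{mp_scope="base",name="G1 Old Generation"} 0.0\n'
    )
    print(duplicate_help_metrics(sample))
```

An empty result means no metric name has a repeated HELP line.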

We upgraded from Payara 5.2022.3 to 6.2024.1, and this problem occurs on all our Payara Server and Payara Micro instances. It occurs with both the /base and /application metrics.

I found these issues and pull requests, but they didn't help me: https://github.com/eclipse/microprofile-metrics/issues/616 https://github.com/eclipse/microprofile-metrics/pull/638 https://github.com/eclipse/microprofile-metrics/issues/597

Maybe you can help me with this :)

Expected Outcome

Payara 5.2022.3

# TYPE base_gc_time_total counter
# HELP base_gc_time_total Displays the approximate accumulated collection elapsed time in milliseconds. This attribute displays -1 if the collection elapsed time is undefined for this collector. The JVM implementation may use a high resolution timer to measure the elapsed time. This attribute may display the same value even if the collection count has been incremented if the collection elapsed time is very short.
base_gc_time_total{name="G1 Young Generation"} 817924
base_gc_time_total{name="G1 Old Generation"} 14374

# TYPE base_gc_total_total counter
# HELP base_gc_total_total Displays the total number of collections that have occurred. This attribute lists -1 if the collection count is undefined for this collector.
base_gc_total_total{name="G1 Young Generation"} 12002
base_gc_total_total{name="G1 Old Generation"} 33

Current Outcome

Payara 6.2024.1 (MicroProfile 6.1, MicroProfile Metrics API 5.1.0)

# TYPE gc_time_seconds_total counter
# HELP gc_time_seconds_total Displays the approximate accumulated collection elapsed time in milliseconds. This attribute displays -1 if the collection elapsed time is undefined for this collector. The JVM implementation may use a high resolution timer to measure the elapsed time. This attribute may display the same value even if the collection count has been incremented if the collection elapsed time is very short.
gc_time_seconds_total{mp_scope="base",name="G1 Young Generation"} 6.88
# HELP gc_time_seconds_total Displays the approximate accumulated collection elapsed time in milliseconds. This attribute displays -1 if the collection elapsed time is undefined for this collector. The JVM implementation may use a high resolution timer to measure the elapsed time. This attribute may display the same value even if the collection count has been incremented if the collection elapsed time is very short.
gc_time_seconds_total{mp_scope="base",name="G1 Old Generation"} 0.0

# TYPE gc_total counter
# HELP gc_total Displays the total number of collections that have occurred. This attribute lists -1 if the collection count is undefined for this collector.
gc_total{mp_scope="base",name="G1 Young Generation"} 102.0
# HELP gc_total Displays the total number of collections that have occurred. This attribute lists -1 if the collection count is undefined for this collector.
gc_total{mp_scope="base",name="G1 Old Generation"} 0.0
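
Until the server output is corrected, one possible stopgap is to post-process the scraped text before passing it to a strict parser. The following is a sketch under assumptions, not something Payara or Telegraf provides: drop_repeated_help is a hypothetical helper, and it would have to run in a step you control between the endpoint and the consumer. It keeps the first HELP line per metric name and drops later repeats, leaving TYPE lines and samples untouched.

```python
def drop_repeated_help(exposition_text: str) -> str:
    """Keep only the first '# HELP' line for each metric name."""
    seen = set()
    kept = []
    for line in exposition_text.splitlines():
        # Expected token layout: '#', 'HELP', metric name, docstring
        parts = line.split(None, 3)
        if len(parts) >= 3 and parts[0] == "#" and parts[1] == "HELP":
            if parts[2] in seen:
                continue  # repeated HELP for a name already seen: skip it
            seen.add(parts[2])
        kept.append(line)
    # The text format expects every line, including the last, to end in '\n'
    return "".join(line + "\n" for line in kept)
```

This only hides the symptom for the consumer; the duplicate HELP lines are still emitted by the server.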

Reproducer

-

Operating System

Debian 11

JDK Version

openjdk-17-jre-headless

Payara Distribution

Payara Server Full Profile, Payara Micro

felixif commented 7 months ago

Hello @rafi0101,

I have tested the latest version of Payara and can confirm that the issue is reproducible. The HELP line for gc_total and gc_time_seconds_total is duplicated. I have raised an internal issue, codename FISH-8522, and our developers will look into it as soon as they have the bandwidth. Thank you very much for your bug report!

Best regards, Felix Ifrim