eclipse / microprofile-metrics

microprofile-metrics
Apache License 2.0
100 stars 66 forks source link

OpenMetrics exporter does not respect openmetrics format #597

Closed rmannibucau closed 2 years ago

rmannibucau commented 4 years ago

Hi,

Not sure I missed something but using https://prometheus.io/docs/instrumenting/exposition_formats/#text-based-format as a reference, it seems mp-metrics does not match the exporter format due to the numerous metric types. To be portable you must only use counters and gauges - even histograms are not matching. Wonder if metrics shouldn't be split in metrics-openmetrics (agree name sucks but it is what it is) and metrics which would import the openmetrics part. Think it is ok to just have counters and gauges in the openmetrics sub-api (this is the main usage since all other ones can be compensated by the backend).

trotman23 commented 3 years ago

Also ran in to this today, specifically with the base_gc_total counter, but there are plenty of other duplicates.

...
# HELP base_gc_total Displays the total number of collections that have occurred. This attribute lists -1 if the collection count is undefined for this collector.
# TYPE base_gc_total counter
base_gc_total{name="PS MarkSweep1"} 5.0
...
# HELP base_gc_total Displays the total number of collections that have occurred. This attribute lists -1 if the collection count is undefined for this collector.
# TYPE base_gc_total counter
base_gc_total{name="PS Scavenge1"} 79.0

As pointed out above in the spec: https://prometheus.io/docs/instrumenting/exposition_formats/#comments-help-text-and-type-information

Lines with a # as the first non-whitespace character are comments. They are ignored unless the first token after # is either HELP or TYPE. Those lines are treated as follows: If the token is HELP, at least one more token is expected, which is the metric name. All remaining tokens are considered the docstring for that metric name. HELP lines may contain any sequence of UTF-8 characters (after the metric name), but the backslash and the line feed characters have to be escaped as \\ and \n, respectively. Only one HELP line may exist for any given metric name.

EDIT: My specific issue was fixed indirectly with a wildfly upgrade from 18->20, but YMMV.

donbourne commented 3 years ago

Did some investigation on this.

MP Metric's histogram and timer use summary type in prometheus output format, but are missing the _sum metric.

https://prometheus.io/docs/instrumenting/exposition_formats/#text-based-format says summary includes:

# Finally a summary, which has a complex representation, too:
# HELP rpc_duration_seconds A summary of the RPC duration in seconds.
# TYPE rpc_duration_seconds summary
rpc_duration_seconds{quantile="0.01"} 3102
rpc_duration_seconds{quantile="0.05"} 3272
rpc_duration_seconds{quantile="0.5"} 4773
rpc_duration_seconds{quantile="0.9"} 9001
rpc_duration_seconds{quantile="0.99"} 76656
rpc_duration_seconds_sum 1.7560473e+07            // missing in below representation
rpc_duration_seconds_count 2693

MP Metrics spec (section 4.2.8 / 4.2.9):

# TYPE application_file_sizes_bytes summary
# HELP application_file_sizes_bytes Users file size
application_file_sizes_bytes_count 2037
application_file_sizes_bytes{quantile="0.5"} 4201
application_file_sizes_bytes{quantile="0.75"} 6175
application_file_sizes_bytes{quantile="0.95"} 13560
application_file_sizes_bytes{quantile="0.98"} 29643
application_file_sizes_bytes{quantile="0.99"} 31716
application_file_sizes_bytes{quantile="0.999"} 31716
donbourne commented 3 years ago

Note also that MP Metrics uses only Counter, Gauge and Summary types. Prometheus exposition format has a concept of Histogram that isn't used in MP Metrics exporter (which just uses Summary and Gauge metrics for Histogram output). So given we use a subset of the Prometheus types I don't think we have any compatibility issue (since we don't introduce any types that Prometheus doesn't have).

donbourne commented 3 years ago

@trotman23 , I agree with your point about multiple HELP lines -- that's pretty clear in the text you pasted.

rmannibucau commented 3 years ago

@donbourne what about meters, timers, ...? I'm using https://github.com/eclipse/microprofile-metrics/blob/master/tck/rest/src/main/java/org/eclipse/microprofile/metrics/test/MpMetricTest.java#L510 as a reference. Assume you do a parser of the exporter string, it does not match openmetrics - or in a very surprising manner at least.

jmartisk commented 3 years ago

The thing with duplicate HELP lines was a bug in SmallRye and therefore WildFly, and is now fixed. Apparently it wasn't detected by the TCK, so we might want to improve that. I filed https://github.com/eclipse/microprofile-metrics/issues/616

donbourne commented 3 years ago

I confirmed we've already addressed the HELP line issue in Open Liberty as well.

donbourne commented 3 years ago

@rmannibucau , the MP Metrics export format for openmetrics just relies on counts, gauges and summaries -- so should be interpretable by Prometheus. Is what you're saying that MP Metrics doesn't use the prescribed histogram output for a histogram metric?

rmannibucau commented 3 years ago

@donbourne openmetrics does not define the format used by mp even if each line is compliant so a praser - potentially not prometheus since it is a spec - can fail being 100% compliant. MP should stay in the 100% defined area.

donbourne commented 3 years ago

@rmannibucau , I feel like I'm still missing something. If we're just using gauges, counters and summaries, as defined by OpenMetrics, I feel like we are being 100% compliant (once we fix the missing sum in summary). can you give example of what we're missing?

rmannibucau commented 3 years ago

@donbourne spec says "you will get A, B, C or D, E" and Mp does "you get A, B, C, D, E" which is not what spec says but includes it (the relationship is reversed IMHO, MP extended openmetrics where it should comply to it to be compatible).

Channyboy commented 2 years ago

@rmannibucau

@donbourne what about meters, timers, ...? I'm using https://github.com/eclipse/microprofile-metrics/blob/master/tck/rest/src/main/java/org/eclipse/microprofile/metrics/test/MpMetricTest.java#L510 as a reference. Assume you do a parser of the exporter string, it does not match openmetrics - or in a very surprising manner at least.

That links to JSON output.

spec says "you will get A, B, C or D, E" and Mp does "you get A, B, C, D, E"

The Prometheus Exposition format page does not seem to indicate a limitation on the type of compatible metric types to be used?

Now that the sum value has been fixed, not sure what else is uncompliant?

rmannibucau commented 2 years ago

@Channyboy get your point but it means what I'm complaining about: as a consumer you dont know what you get which is as not having a spec so if we reuse a well known format we should stick to what it enables and define and let the out of scope features stay togglable (off by default) only. So prometheus exporter shouldn't have all the not defined metric types by default at least.

Channyboy commented 2 years ago

@rmannibucau I think that is where the confusion where @donbourne and I have.

We're not sure what you mean by out of scope features (metric types)?

MP uses counter, gauge, summary. Here a the sample outputs from the MP spec ( sample out put of each metric type in the prometheus/open-metrics format): https://github.com/eclipse/microprofile-metrics/blob/master/spec/src/main/asciidoc/rest-endpoints.adoc#openmetrics-format

From https://prometheus.io/docs/instrumenting/exposition_formats/#text-based-format I'm not sure where the MP output would cause a conflict?

rmannibucau commented 2 years ago

Hmm, from master I don't see the issues I hit (and the application which was relying on that dropped mp so can't check easily anymore :s). Let's consider it fixed for now and we'll see later. Maybe it was before openmetrics was fully embrassed (during prometheus -> openmetrics migration?).