OpenLiberty / open-liberty

Open Liberty is a highly composable, fast to start, dynamic application server runtime environment
https://openliberty.io
Eclipse Public License 2.0

BETA BLOG - MicroProfile Metrics 3.0/4.0 for 24.0.0.6-beta #28355

Open tonyreigns opened 1 week ago

tonyreigns commented 1 week ago

The information you provide here will be included in the Open Liberty beta blog post (example), which will be published on openliberty.io/blog/, and potentially elsewhere, to promote this beta feature/function of Open Liberty. For this post to be included in the beta issue please make sure that this is completed by the end of Friday following the GM (Tuesday). The beta and release blogs are created using automation and rely on you following the template's structure. DO NOT REMOVE/ALTER THE <GHA> TAGS THROUGHOUT THIS TEMPLATE.

Please provide the following information:

  1. Which Liberty feature(s) does your update relate to?

    Human-readable name (eg WebSockets feature): MicroProfile Metrics 3.0 MicroProfile Metrics 4.0

    Short feature name (eg websockets-1.0):
    mpMetrics-3.0 mpMetrics-4.0

  2. Who is the target persona? Who do you expect to use the update? eg application developer, operations.
    Application developer and operations (whomever will deploy the application)

  3. Provide a summary of the update, including the following points:

The upcoming MicroProfile Metrics 3.0 and 4.0 feature updates backport changes from MicroProfile Metrics 5.1. They include new MicroProfile Config properties for configuring the statistics that the Histogram and Timer metrics track and output. In prior MicroProfile Metrics 3.0/4.0 releases, Histogram and Timer metrics tracked only the min/max recorded values, the sum of all values, the count of recorded values, and a static set of percentiles: the 50th, 75th, 95th, 98th, 99th, and 99.9th. These values are output to the /metrics endpoint in Prometheus format.

The new properties introduced in MicroProfile Metrics 3.0/4.0 allow users to define a custom set of percentiles as well as a custom set of histogram buckets for the Histogram and Timer metrics. Additional configuration properties enable a default set of histogram buckets, including properties that define an upper and a lower bound for the bucket set.

The properties listed below accept a semicolon-separated list of value definitions that use the following syntax:

<metric name>=<value-1>[,<value-2>…<value-n>]

| Property | Description |
| --- | --- |
| mp.metrics.distribution.percentiles | Defines a custom set of percentiles for matching Histogram and Timer metrics to track and output. Accepts a set of integer and decimal values for a metric name pairing. Can be used to disable percentile output if no value is provided with a metric name pairing. |
| mp.metrics.distribution.histogram.buckets | Defines a custom set of (cumulative) histogram buckets for matching Histogram metrics to track and output. Accepts a set of integer and decimal values for a metric name pairing. |
| mp.metrics.distribution.timer.buckets | Defines a custom set of (cumulative) histogram buckets for matching Timer metrics to track and output. Accepts a set of decimal values with a time unit appended (i.e., ms, s, m, h) for a metric name pairing. |
| mp.metrics.distribution.percentiles-histogram.enabled | Configures any matching Histogram or Timer metric to provide a large set of default histogram buckets to allow for percentile configuration with a monitoring tool. Accepts a true/false value for a metric name pairing. |
| mp.metrics.distribution.histogram.max-value | When percentile-histogram is enabled for a Histogram, this property defines an upper bound for the buckets reported. Accepts a single integer or decimal value for a metric name pairing. |
| mp.metrics.distribution.histogram.min-value | When percentile-histogram is enabled for a Histogram, this property defines a lower bound for the buckets reported. Accepts a single integer or decimal value for a metric name pairing. |
| mp.metrics.distribution.timer.max-value | When percentile-histogram is enabled for a Timer, this property defines an upper bound for the buckets reported. Accepts a single decimal value with a time unit appended (i.e., ms, s, m, h) for a metric name pairing. |
| mp.metrics.distribution.timer.min-value | When percentile-histogram is enabled for a Timer, this property defines a lower bound for the buckets reported. Accepts a single decimal value with a time unit appended (i.e., ms, s, m, h) for a metric name pairing. |

For example, the mp.metrics.distribution.percentiles property can be defined as:

mp.metrics.distribution.percentiles=alpha.timer=0.5,0.7,0.75,0.8;alpha.histogram=0.8,0.85,0.9,0.99;delta.*=

This configures the alpha.timer timer metric to track and output the 50th, 70th, 75th and 80th percentile values. The alpha.histogram histogram metric will output the 80th, 85th, 90th and 99th percentile values. Any Histogram or Timer metric whose name matches delta.* will have its percentiles disabled.
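To make the value syntax concrete, the following is a minimal Java sketch of how such a property value could be split into per-metric percentile lists. It is an illustration only, not the Open Liberty or MicroProfile Metrics implementation: the class `PercentileConfigSketch` and its methods are hypothetical names, and the wildcard handling assumes that a trailing `*` matches any suffix of the metric name.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: hypothetical helper, not the runtime's actual parser.
final class PercentileConfigSketch {

    // Parses "<metric name>=<v1>[,<v2>...]" definitions separated by ';'.
    // A definition with no values (for example "delta.*=") maps to an
    // empty list, which this sketch treats as "percentiles disabled".
    static Map<String, List<Double>> parse(String propertyValue) {
        Map<String, List<Double>> result = new LinkedHashMap<>();
        for (String definition : propertyValue.split(";")) {
            int eq = definition.indexOf('=');
            if (eq < 0) {
                continue; // skip blank or malformed definitions
            }
            String name = definition.substring(0, eq).trim();
            String values = definition.substring(eq + 1).trim();
            List<Double> percentiles = new ArrayList<>();
            if (!values.isEmpty()) {
                for (String value : values.split(",")) {
                    percentiles.add(Double.parseDouble(value.trim()));
                }
            }
            result.put(name, percentiles);
        }
        return result;
    }

    // Assumption: a trailing '*' in the configured name matches any suffix.
    static boolean matches(String configuredName, String metricName) {
        if (configuredName.endsWith("*")) {
            String prefix = configuredName.substring(0, configuredName.length() - 1);
            return metricName.startsWith(prefix);
        }
        return configuredName.equals(metricName);
    }

    public static void main(String[] args) {
        Map<String, List<Double>> parsed = parse(
            "alpha.timer=0.5,0.7,0.75,0.8;alpha.histogram=0.8,0.85,0.9,0.99;delta.*=");
        parsed.forEach((name, values) -> System.out.println(
            name + " -> " + (values.isEmpty() ? "percentiles disabled" : values)));
    }
}
```

The empty list for `delta.*` mirrors the rule that a metric name pairing with no value disables percentile output for matching metrics.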

We'll expand on the above example and define histogram buckets for the alpha.timer timer metric using the mp.metrics.distribution.timer.buckets property.

mp.metrics.distribution.timer.buckets=alpha.timer=100ms,200ms,1s

This configuration tells the metrics runtime to track and output the count of durations that fall within 0-100 ms, 0-200 ms and 0-1 second. The ranges overlap because histogram buckets are cumulative: each bucket counts every recorded value at or below its upper bound.
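The cumulative behavior can be sketched in a few lines of Java. This is an illustration under stated assumptions, not the metrics runtime's code: `TimerBucketSketch`, `toSeconds`, `parseBounds` and `cumulativeCounts` are hypothetical names, and the unit handling covers only the ms, s, m and h suffixes listed for the timer properties.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: hypothetical helper, not the runtime's implementation.
final class TimerBucketSketch {

    // Converts a value such as "100ms", "1s", "2m" or "1h" to seconds.
    static double toSeconds(String value) {
        String v = value.trim();
        if (v.endsWith("ms")) {
            return Double.parseDouble(v.substring(0, v.length() - 2)) / 1000.0;
        }
        if (v.endsWith("s")) {
            return Double.parseDouble(v.substring(0, v.length() - 1));
        }
        if (v.endsWith("m")) {
            return Double.parseDouble(v.substring(0, v.length() - 1)) * 60.0;
        }
        if (v.endsWith("h")) {
            return Double.parseDouble(v.substring(0, v.length() - 1)) * 3600.0;
        }
        throw new IllegalArgumentException("Missing or unknown time unit: " + value);
    }

    // Parses a comma-separated bucket list into upper bounds in seconds.
    static List<Double> parseBounds(String csv) {
        List<Double> bounds = new ArrayList<>();
        for (String part : csv.split(",")) {
            bounds.add(toSeconds(part));
        }
        return bounds;
    }

    // counts[i] is the number of durations <= bounds.get(i).
    // The final entry is the +Inf bucket, which counts every duration.
    static long[] cumulativeCounts(List<Double> bounds, double[] durationsSeconds) {
        long[] counts = new long[bounds.size() + 1];
        for (double duration : durationsSeconds) {
            for (int i = 0; i < bounds.size(); i++) {
                if (duration <= bounds.get(i)) {
                    counts[i]++;
                }
            }
            counts[bounds.size()]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        List<Double> bounds = parseBounds("100ms,200ms,1s");
        long[] counts = cumulativeCounts(bounds, new double[] {1.0, 5.0});
        for (int i = 0; i < bounds.size(); i++) {
            System.out.println("le=" + bounds.get(i) + " count=" + counts[i]);
        }
        System.out.println("le=+Inf count=" + counts[bounds.size()]);
    }
}
```

With two recorded durations of 1 second and 5 seconds, the sketch yields counts of 0, 0, 1 and 2 for the 0.1, 0.2, 1.0 and +Inf buckets.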

The corresponding Prometheus output for the alpha.timer metric at the /metrics REST endpoint will be:

# TYPE application_alpha_timer_mean_seconds gauge
application_alpha_timer_mean_seconds 2.9700022497975187
# TYPE application_alpha_timer_max_seconds gauge
application_alpha_timer_max_seconds 5.0
# TYPE application_alpha_timer_min_seconds gauge
application_alpha_timer_min_seconds 1.0
# TYPE application_alpha_timer_stddev_seconds gauge
application_alpha_timer_stddev_seconds 1.9997750210918204
# TYPE application_alpha_timer_seconds histogram <1>
application_alpha_timer_seconds_bucket{le="0.1"} 0.0 <2>
application_alpha_timer_seconds_bucket{le="0.2"} 0.0 <2>
application_alpha_timer_seconds_bucket{le="1.0"} 1.0 <2>
application_alpha_timer_seconds_bucket{le="+Inf"} 2.0 <2> <3>
application_alpha_timer_seconds_count 2
application_alpha_timer_seconds_sum 6.0
application_alpha_timer_seconds{quantile="0.5"} 1.0
application_alpha_timer_seconds{quantile="0.7"} 5.0
application_alpha_timer_seconds{quantile="0.75"} 5.0
application_alpha_timer_seconds{quantile="0.8"} 5.0
<1> The Prometheus metric type is `histogram`. Both the quantiles/percentiles and the buckets are represented under this type.
<2> The `le` tag represents _less than or equal to_ and applies to the defined buckets, which are converted to seconds.
<3> Prometheus requires a `+Inf` bucket, which counts all hits.

## What happens next?

- Add the label for the beta you're targeting: `target:YY00X-beta`.
- Make sure this blog post is linked back to the Epic for this feature/function.
- Your paragraph will be included in the beta blog post. It might be edited for style and consistency.
- You will be asked to review a draft before publication.
- Once you've approved the code review, close this issue.
- If you would _also_ like to write a standalone blog post about your update (highly recommended), raise an issue on the [Open Liberty blogs repo](https://github.com/OpenLiberty/blogs/issues/new/choose). State in the issue that the blog post relates to a specific release so that we can ensure it is published on an appropriate date (it won't be the same day as the beta blog post).