OpenLiberty / docs

See Open Liberty documentation on https://openliberty.io/docs/

Documentation for both OL23337 (Liberty Metrics to mpTelemetry-2.0) and OL20985 (HTTP stats + metrics) #7466

Closed Channyboy closed 2 months ago

Channyboy commented 3 months ago

OL23337: LG-337: Provide a way to send Liberty metrics to OpenTelemetry

OL20985: LG-334: Add HTTP metrics to monitor-1.0, mpMetrics and OpenTelemetry

Feature epic details

Operating systems

Does the documentation apply to all operating systems?

Summary

Provide a concise summary of your feature. What is the update, why does it matter, and to whom? What do 80% of target users need to know to be most easily productive using your runtime update?

OL23337 (Liberty Metrics to OpenTelemetry) bridges the server stats that monitor-1.0 records as MBeans to the mpTelemetry-2.0 feature. Only stats from the Session, ConnectionPool, RequestTiming, and ThreadPool components are forwarded.

OL20985 (HTTP stats + metrics) is similar to the feature above, but specifically targets HTTP stats and metrics (because this is a new component with new functionality, it was delivered as a separate feature). HTTP server requests are recorded as MBeans and are also bridged as metrics to mpMetrics-5.x and mpTelemetry-2.0. The feature implements the http.server.request.duration metric from the OpenTelemetry HTTP semantic conventions at https://opentelemetry.io/docs/specs/semconv/http/http-metrics/#metric-httpserverrequestduration.

Configuration

List any new or changed properties, parameters, elements, attributes, etc. Include default values and configuration examples where relevant:

No configuration updates are necessary. The two features are auto-features and will provide metrics to mpTelemetry-2.0 (and mpMetrics-5.x) automatically.

Updates to existing topics

To update existing topics, specify a link to the topics that are affected. Include a copy of the current text and the exact text to which it will change. For example: Change ABC to XYZ

Update: Metrics Reference List

link: https://openliberty.io/docs/latest/metrics-list.html

Modify the table under "MicroProfile Metrics 5.0 metrics reference"

Before:

The metrics reference tables list and describe all the metrics that are available for Open Liberty. Use metric data to effectively monitor the status of your microservice systems.

You can obtain metrics from applications, the Open Liberty runtime, and the Java virtual machine (JVM). They can be gathered and stored in database tools, such as [Prometheus](https://prometheus.io/), and displayed on dashboards, such as [Grafana](https://grafana.com/). For more information about building observability into your applications, see [Microservice observability with metrics](https://openliberty.io/docs/latest/microservice-observability-metrics.html). For more information about integrating MicroProfile Metrics 5.0 with Micrometer to send metric data to third-party monitoring systems, see [Choose your own monitoring tools with MicroProfile Metrics](https://openliberty.io/docs/latest/micrometer-metrics.html).

After:

The metrics reference tables list and describe all the metrics that are available for Open Liberty with MicroProfile Metrics and MicroProfile Telemetry (starting with mpTelemetry-2.0). Use metric data to effectively monitor the status of your microservice systems.

You can obtain metrics from applications, the Open Liberty runtime, and the Java virtual machine (JVM). They can be gathered and stored in database tools, such as [Prometheus](https://prometheus.io/), and displayed on dashboards, such as [Grafana](https://grafana.com/). For more information about building observability into your applications with MicroProfile Metrics, see [Microservice observability with metrics](https://openliberty.io/docs/latest/microservice-observability-metrics.html). For more information about integrating MicroProfile Metrics 5.0 with Micrometer to send metric data to third-party monitoring systems, see [Choose your own monitoring tools with MicroProfile Metrics](https://openliberty.io/docs/latest/micrometer-metrics.html).

^Not sure if @yasmin-aumeeruddy is adding a page about Telemetry and metrics. If so, a link to it can be added at the end.

Before: Base and Vendor metrics <--- (sub) title

After: MicroProfile Metrics' Base and Vendor metrics

Add table entry to MP 5.0 table:

| MicroProfile Metrics 5.0 name | MicroProfile Metrics 5.0 Prometheus name(s) | Type and description | Monitoring component | Features required | Version introduced |
| -- | -- | -- | -- | -- | -- |
| http.server.request.duration | http_server_request_duration{error_type="",http_request_method="",http_response_status_code="",http_route="",mp_scope="vendor",network_protocol_version="",server_address="",server_port="",url_scheme=""} | Duration of HTTP server requests. This metric is a Timer. / (seconds) | HTTP | MicroProfile Metrics | MicroProfile Metrics 5.0 |

Add NEW table section for MicroProfile Telemetry 2.0 metrics.

Note: OpenTelemetry just sends metric data to compatible OTLP receivers. Prometheus-formatted metrics can be reported by the OpenTelemetry Collector, but I will replace that column with the attributes (i.e. tags/labels) used. (I am also not sure how to format a list in the table, so I will list the attributes with commas.)

NOTE: Perhaps we can have two tables: one for HTTP and JVM, and one for the rest. This is because the HTTP and JVM metrics follow the OpenTelemetry semantic conventions listed for HTTP and JVM. The others are ones we created for Open Liberty.

| MicroProfile Telemetry 2.0 name | Attributes | Type and description | Monitoring component | Features required | Version introduced |
| -- | -- | -- | -- | -- | -- |
| http.server.request.duration | http.request.method, url.scheme, error.type (conditionally required), http.response.status_code, http.route, network.protocol.version, server.address, server.port | Duration of HTTP server requests. This metric is a Histogram. / (seconds). This histogram has the following explicit bucket boundaries [ 0.005, 0.01, 0.025, 0.05, 0.075, 0.1, 0.25, 0.5, 0.75, 1, 2.5, 5, 7.5, 10 ] | HTTP | MicroProfile Telemetry | MicroProfile Telemetry 2.0 |
| io.openliberty.connection_pool.handle.count | io.openliberty.datasource.name="" | The number of connections that are in use. This number might include multiple connections that are shared from a single managed connection. This metric is an ObservableLongUpDownCounter / ({connection_handle}) | ConnectionPool | MicroProfile Telemetry | MicroProfile Telemetry 2.0 |
| io.openliberty.connection_pool.connection.created | io.openliberty.datasource.name="" | The total number of managed connections that were created since the pool creation. This metric is an ObservableLongCounter / ({connection_handle}) | ConnectionPool | MicroProfile Telemetry | MicroProfile Telemetry 2.0 |
| io.openliberty.connection_pool.connection.destroyed | io.openliberty.datasource.name="" | The total number of managed connections that were destroyed since the pool creation. This metric is an ObservableLongCounter / ({connection_handle}) | ConnectionPool | MicroProfile Telemetry | MicroProfile Telemetry 2.0 |
| io.openliberty.connection_pool.connection.free | io.openliberty.datasource.name="" | The number of managed connections that are available. This metric is an ObservableLongUpDownCounter / ({connection_handle}) | ConnectionPool | MicroProfile Telemetry | MicroProfile Telemetry 2.0 |
| io.openliberty.connection_pool.connection.use_time | io.openliberty.datasource.name="" | The amount of time that connections were in use. This metric is a DoubleHistogram / (seconds). This histogram has the following explicit bucket boundaries [ 0.001, 0.005, 0.01, 0.05, 0.1, 0.5, 1, 5, 10 ] | ConnectionPool | MicroProfile Telemetry | MicroProfile Telemetry 2.0 |
| io.openliberty.connection_pool.connection.count | io.openliberty.datasource.name="" | The current sum of managed connections in the pool. This includes managed connections that are available as well as those that are in use. A single managed connection that is shared by multiple connections only counts once. This metric is an ObservableLongUpDownCounter / ({connection_handle}) | ConnectionPool | MicroProfile Telemetry | MicroProfile Telemetry 2.0 |
| io.openliberty.connection_pool.connection.wait_time | io.openliberty.datasource.name="" | The amount of time that connection requests waited for a connection. This metric is a DoubleHistogram / (seconds). This histogram has the following explicit bucket boundaries [ 0.001, 0.005, 0.01, 0.05, 0.1, 0.5, 1, 5, 10 ] | ConnectionPool | MicroProfile Telemetry | MicroProfile Telemetry 2.0 |
| io.openliberty.request_timing.active | n/a | The number of servlet requests that are currently running. This metric is an ObservableLongUpDownCounter / ({request}) | RequestTiming | MicroProfile Telemetry | MicroProfile Telemetry 2.0 |
| io.openliberty.request_timing.hung | n/a | The number of servlet requests that are currently hung. This metric is an ObservableLongUpDownCounter / ({request}) | RequestTiming | MicroProfile Telemetry | MicroProfile Telemetry 2.0 |
| io.openliberty.request_timing.processed | n/a | The number of servlet requests since the server started. This metric is an ObservableLongCounter / ({request}) | RequestTiming | MicroProfile Telemetry | MicroProfile Telemetry 2.0 |
| io.openliberty.request_timing.slow | n/a | The number of servlet requests that are currently running but are slow. This metric is an ObservableLongUpDownCounter / ({request}) | RequestTiming | MicroProfile Telemetry | MicroProfile Telemetry 2.0 |
| io.openliberty.session.active | io.openliberty.app.name="" | The number of concurrently active sessions. A session is considered active if the application server is processing a request that uses that user session. This metric is an ObservableLongUpDownCounter / ({session}) | Session | MicroProfile Telemetry | MicroProfile Telemetry 2.0 |
| io.openliberty.session.created | io.openliberty.app.name="" | The number of sessions logged in since this metric was enabled. This metric is an ObservableLongCounter / ({session}) | Session | MicroProfile Telemetry | MicroProfile Telemetry 2.0 |
| io.openliberty.session.invalidated | io.openliberty.app.name="" | The number of sessions logged out since this metric was enabled. This metric is an ObservableLongCounter / ({session}) | Session | MicroProfile Telemetry | MicroProfile Telemetry 2.0 |
| io.openliberty.session.invalidated_by_timeout | io.openliberty.app.name="" | The number of sessions logged out because of a timeout since this metric was enabled. This metric is an ObservableLongCounter / ({session}) | Session | MicroProfile Telemetry | MicroProfile Telemetry 2.0 |
| io.openliberty.session.live | io.openliberty.app.name="" | The number of users that are currently logged in. This metric is an ObservableLongUpDownCounter / ({session}) | Session | MicroProfile Telemetry | MicroProfile Telemetry 2.0 |
| io.openliberty.threadpool.active_threads | io.openliberty.threadpool.name="" | The number of threads that are actively running tasks. This metric is an ObservableLongUpDownCounter / ({thread}) | ThreadPool | MicroProfile Telemetry | [MicroProfile Telemetry 2.0](https://openliberty.io/docs/latest/reference/feature/mpMetrics-2.0.html) |
| io.openliberty.threadpool.size | io.openliberty.threadpool.name="" | The size of the thread pool. This metric is an ObservableLongUpDownCounter / ({thread}) | ThreadPool | MicroProfile Telemetry | MicroProfile Telemetry 2.0 |
| jvm.memory.used | jvm.memory.pool.name="", jvm.memory.type="" | Measure of memory used. This metric is an UpDownCounter. / (bytes) | n/a | MicroProfile Telemetry | MicroProfile Telemetry 2.0 |
| jvm.memory.committed | jvm.memory.pool.name="", jvm.memory.type="" | Measure of memory committed. This metric is an UpDownCounter. / (bytes) | n/a | MicroProfile Telemetry | MicroProfile Telemetry 2.0 |
| jvm.memory.limit | jvm.memory.pool.name="", jvm.memory.type="" | Measure of max obtainable memory. This metric is an UpDownCounter. / (bytes) | n/a | MicroProfile Telemetry | MicroProfile Telemetry 2.0 |
| jvm.memory.used_after_last_gc | jvm.memory.pool.name="", jvm.memory.type="" | Measure of memory used, as measured after the most recent garbage collection event on this pool. This metric is an UpDownCounter. / (bytes) | n/a | MicroProfile Telemetry | MicroProfile Telemetry 2.0 |
| jvm.gc.duration | jvm.gc.action="", jvm.gc.name="" | Duration of JVM garbage collection actions. This metric is a Histogram. / (seconds). This histogram has the following explicit bucket boundaries [ 0.01, 0.1, 1, 10 ] | n/a | MicroProfile Telemetry | MicroProfile Telemetry 2.0 |
| jvm.thread.count | jvm.thread.daemon="", jvm.thread.state="" | Number of executing platform threads. This metric is an UpDownCounter. / ({thread}) | n/a | MicroProfile Telemetry | MicroProfile Telemetry 2.0 |
| jvm.class.loaded | n/a | Number of classes loaded since JVM start. This metric is a Counter. / ({class}) | n/a | MicroProfile Telemetry | MicroProfile Telemetry 2.0 |
| jvm.class.unloaded | n/a | Number of classes unloaded since JVM start. This metric is a Counter. / ({class}) | n/a | MicroProfile Telemetry | MicroProfile Telemetry 2.0 |
| jvm.class.count | n/a | Number of classes currently loaded. This metric is an UpDownCounter. / ({class}) | n/a | MicroProfile Telemetry | MicroProfile Telemetry 2.0 |
| jvm.cpu.time | n/a | CPU time used by the process as reported by the JVM. This metric is a Counter. / (seconds) | n/a | MicroProfile Telemetry | MicroProfile Telemetry 2.0 |
| jvm.cpu.count | n/a | Number of processors available to the Java virtual machine. This metric is an UpDownCounter. / ({cpu}) | n/a | MicroProfile Telemetry | MicroProfile Telemetry 2.0 |
| jvm.cpu.recent_utilization | n/a | Recent CPU utilization for the process as reported by the JVM. This metric is a Gauge. | n/a | MicroProfile Telemetry | MicroProfile Telemetry 2.0 |
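
For readers who are unsure what "explicit bucket boundaries" means for the histogram metrics in the table above, here is a minimal Java sketch of how a recorded duration maps to a bucket. It assumes the usual OpenTelemetry convention that each boundary is an inclusive upper bound and that one extra overflow bucket catches everything above the last boundary. The class and method names (`HistogramBuckets`, `bucketIndex`) are made up for illustration and are not part of Open Liberty or the OpenTelemetry API.

```java
final class HistogramBuckets {

    // Boundaries (in seconds) copied from the http.server.request.duration row.
    static final double[] HTTP_BOUNDARIES = {
        0.005, 0.01, 0.025, 0.05, 0.075, 0.1, 0.25, 0.5, 0.75, 1, 2.5, 5, 7.5, 10
    };

    // Returns the index of the bucket that a value falls into.
    // Bucket i covers (boundaries[i-1], boundaries[i]]; the last index
    // (boundaries.length) is the overflow bucket above the final boundary.
    static int bucketIndex(double[] boundaries, double value) {
        for (int i = 0; i < boundaries.length; i++) {
            if (value <= boundaries[i]) {
                return i;
            }
        }
        return boundaries.length;
    }

    public static void main(String[] args) {
        // Hypothetical request durations, in seconds.
        double[] samples = {0.003, 0.2, 12.0};
        for (double sample : samples) {
            System.out.println(sample + " s -> bucket " + bucketIndex(HTTP_BOUNDARIES, sample));
        }
    }
}
```

With 14 boundaries there are 15 buckets, so a request that takes longer than 10 seconds still gets counted, just without finer resolution.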

Update: JMX metrics Reference list

Add section + table

HTTP server requests metric: HttpServerStats MXBean

You can use the HttpServerStats MXBean to monitor HTTP requests made to the Open Liberty server. Performance data is available for each HTTP request made to the server, and each unique combination of request method, response status, and HTTP route has its own MXBean.

The following attributes are available for the HttpServerStats MXBean. The object name of the MXBean for these attributes is WebSphere:type=HttpServerStats,name=*:

| MXBean attribute | Units | Description |
| -- | -- | -- |
| Duration | nanoseconds | The cumulative duration in nanoseconds of the requests made to this combination of request method, response status and HTTP route. |
| Count | n/a | The cumulative count of requests made to this combination of request method, response status and HTTP route. |
| RequestMethod | n/a | The request method used for the request. |
| ResponseStatus | n/a | The response status of the request. |
| HttpRoute | n/a | The HTTP route of the request. |
| Scheme | n/a | The URL scheme used for the request. |
| NetworkProtocolName | n/a | The network protocol name used for the request. |
| NetworkProtocolVersion | n/a | The network protocol version used for the request. |
| ServerName | n/a | The server name the request was made to. |
| ServerPort | n/a | The server port the request was made to. |
| ErrorType | n/a | Error encountered if it exists. |
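
As a rough illustration of how these attributes could be consumed, the following Java sketch queries the platform MBean server for the `WebSphere:type=HttpServerStats,name=*` pattern and derives a mean request duration from the cumulative `Duration` and `Count` attributes. It uses only standard `javax.management` calls. It assumes the code runs in the same JVM as the server (a remote JMX connection would be needed otherwise) and that `Duration` and `Count` are numeric; the class name `HttpServerStatsReader` and the helper `meanDurationSeconds` are hypothetical.

```java
import java.lang.management.ManagementFactory;
import java.util.Set;
import javax.management.MBeanServer;
import javax.management.ObjectName;

final class HttpServerStatsReader {

    // Mean duration in seconds, derived from cumulative nanoseconds and request count.
    static double meanDurationSeconds(long durationNanos, long count) {
        if (count <= 0) {
            return 0.0;
        }
        return (durationNanos / 1_000_000_000.0) / count;
    }

    public static void main(String[] args) throws Exception {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        ObjectName pattern = new ObjectName("WebSphere:type=HttpServerStats,name=*");

        // One MXBean is expected per combination of method, status and route.
        Set<ObjectName> names = server.queryNames(pattern, null);
        for (ObjectName name : names) {
            // Assumption: Duration and Count are exposed as numeric attributes.
            long duration = ((Number) server.getAttribute(name, "Duration")).longValue();
            long count = ((Number) server.getAttribute(name, "Count")).longValue();
            Object method = server.getAttribute(name, "RequestMethod");
            Object status = server.getAttribute(name, "ResponseStatus");
            Object route = server.getAttribute(name, "HttpRoute");
            System.out.printf("%s %s -> %s: %d requests, mean %.6f s%n",
                    method, route, status, count, meanDurationSeconds(duration, count));
        }
        if (names.isEmpty()) {
            System.out.println("No HttpServerStats MXBeans are registered in this JVM.");
        }
    }
}
```

Because both attributes are cumulative, a monitoring tool that wants a rate or a recent average would need to sample twice and work with the difference between the two readings.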

Update: OpenTelemetry configuration

Similar to the "OpenTelemetry Configuration" from https://github.com/OpenLiberty/docs/issues/7459

Need to update: https://openliberty.io/docs/latest/microprofile-config-properties.html#telemetry

To collect and export runtime-level metrics, enable OpenTelemetry using system properties or environment variables:

otel.sdk.disabled=false / OTEL_SDK_DISABLED=false

If you would like to separately configure multiple applications in a server, you can configure OpenTelemetry with application configuration. Note that you will not collect runtime-level metrics this way.

By default, all OpenTelemetry data is exported to OTLP. You can change each exporter with the following properties:

otel.metrics.exporter / OTEL_METRICS_EXPORTER

The metric data is exported at a fixed interval (60 seconds by default). Use the following MP Config property or environment variable to change the export interval. The value is in milliseconds. See the OpenTelemetry documentation at [Periodic exporting MetricReader](https://opentelemetry.io/docs/specs/otel/configuration/sdk-environment-variables/#periodic-exporting-metricreader).

otel.metric.export.interval / OTEL_METRIC_EXPORT_INTERVAL
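
To make the property/environment-variable pairing and the millisecond unit concrete, here is a small Java sketch that resolves an effective export interval. It is not how Open Liberty or the OpenTelemetry SDK reads its configuration; it only models the lookup. The precedence shown (system property first, then environment variable, then the 60-second default) is an assumption based on the usual MicroProfile Config ordering, and the class name `ExportIntervalConfig` is hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

final class ExportIntervalConfig {

    // Default export interval from the text above: 60 seconds, in milliseconds.
    static final long DEFAULT_INTERVAL_MS = 60_000L;

    // Assumed precedence: system property, then environment variable, then default.
    static long resolveIntervalMillis(Map<String, String> systemProps, Map<String, String> env) {
        String raw = systemProps.get("otel.metric.export.interval");
        if (raw == null) {
            raw = env.get("OTEL_METRIC_EXPORT_INTERVAL");
        }
        if (raw == null || raw.isBlank()) {
            return DEFAULT_INTERVAL_MS;
        }
        return Long.parseLong(raw.trim());
    }

    public static void main(String[] args) {
        // Copy the JVM system properties into a plain map for the lookup.
        Map<String, String> props = new HashMap<>();
        for (String key : System.getProperties().stringPropertyNames()) {
            props.put(key, System.getProperty(key));
        }
        long intervalMs = resolveIntervalMillis(props, System.getenv());
        System.out.println("Effective export interval: " + intervalMs + " ms");
    }
}
```

For example, starting the JVM with `-Dotel.metric.export.interval=15000` would make this sketch report 15000 ms, that is, an export every 15 seconds.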

Create a new topic

To create a topic, specify a first draft of the topic that you want added and the section in the navigation where the topic should go.

dmuelle commented 3 months ago

Hi @Channyboy - I've updated the following topics, which are now available for review:

If any further updates are needed, just let me know. When you're satisfied with the drafts, you can add the technical reviewed label to this issue to sign off. Thanks!

Channyboy commented 3 months ago

@dmuelle https://docs-draft-openlibertyio.mqj6zf7jocq.us-south.codeengine.appdomain.cloud/docs/latest/metrics-list.html#telem-table The Metrics Reference list:

dmuelle commented 3 months ago

Thanks for reviewing @Channyboy - all corrections made. Let me know if anything further is needed. If not, you can sign off by adding the Technical reviewed label to this issue.

Channyboy commented 3 months ago

@dmuelle One more thing for https://docs-draft-openlibertyio.mqj6zf7jocq.us-south.codeengine.appdomain.cloud/docs/latest/metrics-list.html: the <monitor filter="ConnectionPool,ThreadPool,RequestTiming,Session,WebContainer,REST,GrpcClient,GrpcServer"/> example should now include HTTP in the list.

Brought up by discussion in https://github.com/OpenLiberty/docs/issues/7459