open-telemetry / opentelemetry-java-instrumentation

OpenTelemetry auto-instrumentation and instrumentation libraries for Java
https://opentelemetry.io
Apache License 2.0

Add ability to capture non-numeric MBean attributes as metrics #12229

Closed: SylvainJuge closed this issue 3 weeks ago

SylvainJuge commented 1 month ago

Currently, the JMX Insights implementation can only capture numeric MBean attributes as metrics.

However, it is common to find MBean attributes that expose internal state as a string, boolean or even enum. For example, the Tomcat Catalina:type=Connector,port=* MBean has a stateName attribute (see code) whose string values come from the LifecycleState enum (code).

The JMX Gatherer in contrib provides a way to create metrics from non-numeric MBean attributes, but replicating this with YAML seems tricky.

For string attributes, we can already capture them as a metric attribute of another metric, but it has two caveats:

The Prometheus exporter has this ability, and I think it would be relevant to add it here as well. Looking at the implementation in this PR for enums, we can see that it uses a "filter" on the attribute value, for example in the Prometheus configuration:

- pattern: 'kafka.streams<type=stream-metrics,.*client-id=(.+)><>state: RUNNING'
  name: kafka_streams_state_running
  help: Kafka stream client is in state RUNNING
  type: GAUGE
  value: 1
  labels:
    clientid: "$1"
- pattern: 'kafka.streams<type=stream-metrics,.*client-id=(.+)><>state: REBALANCING'
  name: kafka_streams_state_rebalancing
  help: Kafka stream client is in state REBALANCING
  type: GAUGE
  value: 1
  labels:
    clientid: "$1"

With this configuration:

I think there are a few solutions to implement a similar feature with YAML configuration, which in the case of Tomcat could be something like this for http.server.tomcat.connector metric:

  - bean: Catalina:type=Connector,port=*
    unit: "1"
    prefix: http.server.tomcat.
    metricAttribute:
      port: param(port)
    mapping:
      stateName:
        metric: connector
        value: 1
        type: gauge
        desc: The number of connectors in tomcat instance
        metricAttribute:
          state: beanattrmap(STARTED:started,STOPPED:stopped)

Here, value: 1 indicates that the metric value is a constant 1, and beanattrmap translates the original attribute values. The beanattrmap function would support mapping string, boolean and enum values through their string representation.

In short, the metric value will be constant, but the attributes of the metric will change over time.

As a simpler version of this, we could also capture the attribute values as-is with state: beanattr(state), in which case the original upper-case values would be preserved.
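As a rough illustration of the beanattrmap semantics proposed above, here is a minimal Java sketch. The class and method names are hypothetical and are not part of the existing instrumentation. Returning an empty result for unmapped values is an assumption, since the proposal does not say what should happen in that case.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical sketch of the proposed beanattrmap(...) semantics; not an existing API. */
final class BeanAttrMapSketch {

  /** Parses a spec such as "STARTED:started,STOPPED:stopped" into a lookup table. */
  static Map<String, String> parse(String spec) {
    Map<String, String> mapping = new LinkedHashMap<>();
    for (String pair : spec.split(",")) {
      String[] kv = pair.trim().split(":", 2);
      if (kv.length != 2) {
        throw new IllegalArgumentException("expected 'FROM:to' but got: " + pair);
      }
      mapping.put(kv[0].trim(), kv[1].trim());
    }
    return mapping;
  }

  /**
   * Maps a raw MBean attribute value (String, Boolean or Enum) through its string
   * representation. Null or unmapped values yield an empty result (assumption).
   */
  static Optional<String> map(Map<String, String> mapping, Object rawValue) {
    if (rawValue == null) {
      return Optional.empty();
    }
    return Optional.ofNullable(mapping.get(String.valueOf(rawValue)));
  }
}
```

With the mapping from the YAML above, a raw value of STARTED maps to the metric attribute value started, while a value that is not listed (for example FAILED) produces no attribute value in this sketch.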

As another alternative, we could map using distinct metrics to make it closer to the Prometheus solution:

  - bean: Catalina:type=Connector,port=*
    unit: "1"
    prefix: http.server.tomcat.
    metricAttribute:
      port: param(port)
    mapping:
      stateName:
        metric: connector.started
        value: valuematch(STARTED,1,0)
        type: gauge
        desc: The number of connectors in tomcat instance
        metricAttribute:
          state: const(started)
      stateName:
        metric: connector.stopped
        value: valuematch(STOPPED,1,0)
        type: gauge
        desc: The number of connectors in tomcat instance
        metricAttribute:
          state: const(stopped)

Here the valuematch(XXX,1,0) function would return 1 when the attribute value matches XXX and 0 when it does not.

Also, YAML does not allow duplicate keys, so the stateName mapping cannot be repeated under the same bean as shown above.
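The valuematch(XXX,1,0) behavior described above can be sketched in Java as follows. This is only an illustration under assumptions: the class and method names are hypothetical, and comparing against the string representation of the attribute value is assumed to match how the other proposal treats strings, booleans and enums.

```java
/** Hypothetical sketch of the proposed valuematch(expected, ifMatch, ifNoMatch) function. */
final class ValueMatchSketch {

  /**
   * Returns ifMatch when the string representation of the raw attribute value
   * equals the expected value, and ifNoMatch otherwise (including for null).
   */
  static long valueMatch(Object rawValue, String expected, long ifMatch, long ifNoMatch) {
    if (rawValue == null) {
      return ifNoMatch;
    }
    return expected.equals(String.valueOf(rawValue)) ? ifMatch : ifNoMatch;
  }
}
```

For a connector whose stateName is STARTED, valueMatch(stateName, "STARTED", 1, 0) gives 1 for the connector.started metric and valueMatch(stateName, "STOPPED", 1, 0) gives 0 for connector.stopped.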


I have the following questions that I'd like to have feedback on:

SylvainJuge commented 1 month ago

One downside of sending a constant metric value of 1 is that, on the consumer side, we always have to query for the metric without the "state" attribute and then check the value of the state attribute to see whether it changed. Since the metric name and attributes together define the metric identity, this is not optimal.

After reflecting a bit more on this issue, I think it would be possible to support "state metrics", implemented as follows, using Tomcat as an example:

JMX object name: Catalina:type=Connector,port=*, JMX attribute name stateName

We can capture this as the tomcat.connector.count metric:

For example, with a single connector on port 8080 and two values for stateName, STARTED and STOPPED, the following metrics are reported:

The list of values for stateName must be known in advance, otherwise we can't generate the metric breakdown for values that haven't been seen yet.

To normalize the values read from JMX, which depend on the underlying implementation, we need a way to express a 1:1 mapping for each value, for example:

Of course, doing that with YAML syntax could be a challenge, but it does not sound impossible.
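To make the "state metric" idea more concrete, here is a minimal Java sketch. It assumes that a state metric reports one data point per known state, with value 1 for the current state and 0 for all others; that reading is an assumption and not something this thread confirms. The class and method names are hypothetical, and the known-states map stands in for the 1:1 mapping of raw JMX values to normalized values.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch of a "state metric" breakdown; not the actual implementation. */
final class StateMetricSketch {

  /**
   * Builds one data point per known state, keyed by the normalized state name.
   * The value is 1 for the state matching the raw attribute value and 0 otherwise.
   * knownStates maps raw JMX values to normalized values and is assumed to be 1:1.
   */
  static Map<String, Long> dataPoints(Map<String, String> knownStates, Object rawValue) {
    String current = String.valueOf(rawValue);
    Map<String, Long> points = new LinkedHashMap<>();
    for (Map.Entry<String, String> entry : knownStates.entrySet()) {
      points.put(entry.getValue(), entry.getKey().equals(current) ? 1L : 0L);
    }
    return points;
  }
}
```

Because every known state always gets a data point, the set of time series stays stable even before a given state has been observed, which is why the list of values has to be known in advance. A raw value outside the known list results in all data points being 0 in this sketch.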

trask commented 1 month ago

related: https://github.com/open-telemetry/semantic-conventions/blob/main/docs/hardware/common.md#metric-hwstatus

SylvainJuge commented 1 month ago

I just opened a first draft implementation PR for this: https://github.com/open-telemetry/opentelemetry-java-instrumentation/pull/12369. As usual, feedback is welcome.