Open SylvainJuge opened 2 months ago
Ping @robsunday I can't yet co-assign you as you are not part of the otel contributors group.
For Tomcat, the mapping is not the same but almost equivalent: there isn't anything we need to add for 1:1 support beyond aligning the metrics themselves.
Side note: using JMX object names and attributes is a convenient way to identify elements, as it's a common part between the two mappings.
- `Catalina:type=Manager,host=localhost,context=*` or `Tomcat:type=GlobalRequestProcessor,name=*`
  - `activeSessions`: `tomcat.sessions` (no attribute) <==> `http.server.tomcat.sessions.activeSessions` with `context` attribute
- `Catalina:type=GlobalRequestProcessor,name=*` or `Catalina:type=GlobalRequestProcessor,name=*`
  - `name` => `proto_handler`, JMX Insight: `name` => `name`
  - `errorCount`: `tomcat.errors` with `proto_handler` attribute <==> `http.server.tomcat.errorCount` with `name` attribute
  - `requestCount`: `tomcat.request_count` with `proto_handler` attribute <==> `http.server.tomcat.requestCount` with `name` attribute
  - `maxTime`: `tomcat.max_time` with `proto_handler` attribute <==> `http.server.tomcat.maxTime` with `name` attribute
  - `processingTime`: `tomcat.processing_time` with `proto_handler` attribute <==> `http.server.tomcat.processingTime` with `name` attribute
  - `bytesReceived`: `tomcat.traffic` with `proto_handler` and `direction` = `received|sent` <==> `http.server.tomcat.traffic` with `name`, `direction` identical
- `Catalina:type=ThreadPool,name=*` or `Tomcat:type=ThreadPool,name=*`
  - `name` => `proto_handler`, JMX Insight: `name` => `name`
  - `currentThreadCount`: `tomcat.threads` with `state` = `idle` <==> `http.server.tomcat.threads` with `name`, `state` identical (`state=idle` reports the total number of threads, which is a bug mentioned here and here)
  - `currentThreadsBusy`: `tomcat.threads` with `state` = `busy` <==> `http.server.tomcat.threads` with `name` and `state` identical

Given the mapping differences, I think we probably need to leave it as-is for now.
I'll look at Jetty next.
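
To make the attribute difference concrete, here is a rough sketch of a YAML rule that would report the Gatherer-style name `tomcat.threads` with a `proto_handler` attribute. It assumes the JMX Insight rule syntax (`beans`, `metricAttribute`, `param(...)`, `const(...)`, `mapping`); the `type`, `unit` and `desc` values are placeholders and are not copied from either implementation.

```yaml
rules:
  # One definition applied to both domains listed in the mapping above
  - beans:
      - Catalina:type=ThreadPool,name=*
      - Tomcat:type=ThreadPool,name=*
    metricAttribute:
      # Gatherer naming: the "name" key of the ObjectName becomes "proto_handler"
      proto_handler: param(name)
    mapping:
      currentThreadsBusy:
        metric: tomcat.threads
        type: updowncounter            # placeholder
        unit: "{thread}"               # placeholder
        desc: Number of busy threads   # placeholder wording
        metricAttribute:
          state: const(busy)
```

Replacing `proto_handler` with `name` and the metric with `http.server.tomcat.threads` would give the JMX Insight flavor of the same rule, which is why the two mappings can be considered equivalent apart from naming.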
For Wildfly, the mapping is also not the same but equivalent: there isn't anything we need to add for 1:1 support beyond aligning the metrics themselves.
- `jboss.as:deployment=*,subsystem=undertow`
  - `deployment` => `deployment` attribute
  - `sessionsCreated`: `wildfly.session.count` <==> `wildfly.session.sessionsCreated`
  - `activeSessions`: `wildfly.session.active` <==> `wildfly.session.activeSessions`
  - `expiredSessions`: `wildfly.session.expired` <==> `wildfly.session.expiredSessions`
  - `rejectedSessions`: `wildfly.session.rejected` <==> `wildfly.session.rejectedSessions`
- `jboss.as:subsystem=undertow,server=*,http-listener=*`
  - `server` => `server` attribute and `http-listener` => value of `listener`
  - `requestCount`: `wildfly.request.count` <==> `wildfly.request.requestCount`
  - `processingTime`: `wildfly.request.time` <==> `wildfly.request.processingTime`
  - `errorCount`: `wildfly.request.server_error` <==> `wildfly.request.errorCount`
  - `bytesSent`: `wildfly.network.io` with extra `state` = `out` attribute <==> same
  - `bytesReceived`: `wildfly.network.io` with extra `state` = `in` attribute <==> same
- `jboss.as:subsystem=datasources,data-source=*,statistics=pool`
  - `data-source` => value of `data_source`
  - `ActiveCount`: `wildfly.jdbc.connection.open` with `state` = `active` <==> `wildfly.db.client.connections.usage` with `state` = `used`
  - `IdleCount`: `wildfly.jdbc.connection.open` with `state` = `idle` <==> `wildfly.db.client.connections.usage` with `state` = `idle`
  - `WaitCount`: `wildfly.jdbc.request.wait` <==> `wildfly.db.client.connections.WaitCount`
- `jboss.as:subsystem=transactions`
  - `numberOfTransactions`: `wildfly.jdbc.transaction.count` <==> `wildfly.db.client.transaction.NumberOfTransactions`
  - `numberOfSystemRollbacks`: `wildfly.jdbc.rollback.count` with `cause` = `system` <==> `wildfly.db.client.rollback.count` with `cause` = `system`
  - `numberOfResourceRollbacks`: `wildfly.jdbc.rollback.count` with `cause` = `resource` <==> `wildfly.db.client.rollback.count` with `cause` = `resource`
  - `numberOfApplicationRollbacks`: `wildfly.jdbc.rollback.count` with `cause` = `application` <==> `wildfly.db.client.rollback.count` with `cause` = `application`
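
As an illustration of the "extra `state` attribute" entries above, the two `wildfly.network.io` mappings could be sketched roughly like this. The sketch assumes the JMX Insight rule syntax, including a per-metric `metricAttribute` with `const(...)`; `type`, `unit` and `desc` are placeholders rather than values taken from the existing definitions.

```yaml
rules:
  - bean: jboss.as:subsystem=undertow,server=*,http-listener=*
    metricAttribute:
      server: param(server)
      listener: param(http-listener)
    mapping:
      bytesSent:
        metric: wildfly.network.io
        type: counter              # placeholder
        unit: By                   # placeholder
        desc: Bytes transferred    # placeholder wording
        metricAttribute:
          state: const(out)
      bytesReceived:
        metric: wildfly.network.io
        type: counter              # placeholder
        unit: By                   # placeholder
        desc: Bytes transferred    # placeholder wording
        metricAttribute:
          state: const(in)
```

Both MBean attributes feed the same metric name and are told apart only by the constant `state` value, which matches what the Groovy script reports.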
For JVM metrics, JMX Insight does not provide a YAML file; the feature is implemented in the `runtime-metrics` module of instrumentation (link). The current definition is aligned with semantic conventions for JVM metrics.
JMX Gatherer provides the following metrics that are not aligned with semconv; all of them can easily be captured with the YAML configuration:
- `java.lang:type=ClassLoading`:
  - `LoadedClassCount`: `jvm.classes.loaded`
- `java.lang:type=GarbageCollector,*`:
  - `CollectionCount`: `jvm.gc.collections.count` with `name` => `name`
  - `CollectionTime`: `jvm.gc.collections.elapsed` with `name` => `name`
- `java.lang:type=Memory`
  - `HeapMemoryUsage`: `jvm.memory.heap`
  - `NonHeapMemoryUsage`: `jvm.memory.nonheap`
- `java.lang:type=MemoryPool,*`
  - `Usage`: `jvm.memory.pool` with `name` => `name`
- `java.lang:type=Threading`:
  - `ThreadCount`: `jvm.threads.count`
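
A minimal sketch of what capturing a few of these with YAML could look like, assuming the JMX Insight rule syntax. The metric names come from the list above; `type` and `unit` values are placeholders (except that `CollectionTime` is reported by the JVM in milliseconds), and the composite `HeapMemoryUsage`, `NonHeapMemoryUsage` and `Usage` attributes are left out on purpose.

```yaml
rules:
  - bean: java.lang:type=ClassLoading
    mapping:
      LoadedClassCount:
        metric: jvm.classes.loaded
        type: updowncounter   # placeholder
        unit: "{class}"       # placeholder

  - bean: java.lang:type=GarbageCollector,*
    metricAttribute:
      # "name" key of the matched ObjectName => "name" metric attribute
      name: param(name)
    mapping:
      CollectionCount:
        metric: jvm.gc.collections.count
        type: counter         # placeholder
        unit: "{collection}"  # placeholder
      CollectionTime:
        metric: jvm.gc.collections.elapsed
        type: counter         # placeholder
        unit: ms

  - bean: java.lang:type=Threading
    mapping:
      ThreadCount:
        metric: jvm.threads.count
        type: updowncounter   # placeholder
        unit: "{thread}"      # placeholder
```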
As a side note, after reviewing differences for `jvm`, `tomcat` and `wildfly`, it becomes more and more obvious to me that there are too many differences to fix. Also, some of the Groovy definitions haven't been modified in 2 or 3 years, which means they are very probably obsolete or not really used in practice.
As a consequence, I think the better option for now is to:
The steps that will likely follow are:
- `jmxreceiver` implementation to use this new way to capture JMX metrics

Here are my findings regarding `jetty`:
- JMX: `org.eclipse.jetty.server.session:context=*,type=sessionhandler,id=*`
  - `sessionsCreated` --> YAML: `jetty.session.sessionsCreated` <==> Groovy: `jetty.session.count`
  - `sessionTimeTotal` --> YAML: `jetty.session.sessionTimeTotal` <==> Groovy: `jetty.session.time.total`
    - `counter` / Groovy: `UpDownCounter`
  - `sessionTimeMax` --> YAML: `jetty.session.sessionTimeMax` <==> Groovy: `jetty.session.time.max`
  - `sessionTimeMean` --> YAML: `jetty.session.sessionTimeMean`, not used in Groovy
- JMX: `org.eclipse.jetty.util.thread:type=queuedthreadpool,id=*`
  - `busyThreads` --> YAML: `jetty.threads.busyThreads` <==> Groovy: `jetty.thread.count` with extra `state=busy` attribute
    - `updowncounter` / Groovy: `Value`
  - `idleThreads` --> YAML: `jetty.threads.idleThreads` <==> Groovy: `jetty.thread.count` with extra `state=idle` attribute
    - `updowncounter` / Groovy: `Value`
  - `maxThreads` --> YAML: `jetty.threads.maxThreads`, not used in Groovy
  - `queueSize` --> YAML: `jetty.threads.queueSize` <==> Groovy: `jetty.thread.queue.count`
    - `updowncounter` / Groovy: `Value`
- JMX: `org.eclipse.jetty.io:context=*,type=managedselector,id=*`
  - `selectCount` --> YAML: `jetty.io.selectCount` <==> Groovy: `jetty.select.count`
    - `1` / Groovy: `{operations}`
- JMX: `org.eclipse.jetty.logging:type=jettyloggerfactory,id=*`
  - not used in Groovy
For `hbase`, there isn't anything in JMX Insight. The mappings are simple, so it should be quite straightforward (but a bit tedious) to produce a YAML equivalent to `hbase.groovy`.
For `hadoop`:
- JMX attribute `tag.Hostname` is always mapped to the `node_name` metric attribute in both implementations.
- JMX `Hadoop:service=NameNode,name=FSNamesystem`:
  - `CapacityUsed`: `hadoop.name_node.capacity.usage` <==> `hadoop.capacity.CapacityUsed`
  - `CapacityTotal`: `hadoop.name_node.capacity.limit` <==> `hadoop.capacity.CapacityTotal`
  - `BlocksTotal`: `hadoop.name_node.block.count` <==> `hadoop.block.BlocksTotal`
  - `MissingBlocks`: `hadoop.name_node.block.missing` <==> `hadoop.block.MissingBlocks`
  - `CorruptBlocks`: `hadoop.name_node.block.corrupt` <==> `hadoop.block.CorruptBlocks`
  - `VolumeFailuresTotal`: `hadoop.name_node.volume.failed` <==> `hadoop.volume.VolumeFailuresTotal`
  - `FilesTotal`: `hadoop.name_node.file.count` <==> `hadoop.file.FilesTotal`
  - `TotalLoad`: `hadoop.name_node.file.load` <==> `hadoop.file.TotalLoad`
  - `NumLiveDataNodes`: `hadoop.name_node.data_node.count` with `state` = `live` <==> `hadoop.datenode.Count`, same `state` value (yes, there is a typo in `datanode`)
  - `NumDeadDataNodes`: `hadoop.name_node.data_node.count` with `state` = `dead` <==> `hadoop.hadoop.datenode.Count`, same `state` value

For `cassandra`:
There is no mapping in YAML. The mapping is verbose, and the lack of support for templates or string interpolation would make it quite tedious to write, but that is more an annoyance than a real blocker.
For example, a few of the MBeans:
- `org.apache.cassandra.metrics:type=ClientRequest`
- `org.apache.cassandra.metrics:type=ClientRequest,scope=RangeSlice`
- `org.apache.cassandra.metrics:type=ClientRequest,scope=Read`
- `org.apache.cassandra.metrics:type=ClientRequest,scope=Write`
- `scope=` with 3 variants by adding `,name=` with value in `Unavailables`, `Timeouts` or `Failures`
- `org.apache.cassandra.metrics:type=Storage,name=Load`

There isn't anything that could not be mapped using YAML syntax.
For `activemq`, everything except property descriptions seems to be in sync.
Metric attributes are consistent.
- `org.apache.activemq:type=Broker,brokerName=*,destinationType=Queue,destinationName=*` and `org.apache.activemq:type=Broker,brokerName=*,destinationType=Topic,destinationName=*`
  - `ProducerCount`: `activemq.producer.count` <==> `activemq.ProducerCount`
  - `ConsumerCount`: `activemq.consumer.count` <==> `activemq.ConsumerCount`
  - `MemoryPercentUsage`: `activemq.memory.usage` <==> `activemq.memory.MemoryPercentUsage`
  - `QueueSize`: `activemq.message.current` <==> `activemq.message.QueueSize`
  - `ExpiredCount`: `activemq.message.expired` <==> `activemq.message.ExpiredCount`
  - `EnqueueCount`: `activemq.message.enqueued` <==> `activemq.message.EnqueueCount`
  - `DequeueCount`: `activemq.message.dequeued` <==> `activemq.message.DequeueCount`
  - `AverageEnqueueTime`: `activemq.message.wait_time.avg` <==> `activemq.message.AverageEnqueueTime`
  - All `desc` fields in properties need to be synchronized because the wording is different
- `org.apache.activemq:type=Broker,brokerName=*`
  - `CurrentConnectionsCount`: `activemq.connection.count` <==> `activemq.connections.CurrentConnectionsCount`
  - `StorePercentUsage`: `activemq.disk.store_usage` <==> `activemq.disc.StorePercentUsage`
  - `TempPercentUsage`: `activemq.disk.temp_usage` <==> `activemq.disc.TempPercentUsage`
The `solr` case is very similar to `hbase`. No YAML at the moment but creating it should not be an issue.
For `kafka`, the YAML is `kafka-broker.yaml`.
- JMX: `kafka.server:type=BrokerTopicMetrics,name=MessagesInPerSec`:
  - `Count`: `kafka.message.count`
- JMX: `kafka.server:type=BrokerTopicMetrics,name=TotalProduceRequestsPerSec`:
  - `Count`: `kafka.request.count` with `type` = `produce`
- JMX: `kafka.server:type=BrokerTopicMetrics,name=TotalFetchRequestsPerSec`:
  - `Count`: `kafka.request.count` with `type` = `fetch`
- JMX: `kafka.server:type=BrokerTopicMetrics,name=FailedProduceRequestsPerSec`:
  - `Count`: `kafka.request.failed` with `type` = `produce`
- JMX: `kafka.server:type=BrokerTopicMetrics,name=FailedFetchRequestsPerSec`:
  - `Count`: `kafka.request.failed` with `type` = `fetch`

I haven't checked all the others in detail, but they look identical between the two implementations.
I discovered that we have a way to use multiple MBean names with the same metric definition, as seen in `kafka-broker.yaml`.
For `kafka-consumer.groovy` and `kafka-producer.groovy` there is no equivalent YAML mapping though.
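
For reference, the `kafka.request.count` entries listed above could be expressed along these lines. This is only a sketch under the assumption that `const(...)` is accepted as a `metricAttribute` value; it is not copied from `kafka-broker.yaml`, and the `type` and `unit` values are placeholders.

```yaml
rules:
  - bean: kafka.server:type=BrokerTopicMetrics,name=TotalProduceRequestsPerSec
    metricAttribute:
      type: const(produce)
    mapping:
      Count:
        metric: kafka.request.count
        type: counter       # placeholder
        unit: "{request}"   # placeholder

  - bean: kafka.server:type=BrokerTopicMetrics,name=TotalFetchRequestsPerSec
    metricAttribute:
      type: const(fetch)
    mapping:
      Count:
        metric: kafka.request.count
        type: counter       # placeholder
        unit: "{request}"   # placeholder
```

The same pattern would cover `kafka.request.failed` with the `FailedProduceRequestsPerSec` and `FailedFetchRequestsPerSec` MBeans.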
- JMX Insight supports some values for `otel.jmx.target.system`; those are defined in YAML files here.
- JMX Gatherer (in contrib) supports more values of `otel.jmx.target.system`; those are defined in Groovy scripts here.

While the Groovy scripts are convenient, moving to YAML seems a more future-proof solution:
Merging both implementations and bringing them to feature parity means that we have to attempt to migrate/align all of the systems supported by JMX Gatherer and ensure they can be implemented with YAML. Doing so will highlight any missing feature of the YAML implementation, which we can then add.
Once the alignment is complete, we should be able to start on the next step: building a "JMX Scraper" in contrib based on the YAML implementation in instrumentation.
For each system listed below, we need to ensure the following with JMX Insight:
List of systems to cover:
Once feature parity is achieved and JMX Scraper allows capturing both:
Then we can start the next step to enhance and align the metrics, as in the initial attempt in https://github.com/open-telemetry/opentelemetry-java-instrumentation/pull/11621
When doing so, special care should be taken to ensure that we conform to the current guidelines for metrics defined here, for example:
- `{noun}` instead of `1`

Follow-up tasks