open-telemetry / opentelemetry-java-instrumentation

OpenTelemetry auto-instrumentation and instrumentation libraries for Java
https://opentelemetry.io
Apache License 2.0
1.79k stars 782 forks source link

Attempt to align and cleanup some jmx metrics #11621

Open SylvainJuge opened 1 week ago

SylvainJuge commented 1 week ago

This is an attempt to fix a few errors and inconsistencies that I've found in the JMX metrics captured with the JMX Metric Insight feature.

I have intentionally limited the scope to tomcat, jetty and wildfly, but similar changes might be applied to other systems as a follow-up.

For the overall strategy, I agree that covering every metric of every platform is not possible nor something we aim to. For example, with Wildfly the db pool exposes more than 50 attributes that could be captured as metrics.

I think one of the important things that could make this type of mapping somehow manageable over time is to use the following strategy for metric names and their attributes:

Checklist & follow-ups

PeterF778 commented 1 week ago

Originally, JMX Metric Insight borrowed the metric definitions from JMX Metric Gatherer, and was bug-for-bug compatible. This was caused equally by our laziness as by the desire to allow users to transition smoothly to in-process metric collection. I do not know how popular JMX Metric Insight is, but I know from experience that changes to metric names/attributes can sometimes be painful for the users. Perhaps it will be helpful for the customers if we keep the old metric configuration files around for some time as tomcat_old or tomcat_legacy etc.

SylvainJuge commented 1 week ago

Originally, JMX Metric Insight borrowed the metric definitions from JMX Metric Gatherer, and was bug-for-bug compatible. This was caused equally by our laziness as by the desire to allow users to transition smoothly to in-process metric collection. I do not know how popular JMX Metric Insight is, but I know from experience that changes to metric names/attributes can sometimes be painful for the users. Perhaps it will be helpful for the customers if we keep the old metric configuration files around for some time as tomcat_old or tomcat_legacy etc.

I completely understand the duplication strategy here, but it's probably time to remove the duplication and simplify things:

Until we have such duplication removed, we will have to backport such changes in the contrib repo. Implementation-wise, a common implementation would likely reside in the contrib repo and be included in the instrumentation agent (there are already similar dependencies for the aws and gcp resource providers).

Regarding compatibility, I really don't know what should be the best approach here, all the JMX metrics are very dependent on implementation details, having any formal definition in semconv and stability status for them is not possible. Maybe keeping previous iterations of the yaml files could provide this.

SylvainJuge commented 2 days ago

Status following June 20th SIG meeting: