elastic / apm-data

apm-data holds definitions and code for manipulating Elastic APM data
Apache License 2.0

proposal: mapping of stable JVM metrics as top-level attributes #264

Closed SylvainJuge closed 1 month ago

SylvainJuge commented 6 months ago

Some of the JVM metrics are now stable and are part of the semantic conventions.

For example, the jvm.thread.count metric is now stable (semconv).

In the future, those OTel metrics should be mapped as top-level attributes, but a generic solution won't be available in the short term.

However, we already need to build dashboards that rely on those stable metrics (https://github.com/elastic/kibana/issues/174445). Whenever those dashboards rely on metric attributes that are currently mapped under labels.*, they are not future-proof: they will break once those attributes are mapped as top-level fields.

For example, this will happen with the jvm.memory.* metrics: we have to split heap and non-heap memory (semconv spec), and that breakdown is currently stored in labels.jvm_memory_type.
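
To make the breakage concrete, here is a minimal sketch in Python. The two document shapes and the `matches_term` helper are illustrative assumptions, not actual apm-data output or a Kibana API; only the field names `labels.jvm_memory_type` and `jvm.memory.type` (the semconv attribute) are taken from the discussion.

```python
def matches_term(doc, field, value):
    """Hypothetical stand-in for a dashboard term filter on a flattened document."""
    return doc.get(field) == value


# Assumed shape today: the memory type attribute ends up under labels.*
doc_today = {"jvm.memory.used": 1024, "labels.jvm_memory_type": "heap"}

# Assumed shape once attributes are mapped as top-level fields
doc_future = {"jvm.memory.used": 1024, "jvm.memory.type": "heap"}

# A dashboard filter written against the current mapping
dashboard_filter = ("labels.jvm_memory_type", "heap")

matches_today = matches_term(doc_today, *dashboard_filter)
matches_future = matches_term(doc_future, *dashboard_filter)
```

In this sketch the same filter matches the first document but silently stops matching the second, which is the failure mode the dashboards would hit.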

So the proposal is to store these attributes as top-level fields now, limited to the stable JVM metrics, since their definitions won't change in the future.
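
A rough sketch of what the proposal amounts to, in Python for brevity. This is not the apm-data implementation: `STABLE_JVM_METRICS` and `map_attributes` are hypothetical names, the allowlist is only an illustrative subset (the semantic conventions are the authoritative list), and the dot-to-underscore label naming is inferred from the `labels.jvm_memory_type` example above.

```python
# Hypothetical allowlist; an illustrative subset of stable JVM metrics only.
STABLE_JVM_METRICS = {
    "jvm.thread.count",
    "jvm.memory.used",
    "jvm.memory.committed",
    "jvm.memory.limit",
}


def map_attributes(metric_name, attributes):
    """Return the document fields for a metric's attributes.

    Attributes of allowlisted stable JVM metrics are kept as top-level fields;
    everything else falls back to labels.* (assumed naming: dots become
    underscores, as in labels.jvm_memory_type).
    """
    fields = {}
    for key, value in attributes.items():
        if metric_name in STABLE_JVM_METRICS:
            fields[key] = value
        else:
            fields["labels." + key.replace(".", "_")] = value
    return fields


stable = map_attributes("jvm.memory.used", {"jvm.memory.type": "heap"})
other = map_attributes("custom.metric", {"jvm.memory.type": "heap"})
```

Keeping the special case behind an explicit allowlist means attributes of non-stable metrics keep their current labels.* mapping until a generic solution exists.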

SylvainJuge commented 6 months ago

@elastic/obs-ds-intake-services I'd like to have your input on this idea, as it impacts the implementation choices we make for https://github.com/elastic/kibana/issues/174445

carsonip commented 6 months ago

SGTM. What's the actual work that's needed for jvm.memory.*? The semconv format is quite different from what we have now, e.g. jvm.memory.non_heap.pool.used. Are the dashboards going to query both?

SylvainJuge commented 6 months ago

The metric names themselves are not translated and are copied as-is, so it's mostly a matter of making sure that the OTel attributes are written as top-level fields instead of under labels.jvm.[...].

The current UI implementation more or less works with the 1.x version of the OTel agent, which relies on a previous version of the metrics. We will replace the current UI dashboard with a new one that relies only on the stable metrics.

As a follow-up improvement, we could maybe translate the current format used by the 1.x Java OTel agent to the new stable definition, if a direct mapping is possible.

SylvainJuge commented 6 months ago

jvm.memory.non_heap.pool.used is the format used by the Elastic agent, and as far as I know we don't translate the OTel metrics to this format.

AlexanderWert commented 1 month ago

With the new ingestion path through the OTel collector, metrics will be mapped in an OTel-native way.