elastic / kibana

Your window into the Elastic Stack
https://www.elastic.co/products/kibana

OpenTelemetry JVM GC metrics not visualized #105923

Closed (simitt closed 1 year ago)

simitt commented 3 years ago

Kibana version: 7.14.0-BC2

Describe the bug: JVM garbage collection (GC) metrics collected via OpenTelemetry are not visualized. Note that agent.name is opentelemetry/java.

See the more detailed bug report at https://github.com/elastic/apm-server/pull/5309#issuecomment-881246909

Steps to reproduce:

  1. Run apm-server
  2. Instrument a Java application with https://github.com/open-telemetry/opentelemetry-java-instrumentation, pointed at apm-server
  3. Wait for metrics to be collected and sent to APM Server
  4. Navigate to the APM app in Kibana, and check whether the GC usage graphs are populated

Expected behavior: GC graphs should be populated.

elasticmachine commented 3 years ago

Pinging @elastic/apm-ui (Team:apm)

dgieselaar commented 3 years ago

Related: https://github.com/elastic/kibana/issues/105915

dgieselaar commented 3 years ago

@simitt we use the jvm.gc.count and jvm.gc.time fields, i.e. without the runtime. prefix. Can you verify whether those are collected as well?
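
One way to check this is to list the field names on a stored metric document and see which variants are present. The sketch below is only an illustration: the `classifyGcFields` helper is hypothetical, and the `runtime.`-prefixed names are inferred from the comment above rather than confirmed against real OpenTelemetry data.

```typescript
// Fields the APM UI reads, per the comment above.
const EXPECTED_FIELDS = ["jvm.gc.count", "jvm.gc.time"] as const;

interface FieldCheck {
  field: string;
  hasUnprefixed: boolean; // e.g. "jvm.gc.count"
  hasRuntimePrefixed: boolean; // e.g. "runtime.jvm.gc.count" (assumed name)
}

// Hypothetical helper: reports, for each expected field, whether the
// unprefixed and/or the runtime.-prefixed variant appears in the document.
function classifyGcFields(fieldNames: string[]): FieldCheck[] {
  const present = new Set(fieldNames);
  return EXPECTED_FIELDS.map((field) => ({
    field,
    hasUnprefixed: present.has(field),
    hasRuntimePrefixed: present.has(`runtime.${field}`),
  }));
}

// Example: a document that only carries the prefixed variants.
console.log(classifyGcFields(["runtime.jvm.gc.count", "runtime.jvm.gc.time"]));
```

If only the prefixed variants show up, the GC graphs would have nothing to query, regardless of how the values themselves are encoded.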

smith commented 3 years ago

These aren't showing up because the Elastic APM Java agent uses cumulative values for jvm.gc.count and jvm.gc.time, and the OpenTelemetry metrics appear to use instantaneous values.

In the APM UI we use a derivative aggregation and assume these are cumulative values: https://github.com/elastic/kibana/blob/529b155d1c97d5a773a8eed25bec545a483ee80d/x-pack/plugins/apm/server/lib/metrics/by_agent/java/gc/fetch_and_transform_gc_metrics.ts#L78-L106
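
To illustrate why that assumption matters, here is a minimal sketch of a derivative over per-bucket values. It is a simplified stand-in, not the linked Kibana code or the Elasticsearch aggregation itself; the `derivative` function and the sample numbers are made up for illustration.

```typescript
type Sample = number | null;

// Difference between each bucket and the last non-null bucket before it.
// The first bucket has no predecessor, so its delta is null.
function derivative(values: Sample[]): Sample[] {
  let previous: Sample = null;
  return values.map((current) => {
    const delta =
      previous !== null && current !== null ? current - previous : null;
    if (current !== null) {
      previous = current;
    }
    return delta;
  });
}

// Cumulative counter (total GC count so far): deltas are the per-bucket counts.
console.log(derivative([10, 12, 15, 15])); // [null, 2, 3, 0]

// Values that are already per-bucket counts: deltas are not meaningful
// and can even go negative.
console.log(derivative([2, 3, 0, 4])); // [null, 1, -3, 4]
```

If the reported values were not cumulative, a derivative like this would produce misleading or negative numbers, which is the scenario this comment describes.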

Note also that values are not collected for CPU and thread count when using OpenTelemetry.

We're not going to get this into 7.14, so I'm going to move this issue to the backlog for now. We can adjust what we show on this page based on the agent name, and we might need some support in the agent or server.

axw commented 3 years ago

AFAICS the opentelemetry-java-instrumentation code is using the same method as the Elastic APM agent:

https://github.com/open-telemetry/opentelemetry-java-instrumentation/blob/262feb17303544c99749e2324b68c83e5c464b85/instrumentation/runtime-metrics/library/src/main/java/io/opentelemetry/instrumentation/runtimemetrics/GarbageCollector.java#L35-L68

https://github.com/elastic/apm-agent-java/blob/c012a9b2f51943dacd55f374f43e2819c0284428/apm-agent-core/src/main/java/co/elastic/apm/agent/metrics/builtin/JvmGcMetrics.java#L43-L56

@smith what led you to believe one is instantaneous and one is cumulative? Is there some data you can point me to?

smith commented 3 years ago

@axw I think @dgieselaar mentioned this as a possibility. In that case I guess I don't have an explanation as to why they don't show up, but they do not.

dgieselaar commented 3 years ago

I think that was Oliver. My only guess would be the fields are different.

AlexanderWert commented 1 year ago

Not relevant anymore with the fix in https://github.com/elastic/kibana/pull/151826