open-telemetry / oteps

OpenTelemetry Enhancement Proposals
https://opentelemetry.io
Apache License 2.0

How to convert Java Flight Recorder (JFR) file to Profiling Data Model v2 #251

Closed yanglong1010 closed 3 months ago

yanglong1010 commented 4 months ago

I have a question about https://github.com/open-telemetry/oteps/pull/239.

The Java Flight Recorder (JFR for short) binary format can contain many (over 100) types of events. We can use the jfr tool (similar to the pprof command line tool) to view the events in a JFR file.

jfr summary jfr.jfr

 Event Type                              Count  Size (bytes)
=============================================================
 jdk.ObjectAllocationOutsideTLAB         12239        232059
 jdk.ObjectAllocationInNewTLAB            1514         33952
 jdk.ExecutionSample                      1102         15880
 jdk.JavaMonitorWait                      1030         30284
...

In the fragment above, jdk.ExecutionSample is the CPU sample event type. It contains 1102 events, and the interval between two consecutive events on the same thread is 10 milliseconds. Each sample has the fields timestamp, thread, thread stack, and thread state.

jdk.ObjectAllocationInNewTLAB is the allocation sample event type. It contains 1514 events, and the interval between two consecutive events on the same thread is not fixed: Java records this sample when a new TLAB (Thread Local Allocation Buffer) is created, and the TLAB size is adjusted ergonomically.
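For reference, the per-type counts shown by `jfr summary` can also be obtained programmatically. Below is a minimal sketch using the JDK's `jdk.jfr.consumer.RecordingFile` API; the class name `JfrEventCounter` and the helper names `tally` and `countEvents` are made up for illustration, and the default file name is simply the one from the command above.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.util.Map;
import java.util.TreeMap;

import jdk.jfr.consumer.RecordedEvent;
import jdk.jfr.consumer.RecordingFile;

class JfrEventCounter {

    // Tallies a sequence of event type names such as "jdk.ExecutionSample".
    // (Hypothetical helper, kept separate so it can be used without a JFR file.)
    public static Map<String, Long> tally(Iterable<String> eventTypeNames) {
        Map<String, Long> counts = new TreeMap<>();
        for (String name : eventTypeNames) {
            counts.merge(name, 1L, Long::sum);
        }
        return counts;
    }

    // Reads the events of a JFR file one by one and counts them per event type.
    public static Map<String, Long> countEvents(Path jfrFile) throws IOException {
        Map<String, Long> counts = new TreeMap<>();
        try (RecordingFile recording = new RecordingFile(jfrFile)) {
            while (recording.hasMoreEvents()) {
                RecordedEvent event = recording.readEvent();
                counts.merge(event.getEventType().getName(), 1L, Long::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) throws IOException {
        // Falls back to the file name used in the `jfr summary` call above.
        Path file = Path.of(args.length > 0 ? args[0] : "jfr.jfr");
        countEvents(file).forEach((type, n) -> System.out.printf("%-40s %d%n", type, n));
    }
}
```

Each `RecordedEvent` read this way also exposes its own fields, which differ from one event type to the next.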

I wonder whether it is possible to convert a JFR file to a single ProfilesData file.

yanglong1010 commented 4 months ago

As far as I understand, it does not seem possible.

message Sample {
  ...
  // The type and unit of each value is defined by the corresponding
  // entry in Profile.sample_type. All samples must have the same
  // number of values, the same as the length of Profile.sample_type.
  // When aggregating multiple samples into a single sample, the
  // result has a list of values that is the element-wise sum of the
  // lists of the originals.
  repeated int64 value = 2;
  ...

In Profiling Data Model v2, if a profile includes more than one sample type, all samples must have the same number of values. That is not possible for JFR, because the number of events and the interval between them vary by event type.
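To make the mismatch concrete, here is a minimal Java sketch. The `Sample` and `Profile` classes are hypothetical, heavily simplified stand-ins for the protobuf messages (not generated code), and the allocation size is a made-up number. It assumes a naive conversion in which each JFR event becomes one sample carrying one value, and checks the rule quoted from the `Sample` comment.

```java
import java.util.List;

class SampleShapeCheck {

    // Hypothetical stand-in for the Sample message: only the value list is kept.
    public static final class Sample {
        final List<Long> values;
        public Sample(List<Long> values) { this.values = values; }
    }

    // Hypothetical stand-in for the Profile message: sample types plus samples.
    public static final class Profile {
        final List<String> sampleTypes;
        final List<Sample> samples;
        public Profile(List<String> sampleTypes, List<Sample> samples) {
            this.sampleTypes = sampleTypes;
            this.samples = samples;
        }
    }

    // The rule from the Sample comment: every sample carries exactly
    // one value per entry in Profile.sample_type.
    public static boolean hasConsistentShape(Profile profile) {
        int expected = profile.sampleTypes.size();
        return profile.samples.stream().allMatch(s -> s.values.size() == expected);
    }

    public static void main(String[] args) {
        Sample cpuEvent = new Sample(List.of(1L));       // one jdk.ExecutionSample
        Sample allocEvent = new Sample(List.of(2048L));  // one jdk.ObjectAllocationInNewTLAB (made-up size)

        Profile cpuOnly = new Profile(List.of("cpu"), List.of(cpuEvent));
        Profile mixed = new Profile(List.of("cpu", "alloc"), List.of(cpuEvent, allocEvent));

        System.out.println(hasConsistentShape(cpuOnly)); // prints "true"
        System.out.println(hasConsistentShape(mixed));   // prints "false"
    }
}
```

A profile with a single sample type passes the check, while a profile that mixes both event types fails it, because each event only supplies a value for its own type.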

mtwo commented 3 months ago

Note from maintainer meeting: this should be moved to the spec repository or Java repository (since this is Java-specific) since profiling is still in the specification phase. @trask is asking the Profiling SIG via Slack if this is in scope for profiling (in which case it'll go to the spec repo), or if it should be Java-specific.

yanglong1010 commented 3 months ago

Thank you for your reply. I will raise this in the spec or Java repository later.

trask commented 3 months ago

Hi @yanglong1010, I asked in the #otel-profiles Slack channel and got this response:

my $0.02: We'll definitely want to convert a subset of JFR to the new OTel wire format, but most likely just ExecutionSample events to start with. A full JFR to OTel translation won't be possible unless/until we add a self-describing metadata mechanism to OTel profiling so we can support user defined events as JFR does. Meanwhile the OTel profiling format supports sending the JFR file as a byte[] so receivers can process it themselves.

Some 3rd party components already have a partial JFR to something else mapping, I'll likely reach out to them once the prototype for the OTel Java SDK is a bit further along and try to reach a consensus on what an 'official' OTel encoding of ExecutionSample and potentially other events may look like. Some of that may generalize to semantic conventions, or be influenced by them.

At present I envisage the profile format exporter being part of the Java SDK, but the JFR transcoder that uses it may need to be an additional jar/plugin in e.g. contrib, as reading JFR in Java needs APIs that don't exist in the older JDKs supported by the OTel SDK.

So, feels to me like the scope is at the intersection of the profiling SIG, semantic conventions work and Java SDK maintainers group.

yanglong1010 commented 3 months ago

@trask Thanks. I have just joined the #otel-profiles Slack channel and will continue to follow and participate in the discussions there.