open-telemetry / opentelemetry-proto

OpenTelemetry protocol (OTLP) specification and Protobuf definitions
https://opentelemetry.io/docs/specs/otlp/
Apache License 2.0
545 stars 242 forks source link

In pprofextended proto, consider moving Location.type_index to Mapping #561

Open aalexand opened 1 month ago

aalexand commented 1 month ago

We define it as

  // Type of frame (e.g. kernel, native, python, hotspot, php). Index into string table.
  uint32 type_index = 6;

I can't easily think of a case where the type of locations would be different for the given mapping. This could probably be moved to the mapping message which would also save some profile size since locations tend to come in much larger quantities compared to mappings.

Related: Could this type be an attribute? Both mapping and location have attributes attached.

aalexand commented 1 month ago

Perhaps one case of having different type of locations per mapping could be if they are language related. E.g. C++ and Rust linked into the same static binary.

tigrannajaryan commented 1 month ago

@open-telemetry/profiling-maintainers

christos68k commented 1 month ago

Another use case is mixed stacktraces, example: Stacktrace with both native and interpreted (Python) frames. Using a frame type per Location, we'd either keep the same Mapping for both native and interpreted Locations or not have a Mapping for interpreted Locations.

If we lift Location type to Mapping, then we'd need to introduce Mappings for these interpreted frames which seems semantically wrong.

aalexand commented 1 month ago

Another use case is mixed stacktraces, example: Stacktrace with both native and interpreted (Python) frames. Using a frame type per Location, we'd either keep the same Mapping for both native and interpreted Locations or not have a Mapping for interpreted Locations.

If we lift Location type to Mapping, then we'd need to introduce Mappings for these interpreted frames which seems semantically wrong.

I would probably expect two different "fake" mappings to be used in this case - one for interpreted locations, another for JIT locations. Having separate mappings would allow them to have different names (in the Mapping.filename field) which is useful from the tooling / analysis perspective: tools rendering profiles often expose mapping name in a tooltip on the flame graph and easily seeing from the mapping name whether the location is interpreted or JIT is convenient. pprof also takes the mapping name into account when handling its filters, so using something like -focus=Interpreted would allow to filter the data to just stacks with interpreted frames.