Open tigrannajaryan opened 3 weeks ago
@open-telemetry/profiling-maintainers has there been any progress on this topic?
I had this action item from last meeting:
(Alexey) Maintainers - work on which changes need to be reverted and which ones do not need to be reverted
I haven't looked into it yet, I was hoping to get more comments on the compatibility doc I prepared earlier in the meantime.
I'll probably spend some time preparing a prototype PR which undoes some of the incompatibility changes in pprofextended and see how other maintainers react to it.
I will not be able to attend today's profiling SIG meeting unfortunately - the time doesn't work out for me.
Logistics-wise FYI, I do not seem to be able to assign this issue to myself, no permission apparently.
Thanks @aalexand, I assigned it to you.
All, this issue is a blocker for new PRs to the profiling proto format. We need a decision and a compatibility definition in place before we can continue making changes to the proto.
CC @open-telemetry/profiling-maintainers
I have prepared a list of items where I see the major differences between OTel Profiling and pprof (protocol & tool):
Profile.mapping: "mapping[0] will be the main binary" In whole system profiling with stacks that have multiple sources, there can not be a single main binary. https://github.com/open-telemetry/opentelemetry-proto/pull/556
Profile.location: Semantic difference between "referenced by samples via location_indices" (OTel) vs "referenced by samples" (pprof)
Profile.string_table: Special meaning of strings like "pprof::". Could be solved in semantic convention
Sample.location_index: Deprecated in OTel conflicts with pprof; OTel also uses a slightly different comment on this element
Sample.location_index: OTel name conflicts with pprof name Sample.location_id
Sample.label: Deprecated in OTel conflicts with pprof
Sample.label: OTel introduced Sample.attributes. Semantic convention should define where and when to use one over the other or both.
Sample.link: Can this new OTel element be dropped and replaced when using Sample.label with a semantic convention for span_id and trace_id?
message Label: Deprecated in OTel conflicts with pprof
Mapping.id: Deprecated in OTel conflicts with pprof
Mapping.id: pprof expects an id other than 0. Could be solved in semantic convention.
Location.id: Deprecated in OTel conflicts with pprof
Location.id: pprof expects an id other than 0. Could be solved in semantic convention.
Location.mapping_index: OTel name conflicts with pprof name Location.mapping_id
Location.type_index: Introduced by OTel and less defined. Could be solved in semantic convention.
Location.type_index: Should use int64 over uint32 to be pprof compliant
Line.function_index: OTel name conflicts with pprof name Line.function_id
Function.id: Deprecated in OTel conflicts with pprof
Function.id: pprof expects an id other than 0. Cloud be solved in semantic convention.
To resolve the list of these conflicts, a path forward could look like this:
@aalexand I'm happy to help resolving these conflicts.
We had a discussion in the Profiling SIG and believe that it is necessary to document what changes are allowed and what changes are prohibited for profiles.
@open-telemetry/profiling-maintainers please assign this to a person who will own this work. I am going to close my draft https://github.com/open-telemetry/opentelemetry-proto/pull/559 for now and will be waiting for new PR that resolves this issue.