Granulate / gprofiler

gProfiler is a system-wide profiler, combining multiple sampling profilers to produce unified visualization of what your CPU is spending time on.
https://profiler.granulate.io
Apache License 2.0

perf: improve "smart" perf heuristic to ignore [unknown] frames #798

Closed Jongy closed 1 year ago

Jongy commented 1 year ago

--perf-mode=smart runs two perf instances, one in DWARF mode and one in FP (frame pointer) mode, and for each PID selects either the DWARF or the FP stacks, whichever are "deeper". The assumption is that when an unwinding method fails, its stacks come out shallow. (As a note: many compiled programs are built without frame pointers, so FP perf fails on them, while DWARF succeeds if debugging information is present. Golang programs, in our experience, succeed with FP but fail with DWARF. That is why we run both.)
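To make the per-PID selection concrete, here is a minimal sketch of the idea, not gProfiler's actual code. It assumes each perf run has already been reduced to a mapping from PID to a list of collapsed stacks whose frames are joined by `;`. The names `StacksByPid`, `average_depth` and `select_per_pid` are hypothetical and exist only for this illustration.

```python
from typing import Dict, List

# Hypothetical shape: PID -> list of collapsed stacks, frames joined by ";".
StacksByPid = Dict[int, List[str]]


def average_depth(stacks: List[str]) -> float:
    """Mean number of frames per stack; 0.0 when there are no stacks."""
    if not stacks:
        return 0.0
    return sum(len(stack.split(";")) for stack in stacks) / len(stacks)


def select_per_pid(fp: StacksByPid, dwarf: StacksByPid) -> StacksByPid:
    """For each PID, keep the stacks of whichever mode is deeper on average."""
    merged: StacksByPid = {}
    for pid in fp.keys() | dwarf.keys():
        fp_stacks = fp.get(pid, [])
        dwarf_stacks = dwarf.get(pid, [])
        # Ties go to FP; that is an arbitrary choice made for this sketch.
        if average_depth(dwarf_stacks) > average_depth(fp_stacks):
            merged[pid] = dwarf_stacks
        else:
            merged[pid] = fp_stacks
    return merged
```

A PID that appears in only one of the two runs simply keeps the stacks from that run, since the missing side has an average depth of 0.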

The heuristic is implemented in get_average_frame_count and is a simple depth check. However, I've recently seen DWARF stacks that are rather deep but consist entirely of [unknown] frames. In such cases, we don't want to count those frames toward the depth.

  1. Fix get_average_frame_count so that it does not count [unknown] frames in the depth check.
  2. Add a test showing that if DWARF is deeper than FP (or vice versa) but that depth is mostly [unknown], then FP is selected (or DWARF in the opposite case).
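The two tasks above could look roughly like the following sketch. It is an illustration under stated assumptions, not the real get_average_frame_count: it assumes stacks are collapsed strings with frames joined by `;`, and that an unresolved frame is exactly the literal `[unknown]`. The names `average_known_frame_count`, `pick_mode` and the test function are hypothetical.

```python
from typing import List

# Assumption: unresolved frames appear as exactly this literal string.
UNKNOWN_FRAME = "[unknown]"


def average_known_frame_count(stacks: List[str]) -> float:
    """Mean number of frames per stack, skipping [unknown] frames."""
    if not stacks:
        return 0.0
    known_counts = [
        sum(1 for frame in stack.split(";") if frame != UNKNOWN_FRAME)
        for stack in stacks
    ]
    return sum(known_counts) / len(stacks)


def pick_mode(fp_stacks: List[str], dwarf_stacks: List[str]) -> str:
    """Return "dwarf" only if DWARF has more known frames on average."""
    if average_known_frame_count(dwarf_stacks) > average_known_frame_count(fp_stacks):
        return "dwarf"
    return "fp"


def test_unknown_frames_do_not_count_as_depth() -> None:
    # DWARF is deeper in raw frame count, but almost all of it is [unknown].
    fp = ["main;run;work"]
    dwarf = ["[unknown];[unknown];[unknown];[unknown];main"]
    assert pick_mode(fp, dwarf) == "fp"

    # Opposite case: FP is deep but entirely [unknown], so DWARF wins.
    fp = ["[unknown];[unknown];[unknown];[unknown]"]
    dwarf = ["main;run"]
    assert pick_mode(fp, dwarf) == "dwarf"
```

Filtering before counting keeps the heuristic a plain depth comparison. An alternative would be to reject any stack whose share of [unknown] frames exceeds some threshold, but that introduces a tunable the issue does not ask for.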