golang / go

proposal: runtime/pprof: cross system stack transitions in the heap profiler #66385

Open nsrip-dd opened 7 months ago

nsrip-dd commented 7 months ago

Proposal Details

Summary

I propose that the heap profiler cross system stack transitions in tracebacks, to be consistent with the other profilers.

The user-visible changes would be:

Background

The runtime profilers are inconsistent in how they handle system stack transitions in tracebacks. Given a sequence of calls like this:

main.main                  <--+
main.foo                      +-- User portion
runtime.bar                   |
runtime.systemstack_switch <--+

runtime.systemstack        <--+
runtime.bar.func1             +-- System portion
runtime.interestingEvent      |
runtime.recordEvent        <--+

The profilers report a traceback like so:

As a rule of thumb, I think we want the entire sequence of calls leading up to the event of interest, possibly excluding implementation details at the end of the sequence. More often than not, the user portion of the traceback is the most informative part for a developer.

The heap profiler is the only one that does not consistently show the user portion of the stack. We see this in practice when, for example, starting a new goroutine requires allocating a new g. Today we see a traceback leading from runtime.systemstack to runtime.malg, but not the user portion of the call stack leading to the go statement. Note that under this proposal we would not see the system stack frames after the go statement, because the heap profiler elides runtime frames from the end of tracebacks. (Source)

This is motivated in part by trying to use frame pointer unwinding for more of the runtime profilers; see https://go.dev/cl/540476. Naive frame pointer unwinding isn't going to know whether or not it's crossing the systemstack transition. Either crossing the transition or capturing only the user portion of the call stack would be much more straightforward to match with frame pointer unwinding than capturing only the system portion.
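The three options named above can be illustrated on the example call sequence from the Background section. This is a toy model over frame names only, assuming the split point is the runtime.systemstack_switch frame as in the diagram. The mode type, its constants, and the splitAtSwitch and traceback functions are hypothetical names for illustration and do not correspond to runtime internals. It also does not model the heap profiler's elision of trailing runtime frames.

```go
package main

import "fmt"

// switchFrame marks the end of the user portion in the example diagram.
const switchFrame = "runtime.systemstack_switch"

// example lists the frames from the Background section, outermost first.
var example = []string{
	"main.main",
	"main.foo",
	"runtime.bar",
	"runtime.systemstack_switch",
	"runtime.systemstack",
	"runtime.bar.func1",
	"runtime.interestingEvent",
	"runtime.recordEvent",
}

// mode selects which part of the call sequence a profiler reports.
type mode int

const (
	systemOnly      mode = iota // only the frames on the system stack
	userOnly                    // only the frames up to the transition
	crossTransition             // both portions, as proposed for the heap profiler
)

// splitAtSwitch divides frames into the user portion (through the switch
// frame) and the system portion (everything after it).
func splitAtSwitch(frames []string) (user, system []string) {
	for i, f := range frames {
		if f == switchFrame {
			return frames[:i+1], frames[i+1:]
		}
	}
	return frames, nil // no transition: everything is on the user stack
}

// traceback returns the frames a profiler would report under mode m.
func traceback(frames []string, m mode) []string {
	user, system := splitAtSwitch(frames)
	switch m {
	case systemOnly:
		return system
	case userOnly:
		return user
	default:
		return append(append([]string{}, user...), system...)
	}
}

func main() {
	fmt.Println("system only:", traceback(example, systemOnly))
	fmt.Println("user only:  ", traceback(example, userOnly))
	fmt.Println("crossing:   ", traceback(example, crossTransition))
}
```

In this model, an unwinder that simply follows frame pointers would naturally produce either the user-only or the crossing result, depending on whether the chain continues past the transition; reproducing the system-only result would require it to detect the transition and stop there.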

cc @golang/runtime @prattmic

gopherbot commented 5 months ago

Change https://go.dev/cl/540476 mentions this issue: runtime: use frame pointer unwinding for the heap profiler