jrmuizel / etw-profiler

8 stars 4 forks source link

Misleading off-cpu stacks when profiling without CSwitch stacks #9

Open mstange opened 4 months ago

mstange commented 4 months ago

The invocation xperf -on latency -stackwalk profile captures CSwitch events but no CSwitch stacks. The invocation xperf -on latency -stackwalk profile,cswitch captures CSwitch stacks.

We currently produce bad data for the former: We output off-cpu samples with on-cpu stacks, and these incorrect off-cpu samples have a lot of weight in the call tree because their weight is based on the sleep time.

This happens because of how we associate user stacks with samples: Whenever we se a user stack, we associate it with all pending user-stack-less samples and cswitch events, even if the cswitch events weren't supposed to have any stacks in the first place.

The current behavior makes sense for the case where CSwitch events do have stacks. Let's say you have the following scenario:

thread enters kernel
thread gets sampled
thread goes to sleep
thread wakes up
thread gets sampled
thread leaves kernel

Here we want to have 4 stacks: sample stack, switch-off stack, switch-on stack, sample stack. For all of these, we only get one user stack, at the end when the thread leaves the kernel. So we want to associate that user stack with all those preceding pending stack-less events.

We need to know whether CSwitch events are supposed to have stacks so that we can avoid this over-eager association.

Here are the lines of code involved in the bug:

CSwitch event for thread wake-up creates an off-cpu sample group and puts it into pending_stacks: https://github.com/jrmuizel/etw-profiler/blob/3ba0cc92ca6ef45db9c415997f417231541fb626/etw-gecko/src/main.rs#L822-L825

"MSNT_SystemTrace/StackWalk/Stack" handling finds off-cpu sample group in pending_stacks and creates a long off-cpu sample (sample_count > 1) with the on-cpu stack: https://github.com/jrmuizel/etw-profiler/blob/3ba0cc92ca6ef45db9c415997f417231541fb626/etw-gecko/src/main.rs#L491-L505

mstange commented 2 months ago

I think we can fix this without knowing whether CSwitch events are supposed to have stacks.

If CSwich events have stacks, every CSwitch event should have a kernel stack. The kernel stack will have a matching timestamp for the CSwitch event.

So if we see a user stack, then all pending CSwitch events without kernel stacks should not get this user stack.