Open mstange opened 4 months ago
I think we can fix this without knowing whether CSwitch events are supposed to have stacks.
If CSwich events have stacks, every CSwitch event should have a kernel stack. The kernel stack will have a matching timestamp for the CSwitch event.
So if we see a user stack, then all pending CSwitch events without kernel stacks should not get this user stack.
The invocation
xperf -on latency -stackwalk profile
capturesCSwitch
events but noCSwitch
stacks. The invocationxperf -on latency -stackwalk profile,cswitch
capturesCSwitch
stacks.We currently produce bad data for the former: We output off-cpu samples with on-cpu stacks, and these incorrect off-cpu samples have a lot of weight in the call tree because their weight is based on the sleep time.
This happens because of how we associate user stacks with samples: Whenever we se a user stack, we associate it with all pending user-stack-less samples and cswitch events, even if the cswitch events weren't supposed to have any stacks in the first place.
The current behavior makes sense for the case where CSwitch events do have stacks. Let's say you have the following scenario:
Here we want to have 4 stacks: sample stack, switch-off stack, switch-on stack, sample stack. For all of these, we only get one user stack, at the end when the thread leaves the kernel. So we want to associate that user stack with all those preceding pending stack-less events.
We need to know whether
CSwitch
events are supposed to have stacks so that we can avoid this over-eager association.Here are the lines of code involved in the bug:
CSwitch
event for thread wake-up creates an off-cpu sample group and puts it intopending_stacks
: https://github.com/jrmuizel/etw-profiler/blob/3ba0cc92ca6ef45db9c415997f417231541fb626/etw-gecko/src/main.rs#L822-L825"MSNT_SystemTrace/StackWalk/Stack"
handling finds off-cpu sample group inpending_stacks
and creates a long off-cpu sample (sample_count > 1
) with the on-cpu stack: https://github.com/jrmuizel/etw-profiler/blob/3ba0cc92ca6ef45db9c415997f417231541fb626/etw-gecko/src/main.rs#L491-L505