google / UIforETW

User interface for recording and managing ETW traces
https://randomascii.wordpress.com/2015/04/14/uiforetw-windows-performance-made-easier/
Apache License 2.0

Investigate EnableThreadProfiling to record CPU perf counters for some threads #35

Open randomascii opened 9 years ago

randomascii commented 9 years ago

https://msdn.microsoft.com/en-us/library/windows/desktop/dd796393(v=vs.85).aspx

DrChat commented 4 years ago

Pinging this issue. Looks like you've done some previous research into CPU performance counters. Nowadays it appears that WPA supports graphs for CPU performance counters (PMC Rollovers and PMC Graph in the 'Select Tables' menu). XPerf needs to be configured to enable those and it appears that UIForETW does not allow us to specify custom xperf flags.

Microsoft's PerfView supports enabling and viewing these performance counters (blog post - but the UI is (unfortunately) 100 to 1000x uglier than WPA).

Can we hijack this issue to research enabling CPU performance counter logging with UIForETW? Maybe make it a default or otherwise add it to the options menu?

Other Resources

How to collect CPU performance counters on Windows

randomascii commented 4 years ago

Interesting. It is possible to add some basic flags to request additional user-mode and system providers - see the settings dialog. However it is quite likely that that is not sufficient, or at least not convenient. I would support having a CPU performance counter mode, instead of using batch files.

Often the best thing to do is to start by experimenting with the existing batch files, and see what works, and then encode that into the UI. I suspect that a different mode (like tracing to memory, tracing to file, and heap tracing) might be appropriate, but I'm not sure.

DrChat commented 4 years ago

Looks like xperf natively supports logging PMC counters (found here):

Example invocation:

xperf -on PROC_THREAD+LOADER+pmc_profile+profile -pmcprofile InstructionRetired -f c:\home\kernel.etl

Important bits being -on pmc_profile and -pmcprofile. Additionally, to collect callstacks you'll want to specify PmcInterrupt for the stack walker flags.
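To make the flag combination concrete, here is a minimal C++ sketch of assembling such an xperf argument string from a list of counter names. The function name BuildPmcXperfArgs and its parameters are hypothetical and not part of UIforETW. The flags themselves (-on, PMC_PROFILE, -pmcprofile with a comma-separated list, -stackwalk PmcInterrupt, -f) are only the ones quoted in this thread.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper (not UIforETW code): builds the xperf arguments that
// enable PMC sampling for the given counters, using only flags quoted in
// this thread.
std::string BuildPmcXperfArgs(const std::vector<std::string>& counters,
                              bool stacks_on_pmc_interrupt,
                              const std::string& etl_path) {
  // PMC_PROFILE without any counter names makes no sense here.
  assert(!counters.empty());

  std::string args = "-on PROC_THREAD+LOADER+PMC_PROFILE -pmcprofile ";
  for (std::size_t i = 0; i < counters.size(); ++i) {
    if (i > 0) args += ',';  // counters are comma-separated
    args += counters[i];
  }
  if (stacks_on_pmc_interrupt) {
    // Request call stacks on each PMC interrupt.
    args += " -stackwalk PmcInterrupt";
  }
  args += " -f " + etl_path;  // assumes the path needs no quoting
  return args;
}
```

Calling it with {"InstructionRetired"}, no stacks, and c:\home\kernel.etl yields the PMC-related part of the invocation above (without the extra "profile" sampling flag).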

randomascii commented 4 years ago

Okay, so a plausible UI would be a way to select a set of counters (a list of check boxes? free-form text?) and when one or more are selected the pmc_profile provider would be selected.

I don't know how standardized the set of counters is (query at run-time to create the list?) or how standardized the limits on the number of counters, or on the combinations, are. But, it all sounds very interesting.

There probably needs to be some OS version detection, and perhaps xperf version detection to gate this feature, and maybe some options should be disabled (stack walking?) to minimize distortion, or that could be left up to the user.

DrChat commented 4 years ago

Yeah - we can query the list of counters and intervals. wpr -pmcsources comes up with a list, but perhaps there's a way to get this programmatically?

Then we can have a UI with checkboxes and intervals. Perhaps another checkbox for collecting callstacks on PMC interrupts?

As for version detection - I'm not too sure when this was introduced into xperf...

randomascii commented 4 years ago

The latest releases of UIforETW guarantee that the 1903 (currently latest) version of xperf is installed on Windows 10, so xperf version detection shouldn't be needed.

I don't know which Windows 10 versions support this, but if wpr -pmcsources gives us data then we're probably okay. And running wpr -pmcsources and parsing the output doesn't sound like too much work. So, I think the plan would be:

if (Windows10()) {
  auto pmc_counters = RunAndParse("wpr -pmcsources");
  if (pmc_counters.size() > 0) {
    PopulateAndEnableListboxInSettings();
  }
}

Lots of details remain, such as how to deal with setting the intervals, and testing to see whether sampling (call stacks) on PMC interrupts is useful. My main use-case has just been to be able to see IPC and mispredict rates by process. This would make that easier and would presumably improve the granularity to per-thread.
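The parsing half of that plan might look roughly like the sketch below. It assumes, without having been verified against real output, that wpr -pmcsources prints some header lines followed by rows of the form "<id> <name> <interval> ...", with single-word counter names. PmcSource and ParsePmcSources are hypothetical names, and actually launching wpr and capturing its output is left out.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical record for one row of "wpr -pmcsources" output.
struct PmcSource {
  int id = 0;
  std::string name;
  long long interval = 0;
};

// Assumed layout (not verified): header/separator lines, then rows of
// "<id> <name> <interval> ..." with whitespace-separated fields.
std::vector<PmcSource> ParsePmcSources(const std::string& output) {
  std::vector<PmcSource> sources;
  std::istringstream lines(output);
  std::string line;
  while (std::getline(lines, line)) {
    std::istringstream fields(line);
    PmcSource source;
    // Lines that do not start with "<int> <word> <int>" (headers,
    // separators, blank lines) fail extraction and are skipped.
    if (fields >> source.id >> source.name >> source.interval) {
      sources.push_back(source);
    }
  }
  return sources;
}
```

An empty result would then mean "leave the PMC list box disabled", which matches the pmc_counters.size() > 0 check in the plan.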

rajivkapoor commented 3 years ago

I would like to change the default interval for the events that show up with -pmcsources. Is there any option to change the interval for xperf or wpr? The defaults seem too low for busy CPUs, which results in tons of samples and consequently dropped samples (at least for cycles and instructions retired).

randomascii commented 3 years ago

The only way that I have recorded PMC data is documented here:

https://randomascii.wordpress.com/2016/11/27/cpu-performance-counters-on-windows/

This technique records them on context switches. This makes attribution to particular pieces of code difficult, but it does give you a per-process overview.

If you learn any more then please comment here or on that blog post.

rajivkapoor commented 3 years ago

I have just started to play around with PMC data from xperf using:

xperf.exe -on PROC_THREAD+LOADER+PROFILE+DISK_IO+PMC_PROFILE -pmcprofile InstructionRetired,TotalCycles -stackwalk profile+PmcInterrupt

It works fine for a lightly loaded system, but when I ran something that kept half of the cores in the system ~100% utilized, I lost lots of samples.

I will try the tip from your blog. I would like to see the data at function level, but per-process may be OK to start with.