Closed vladimir-cheverdyuk-altium closed 1 year ago
It sounds like a lot of things can cause this, but I was able to fix it by disabling real-time virus protection in Windows: Settings > Windows Security > Virus & threat protection > Manage settings > Real-time protection.
But it actually works fine in Thread Time Stacks. It is hard for me to imagine that AV can damage stacks for CPU but leave them alone for Thread Time Stacks. I also suspect that there is only one stack for both of them and one of them just has extra information.
The reason you see the Thread Time Stacks view working but the CPU Stacks view not working is that thread time depends on two sets of events: CPU sampling and contextswitch/readythread events. The contextswitch/readythread events are emitted when you hit these paths in the kernel, whereas the CPU samples are emitted via profiling interrupts. It's the profiling interrupts that tend to get broken by kernel drivers, etc.
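As a rough illustration of the explanation above, here is a minimal Python sketch that takes per-event counts from a trace and reports which view would be expected to have data. The event names and the `diagnose` helper are assumptions made for this example (they are modeled on how kernel events are commonly labeled, not taken from PerfView's code), and the logic is deliberately simplistic.

```python
# Hypothetical event names for illustration; real traces may label them differently.
CPU_SAMPLE = "PerfInfo/Sample"        # emitted via profiling interrupts
CONTEXT_SWITCH = "Thread/CSwitch"     # emitted on the kernel context-switch path
READY_THREAD = "Thread/ReadyThread"   # emitted on the kernel ready-thread path


def diagnose(event_counts):
    """Given a dict of event name -> count, guess which views have data.

    This is a sketch of the reasoning in the thread, not PerfView's logic.
    """
    samples = event_counts.get(CPU_SAMPLE, 0)
    switches = (event_counts.get(CONTEXT_SWITCH, 0)
                + event_counts.get(READY_THREAD, 0))
    return {
        # CPU Stacks is built from CPU samples only.
        "cpu_stacks": samples > 0,
        # Thread Time Stacks can still show stacks from the kernel-path events.
        "thread_time_stacks": switches > 0,
        # The symptom in this issue: kernel-path events present, samples absent.
        "cpu_sampling_missing": samples == 0 and switches > 0,
    }
```

With counts such as `{"Thread/CSwitch": 1000, "Thread/ReadyThread": 500}` and no sample events, the sketch flags `cpu_sampling_missing`, which matches the symptom of a CPU Stacks view that contains only ROOT while Thread Time Stacks looks normal.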
Are these on retail builds of the OS, or are these Windows Insider builds?
From TraceInfo file I can see this: OS Build Number 19041.2006.amd64fre.vb_release.191206-1406
Another PC: 22000.1.arm64fre.co_release.210604-1628
I think these are both release builds. I think we've looked at these together before. I suspect that the best course of action for these is to file a Windows Feedback ticket. This should help to get some eyes with operating system expertise to help investigate. You can do this via the key combination Windows-F.
OK, I will leave feedback. Could you please tell me what kind of terms I should use for this? I don't know many technical details and I would like to get their attention.
And just in case: these are different reports from different companies.
Sure. I think the key here is that you want to point out that you have a number of traces that should be capturing CPU sampling ETW events, but those events aren't being captured. When you submit the feedback ticket, ideally submit it from a machine where this happened, so that the diagnostic information that is captured is relevant. If this is not possible, make sure to provide Windows version information. That's probably as much as you need to do until/unless they ask for more. Please also provide a link to the feedback here.
I just learned of a feature in Microsoft Defender that may cause the behavior you're seeing, as it takes the PMU from ETW. You can read more about it at https://www.microsoft.com/security/blog/2021/04/26/defending-against-cryptojacking-with-microsoft-defender-for-endpoint-and-intel-tdt/ and https://techcommunity.microsoft.com/t5/microsoft-defender-for-endpoint/defending-against-ransomware-with-microsoft-defender-for/ba-p/3243941.
The feature can be disabled by running `powershell.exe Set-MpPreference -DisableTDTFeature $true`.
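If the workaround needs to be scripted (for example, as part of support instructions), a minimal Python sketch could wrap the same command. The helper names `tdt_command` and `apply_tdt_setting` are hypothetical, and the sketch assumes it runs on Windows from an elevated prompt with `powershell.exe` on the PATH.

```python
import subprocess


def tdt_command(disable=True):
    """Build the argument list for the Set-MpPreference call quoted above."""
    value = "$true" if disable else "$false"
    return [
        "powershell.exe",
        "-NoProfile",
        "-Command",
        f"Set-MpPreference -DisableTDTFeature {value}",
    ]


def apply_tdt_setting(disable=True):
    """Run the command and return the CompletedProcess.

    Assumption: an elevated Windows session; the call will fail elsewhere.
    """
    return subprocess.run(tdt_command(disable), capture_output=True, text=True)
```

Calling `apply_tdt_setting()` and checking `returncode` and `stderr` on the result gives a simple way to confirm whether the setting was accepted before re-running PerfView.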
@rbanks54, also adding you here, as this might also be what you're seeing in #1723.
That's an interesting feature! Sounds like the ML model might need some tweaking if it's the cause 🙂
Thankfully, I could use the workaround for the conference talk I gave yesterday. Phew!
I'll give it a try on the two machines with problems later today and let you know the results
@brianrob Success!!
The setting stays after a reboot as well, though trying to re-enable it throws an error
> set-mppreference -DisableTDTFeature $false Set-MpPreference: Operation failed with the following error: 0x80004005. Operation: Set-MpPreference. Target: DisableTDTFeature.
Thank you so much for chasing things up!
Thank you @brianrob and @rbanks54. I will add this to the list of instructions and hopefully it will fix that issue.
Glad to hear that we're potentially making some progress in this area. Weird that you can't re-enable it though. Definitely worth calling out to folks in any instructions. I'm going to close this issue for now, but @vladimir-cheverdyuk-altium, let me know if we need to re-open it.
@brianrob that workaround helped. Today I had a case where a customer sent a PerfView file with 0 CPU everywhere. I asked the customer to run the command above and re-run PerfView, and it helped. The new file has proper CPU time everywhere.
Thank you again.
Awesome @vladimir-cheverdyuk-altium! That's great to hear. Thanks for letting me know.
I got quite a few PerfView files that have zeroes in the CPU Msec column for every process. It looks like this:![image](https://user-images.githubusercontent.com/45857320/194672519-babcdd2c-c747-41e5-9e0a-3759d63b4fee.png)
If I choose a process in CPU Stacks and open it, it does not work: every tab contains only ROOT. But if I choose the same process in Thread Time Stacks and open it, it shows everything correctly.
For example, if I choose the first process, svchost, I get this in CPU Stacks:![image](https://user-images.githubusercontent.com/45857320/194673051-10c49864-ce46-4876-9c3e-e22180cfbb83.png)
And if I open it in Thread Time Stacks I will see this:![image](https://user-images.githubusercontent.com/45857320/194673092-3dc3cde9-df59-4be2-915d-5a7d395ca1b1.png)
Is it correct?
I can provide the .etl.zip file(s) directly to a developer, because they may contain some private information.
Vlad