4d61726b / VirtualKD-Redux

VirtualKD-Redux - A revival and modernization of VirtualKD
GNU Lesser General Public License v2.1

High CPU while debugging a VM with 2 cores per processor #8

Closed repnz closed 4 years ago

repnz commented 4 years ago

Describe the bug

After I run vminstall.exe and reboot for debugging, the vmware-vmx.exe process constantly uses ~12% CPU while I'm stopped at a breakpoint. When I resume execution, the CPU usage drops back down. Also, when I resume execution the Poll Rate is around 63. Is that normal? Can you explain a bit about the Poll Rate?

I tried to analyze the problem and found that it only happens when "Number Of Cores Per Processor" is not 1.

To Reproduce

Steps to reproduce the behavior:

  1. Install a VM and set "Number Of Cores Per Processor" to 2
  2. Run vminstall.exe
  3. Run bcdedit /debug on
  4. Run bcdedit /dbgsettings serial debugport:1 baudrate:115200
  5. Restart the VM and break in with Alt+Delete (WinDbg Preview)
  6. CPU usage of vmware-vmx.exe goes high.

Expected behavior

I expect CPU usage to be around 0% while stopped at a breakpoint.

Screenshots

Not in break: image

During break: image

Context

*************************************************************************************
*VirtualKD-Redux patcher DLL successfully loaded. Patching the GuestRPC mechanism...*
*************************************************************************************
Searching patch database for information about current executable...
No information found.
Waiting for VMWare to initialize (5900 ms more to wait)
Analyzing VMWARE-VMX executable...
Building list of EXE sections... 21051K of data found.
Scanning for RPC command name strings...
Finished scanning. Found 56 strings.
Searching for string references...
Found 31 string references.
Found 3 structures resemblant to RPC dispatcher table.
(address 00007FF61EFF4C38, matched pointers: 1)
(address 00007FF61EFF4C48, matched pointers: 1)
(address 00007FF61F445410, matched pointers: 29)
Analyzing potential RPC dispatcher tables...
Potential RPC table analysis complete. Found 1 candidates.
(address 00007FF61F4447E0, entries: 110, free entries: 36)
Using RPC dispatcher table at 0x7FF61F4447E0 (110 entries)
Waiting for RPC table to be initialized by VMWare...
RPC table initialized. Patching it...
Successfully patched entry #1
VMWare reset monitor activated...

I tried to work out which thread in vmware-vmx.exe is using the CPU. It appears to be code inside vmware-vmx.exe that runs in a loop calling DeviceIoControl on vmx86.sys (it looks like a function called RunVM of a VCPU). If I suspend this thread, the high CPU usage goes away (obviously), but when I then resume the debugger the VM gets stuck, so this thread seems to be responsible for virtualizing one of the virtual CPUs. I then noticed that the VM was configured with 2 cores per processor. When I changed it to 1, the problem went away.
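
A steady figure such as ~12% is what one fully busy thread typically looks like when CPU usage is reported as a share of the whole host. The sketch below only illustrates that arithmetic. The host's logical CPU count is not given anywhere in this issue, so the value 8 is an assumption, and `busy_thread_share` is a hypothetical helper name.

```python
def busy_thread_share(logical_cpus, busy_threads=1):
    """Percent of total host CPU used by threads that each saturate one logical CPU."""
    if logical_cpus < 1:
        raise ValueError("logical_cpus must be at least 1")
    # A thread cannot use more than one logical CPU at a time, so cap the
    # number of saturated CPUs at the number available.
    saturated = min(busy_threads, logical_cpus)
    return 100.0 * saturated / logical_cpus


# ASSUMPTION: an 8-logical-CPU host (not stated in the issue).
print(busy_thread_share(8))  # 12.5
```

Under that assumption, a single spinning VCPU thread accounts for 12.5% of the host, which is in line with the ~12% reported for vmware-vmx.exe.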

4d61726b commented 4 years ago

  1. I'm able to reproduce this issue without VirtualKD-Redux installed in the VM and even without vmmon running. If you enable KDNET on the guest VM, the vmx process stays at a steady 12% while the debugger is broken in. I'm also able to reproduce it at the Windows 10 Boot Menu, without any debugging enabled. You're correct: the thread belongs to VMware Workstation and is spinning on DeviceIoControl calls to the virtualization driver (vmx86.sys). This is a VMware Workstation issue, not a VirtualKD-Redux issue. If you are able to suspend that thread with no repercussions, a potential solution could be to have either kdclient or vmmon find and suspend that thread when the debugger breaks into the guest machine and resume it when execution is to continue. This assumes that the vmx process is still in a "good state" while the thread is suspended. I'd also want to know more about the call that the vmx thread is spinning on before making that kind of decision.

  2. The bcdedit steps you're performing are no longer necessary. VirtualKD-Redux automatically fixes up the BCD (something that VirtualKD did not do properly on newer versions of Windows 10).

  3. The behavior you are seeing with the poll rate is normal. The poll rate is just a statistic gathered during a debugging session. When KdReceivePacket() is called with KdCheckForAnyPacket, it checks whether any data can be received from WinDbg and returns immediately. The poll rate represents how many times this event occurs. You can read more about this here: http://sysprogs.com/legacy/articles/kdvmware/kdcom.shtml
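
To make the poll-rate description in item 3 concrete, here is a minimal sketch of counting non-blocking "is there any packet?" checks over a sliding window. It is not VirtualKD-Redux code. `PollRateCounter` and `check_for_any_packet` are hypothetical names, and the one-second window is an assumption, since the comment does not say over what interval the rate is measured.

```python
from collections import deque


class PollRateCounter:
    """Counts poll events that fall inside a sliding time window."""

    def __init__(self, window_seconds=1.0):
        # ASSUMPTION: the rate is reported per second.
        self.window = window_seconds
        self.stamps = deque()

    def _trim(self, now):
        # Drop timestamps that are older than the window.
        while self.stamps and self.stamps[0] <= now - self.window:
            self.stamps.popleft()

    def record_poll(self, now):
        self.stamps.append(now)
        self._trim(now)

    def rate(self, now):
        self._trim(now)
        return len(self.stamps) / self.window


def check_for_any_packet(pending, counter, now):
    """Hypothetical stand-in for a KdCheckForAnyPacket-style call.

    Records one poll, then returns a pending packet if there is one,
    or None immediately if there is not. It never blocks.
    """
    counter.record_poll(now)
    return pending.popleft() if pending else None


counter = PollRateCounter()
pending = deque()
# Simulate 63 polls spread over 0.62 seconds with nothing to receive.
for i in range(63):
    check_for_any_packet(pending, counter, now=i * 0.01)
print(counter.rate(now=0.62))  # 63.0
```

In this model the rate only reflects how often the guest asks whether data is available, which is why a nonzero value while the guest is running is expected.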

repnz commented 4 years ago

Thank you for your answer 👍