4d61726b / VirtualKD-Redux

VirtualKD-Redux - A revival and modernization of VirtualKD
GNU Lesser General Public License v2.1

Win11 24H2 VM Failed #63

Open MortalAndTry opened 4 months ago

MortalAndTry commented 4 months ago

After I installed the February 2024 update on my Win11 virtual machine, I found that it did not work well with VirtualKD.

Every time the VM starts and I use F8 to disable driver signature enforcement, VMware pops up the following dialog, reporting that an error occurred and the virtual CPU entered a shutdown state.

An error occurred causing the virtual CPU to enter a shutdown state. If this error occurs outside the virtual machine, it may have caused the physical machine to reboot. A misconfigured virtual machine, a bug in the guest operating system, or a problem in VMware Workstation can cause a shutdown state. Click OK to restart the virtual machine or Cancel to shut down the virtual machine


The key point is that my Win11 virtual machine is running Win11 Insider Preview 24H2 (260581.1300).

MortalAndTry commented 4 months ago

VMware Workstation 17.5.1

fost7777 commented 2 months ago

I have the same problem. Have you solved it? My setup is VirtualKD-Redux 2024.0, Win11 24H2, and VMware Workstation 16.2.4.

4d61726b commented 2 months ago

I was able to reproduce this with Windows 11 24H2 (10.0.26100.1) and briefly looked into it. The triple fault appears to be happening before any execution is given to the custom kdcom.dll that gets installed on the guest.

I'm going to hold off on a full triage of the issue until we get closer to the 24H2 release so that the dust can settle. Enough is changing in that version of the OS that I don't want to spend considerable time figuring out the problem and a solution, only to discover that the work has to be redone or has become irrelevant by the release date.

4d61726b commented 2 months ago

I spent some time looking at and comparing the winload.efi from Win 11 24H2 and from Win 11 23H2. A significant amount of code has been refactored and added in the Windows boot loader for this upcoming release of Windows.

As I mentioned before, the triple fault happens before kdcom.dll is ever given execution. I spent some time painfully trying to determine the call stack at the time of the fault. This is difficult because I can't use a kernel debugger: debugger support has not yet started in the guest OS at that point.

The faulted call stack is:

SIPolicyGetActiveState
SIPolicyCloneActiveState
BlSIPolicyLoadAndActivateTemporalPolicy
OslpLoadRevocationLists
OslInitializeCodeIntegrity

I believe the issue starts before this, when SIPolicyDestroySystem is called from OslpProcessSIPolicy. This destroys the "Active State" in memory, which later causes a __fastfail in SIPolicyGetActiveState and ultimately results in a triple fault.
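The ordering problem described above can be illustrated with a toy model. This is only a sketch: the class, its members, and the `boot_faults` helper are hypothetical and do not reflect winload's actual data structures, and a C++ exception stands in for `__fastfail`, which in reality terminates immediately.

```cpp
#include <memory>
#include <stdexcept>
#include <string>

// Toy stand-in for the policy "Active State"; not winload's real layout.
struct ActiveState {
    std::string policy_name;
};

// Hypothetical model of the policy system, loosely named after the symbols above.
class PolicySystem {
public:
    PolicySystem() : active_(std::make_unique<ActiveState>(ActiveState{"default"})) {}

    // Models SIPolicyDestroySystem: tears down the active state.
    void destroy_system() { active_.reset(); }

    // Models SIPolicyGetActiveState: fails hard when the state is gone.
    // The exception is a stand-in for __fastfail.
    const ActiveState& get_active_state() const {
        if (!active_) throw std::logic_error("active state destroyed");
        return *active_;
    }

private:
    std::unique_ptr<ActiveState> active_;
};

// Runs the two steps in the order described and reports whether the fault path is hit.
bool boot_faults(bool skip_destroy) {
    PolicySystem system;
    if (!skip_destroy) system.destroy_system();  // the OslpProcessSIPolicy step
    try {
        system.get_active_state();               // the later OslpLoadRevocationLists step
        return false;
    } catch (const std::logic_error&) {
        return true;
    }
}
```

In this model, `boot_faults(false)` hits the fault path, while `boot_faults(true)` (skipping the destroy step) does not.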

A quick and easy fix to this is to just nop out SIPolicyDestroySystem. I've tested that and it works fine. I'm then able to debug with VirtualKD-Redux on Windows 11 24H2.
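As a rough sketch of what "nop out" means at the byte level, the helper below overwrites a byte range in an in-memory copy of an image with x86 NOP (0x90) instructions. The name `nop_range` is hypothetical, and the offset and length of the real call site in winload.efi are not given in this thread, so they are left to the caller. It assumes the patch target is a call instruction (a near `call rel32` is 5 bytes) and not the function body itself.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr std::uint8_t kNop = 0x90;  // x86/x64 single-byte NOP

// Overwrites [offset, offset + length) with NOPs.
// Returns false and leaves the image untouched if the range is out of bounds.
bool nop_range(std::vector<std::uint8_t>& image, std::size_t offset, std::size_t length) {
    if (offset > image.size() || length > image.size() - offset) return false;
    for (std::size_t i = 0; i < length; ++i) image[offset + i] = kNop;
    return true;
}
```

The bounds check is written as `length > image.size() - offset` so that a large `offset + length` cannot overflow and slip past it.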

I'd rather not make any modifications to winload as those can be brittle or make things unpredictable when winload changes. At the moment, I'm hoping that Microsoft discovers and fixes this issue by realizing that there is a bug when you use a custom kd transport and go down this code path during code integrity initialization.

There may be another workaround as well: perhaps boot options can be added or modified to keep winload off the bad code path. That would be ideal, as it doesn't require patching winload.

If Microsoft doesn't fix this before the Windows 11 24H2 release, then I'll just patch winload from vminstall to address the bug.
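Since hard-coded offsets break whenever winload changes, one way an installer-side patch could be made less brittle is to locate the patch site by byte signature and refuse to patch unless the signature matches exactly once. The sketch below is an assumption about how that might look, not how vminstall works today; `find_unique` and `PatternByte` are hypothetical names, and no real winload signature is given here.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// One pattern element: a concrete byte, or std::nullopt as a wildcard.
using PatternByte = std::optional<std::uint8_t>;

// Returns the offset of the pattern only if it occurs exactly once in the image.
// Zero matches or multiple matches both yield std::nullopt.
std::optional<std::size_t> find_unique(const std::vector<std::uint8_t>& image,
                                       const std::vector<PatternByte>& pattern) {
    if (pattern.empty() || pattern.size() > image.size()) return std::nullopt;
    std::optional<std::size_t> found;
    for (std::size_t i = 0; i + pattern.size() <= image.size(); ++i) {
        bool match = true;
        for (std::size_t j = 0; j < pattern.size(); ++j) {
            if (pattern[j] && *pattern[j] != image[i + j]) {
                match = false;
                break;
            }
        }
        if (!match) continue;
        if (found) return std::nullopt;  // ambiguous: refuse to guess
        found = i;
    }
    return found;
}
```

Wildcards would cover bytes expected to vary between builds, such as the relative displacement of a call. Treating an ambiguous match as a failure means an unexpected winload build is left unpatched instead of being patched in the wrong place.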