4d61726b / VirtualKD-Redux

VirtualKD-Redux - A revival and modernization of VirtualKD
GNU Lesser General Public License v2.1

Unable to Interact with Guest (VMware 17.5.0, Win11 23H2) #65

Closed calware closed 2 weeks ago

calware commented 3 weeks ago

Possibly related to #63, apologies if this issue is directly addressed in your comments on that thread.

Describe the bug: Following all of the steps in the official tutorial yields a situation where vmmon64.exe connects to the guest, but attempts to break into the guest via a debugger are never honored, and no functionality enabled by vmmon64.exe works.

To Reproduce Steps to reproduce the behavior:

  1. Boot up Windows 11 guest (version 23H2, build 22631.2861)
  2. Transfer target64 to the desktop
  3. Run vminstall.exe
  4. Install with stock settings (additionally selecting the option to patch winload doesn't affect the outcome)
  5. Restart
  6. During the restart, launch vmmon64.exe on the host (Windows 11, version 23H2, build 22631.3593)
    *************************************************************************************
    *VirtualKD-Redux patcher DLL successfully loaded. Patching the GuestRPC mechanism...*
    *************************************************************************************
    Searching patch database for information about current executable...
    No information found.
    Analyzing VMWARE-VMX executable...
    Building list of EXE sections... 28898K of data found.
    Scanning for RPC command name strings...
    Finished scanning. Found 57 strings.
    Searching for string references...
    Found 32 string references.
    Found 3 structures resemblant to RPC dispatcher table.
    (address 00007FF669817E58, matched pointers: 1)
    (address 00007FF669817E70, matched pointers: 1)
    (address 00007FF66A043AF0, matched pointers: 30)
    Analyzing potential RPC dispatcher tables...
    Potential RPC table analysis complete. Found 1 candidates.
    (address 00007FF66A042A60, entries: 115, free entries: 39)
    Using RPC dispatcher table at 0x7FF66A042A60 (115 entries)
    Waiting for RPC table to be initialized by VMWare...
    RPC table initialized. Patching it...
    Successfully patched entry #1
    VMWare reset monitor activated...
  7. Launch a WinDBG/KD instance in a wait state before the guest boots
  8. Launch VM, ensure custom boot entry is the selected option, F8, launch with Driver Signature Enforcement disabled
  9. VM launches just like normal, no faults or bugchecks, boots into guest account
  10. Debugger never attaches to the guest, and attempts to break into the guest manually (via vmmon64's "Instant Break" option) are not honored by the VM. No other debugging instance attaches, and manually attempting to attach to the pipe by name does not attach. Lastly, attempting to unpatch and re-patch the process does not work.

Expected behavior: Debugger attaches to the target and is able to break into the guest to execute commands.

Additional context: I made two different attempts at launching VirtualKD on the target machine, one using the option to patch winload and one without it. Neither had any effect on my ability to interact with the guest through a debugger.

I figured I'd open this issue, as it seemed relatively unrelated to #63's opener, and nothing notable could be found in the closed issues list when searching for version numbers specific to the host/guest/VMware. Additionally, recent news articles claim Microsoft is delaying the rollout of 24H2 due to internal conflicts. The last reply to #63 on the topic of the new Windows 11 boot limitations was over six weeks ago, and I was hoping to write an article related to VirtualKD for my company's website, so I was curious if we're in a waiting pattern to see what happens with the new update, or if this issue I've raised was previously unknown.

Thanks a ton for maintaining this awesome project. VirtualKD-Redux has had the best user experience of any Windows kernel debugging tools I've put to use in my work, and I know there are countless others in my boat that feel the same way about your work. Lots of love your way from my neck of the woods!

4d61726b commented 3 weeks ago

Thank you for the detailed bug report and kind words!

I've been trying to reproduce this issue locally but have not yet had any luck. I tried with both VirtualBox and VMware Workstation. Based on your report, I originally thought maybe it was a Windows update that was introduced but even with the latest updates on both host and guest, everything still works successfully.

I'd like to try a couple things, to hopefully isolate whether it's an issue with the guest, the host, or both.

Would you be willing to create a brand new VM with an older version of Windows, such as Windows 10? Specifically, one that is also unpatched and has never had any internet connection. This would prevent the guest from downloading any updates.

Once the OS is installed, perform the steps with VKD-Redux to get everything set up. Does it work with the older OS?

calware commented 2 weeks ago

Hey thank you for the quick reply. I really do appreciate the help with getting things running.

TLDR: I got things working by discovering an issue within the boot config that was entirely my fault.

I can verify that a stock Windows 10 install does work fine with the current setup. I extended the working model to include all of the settings in the Windows 11 build, and installed the full set of Windows 10 updates + VMware tools. The extended build also functioned well with VirtualKD-Redux, and didn't appear to have any issues.

I went ahead and created another Windows 11 install which was nearly identical to the initial one, but with the exception that it was completely detached from the internet from start to finish--or, as much as Windows 11 would allow, that is. The result was successful, and everything seemed to operate as it should.

Upon comparing the initial Windows 11 machine with the new Windows 11 machine, I discovered an issue that was created long ago within the settings of my Windows 11 machine. When this machine was originally created, I was testing various means of debugging the kernel, and discovered that I could use KDNET with greater performance than I had in the past. Even though I had disabled KDNET before installing VirtualKD-Redux, KDNET still had the debugtype variable of the dbgsettings boot configuration set to NET rather than Local; the latter of which is the default setting that allows VirtualKD-Redux to function.

My apologies for this oversight in the original bug report, and my sincerest apologies for any time spent trying to resolve this edge case. We could close this issue now if you'd like. I could additionally submit a PR with this debugtype field set to Local within the VirtualKD-Redux boot configuration if you think that'd be something worth including in the project.

Regardless, thank you again for all your work on this project. I hope this finds you well and off to a happy holiday weekend.

4d61726b commented 2 weeks ago

> My apologies for this oversight in the original bug report, and my sincerest apologies for any time spent trying to resolve this edge case.

It's no problem at all! Glad to hear that everything is working now.

> Even though I had disabled KDNET before installing VirtualKD-Redux, KDNET still had the debugtype variable of the dbgsettings boot configuration set to NET rather than Local; the latter of which is the default setting that allows VirtualKD-Redux to function.

This one has me a bit puzzled for three reasons:

  1. It needs to be Serial and not Local for VKD-Redux to work. Local is one of the few types that will not work. I did a sanity check on this and Local did not work on my box.
  2. VKD-Redux should automatically set its entry to Serial on a successful run of vminstall.exe. This logic can be seen here. I did another sanity check by setting it to NET and after running vminstall.exe, it reset it back to Serial.
  3. Technically NET should work as long as VKD-Redux's entry is properly selected during boot and driver signature enforcement is disabled. VKD-Redux's kdcom.dll will still get invoked allowing for kernel debugging. I did one more sanity check and made sure this was also the case. I was able to successfully kernel debug with VKD-Redux even when the boot configuration was set to NET with an IP address and port.

If you would be willing to share the series of bcdedit commands that I can run on a fresh Windows install that would later result in vminstall.exe not being able to properly set up the guest debug environment, I would love to have them so I can fix vminstall.exe.

> I was curious if we're in a waiting pattern to see what happens with the new update, or if this issue I've raised was previously unknown.

That's exactly right -- Right now, it's a "wait and see" situation with 24H2. Ideally, between now and the 24H2 release, Microsoft will fix the issue so we don't have to patch winload. If Microsoft does not fix the issue in the next few months, then I will officially release an update so that VKD-Redux works on 24H2.

I will close this ticket for now, but I will still monitor it if you have any more questions. Thanks!

calware commented 2 weeks ago

The steps to reproduce my situation are as follows:

  1. Provision the machine using the below commands, install hypervisor vendor tools, and disable firewall.
    bcdedit -set TESTSIGNING ON
    bcdedit -set HYPERVISORLAUNCHTYPE OFF
    bcdedit -set ISOLATEDCONTEXT OFF
  2. Install KDNET as described in the official installation guide.

The below describes the boot configuration after KDNET has been successfully installed and configured for network-based debugging.

Windows Boot Manager
--------------------
identifier              {bootmgr}
device                  partition=\Device\HarddiskVolume1
path                    \EFI\Microsoft\Boot\bootmgfw.efi
description             Windows Boot Manager
locale                  en-US
inherit                 {globalsettings}
default                 {current}
resumeobject            {c31aebc6-d612-11ee-8a5b-c50f92a13b0f}
displayorder            {current}
toolsdisplayorder       {memdiag}
timeout                 30

Windows Boot Loader
-------------------
identifier              {current}
device                  partition=C:
path                    \Windows\system32\winload.efi
description             Windows 11
locale                  en-US
inherit                 {bootloadersettings}
recoverysequence        {c31aebc8-d612-11ee-8a5b-c50f92a13b0f}
displaymessageoverride  Recovery
recoveryenabled         Yes
testsigning             Yes
isolatedcontext         No
allowedinmemorysettings 0x15000075
osdevice                partition=C:
systemroot              \Windows
resumeobject            {c31aebc6-d612-11ee-8a5b-c50f92a13b0f}
nx                      OptIn
bootmenupolicy          Standard
debug                   Yes

C:\Windows\System32>bcdedit /dbgsettings
busparams               3.0.0
key                     16620v4r4t45e.3e2rk00bc8nt9.85m2c5ekwf7v.38k7m47lqaxt7
debugtype               NET
hostip                  192.168.114.1
port                    50001
dhcp                    Yes
The operation completed successfully.
  3. Disable debugging temporarily via the command bcdedit /debug off, restart, and install VirtualKD-Redux with default settings.

The below describes the boot configuration after restarting the guest, selecting the VirtualKD-Redux boot entry, and disabling DSE.

Windows Boot Manager
--------------------
identifier              {bootmgr}
device                  partition=\Device\HarddiskVolume1
path                    \EFI\Microsoft\Boot\bootmgfw.efi
description             Windows Boot Manager
locale                  en-US
inherit                 {globalsettings}
default                 {current}
resumeobject            {c31aebc6-d612-11ee-8a5b-c50f92a13b0f}
displayorder            {c31aebc7-d612-11ee-8a5b-c50f92a13b0f}
                        {current}
toolsdisplayorder       {memdiag}
timeout                 30

Windows Boot Loader
-------------------
identifier              {c31aebc7-d612-11ee-8a5b-c50f92a13b0f}
device                  partition=C:
path                    \Windows\system32\winload.efi
description             Windows 11
locale                  en-US
inherit                 {bootloadersettings}
recoverysequence        {c31aebc8-d612-11ee-8a5b-c50f92a13b0f}
displaymessageoverride  Recovery
recoveryenabled         Yes
testsigning             Yes
isolatedcontext         No
allowedinmemorysettings 0x15000075
osdevice                partition=C:
systemroot              \Windows
resumeobject            {c31aebc6-d612-11ee-8a5b-c50f92a13b0f}
nx                      OptIn
bootmenupolicy          Standard
debug                   No

Windows Boot Loader
-------------------
identifier              {current}
device                  partition=C:
path                    \Windows\system32\winload.efi
description             Disable Signature Enforcement Manually!!! (Press F8) [VKD-Redux]
locale                  en-US
inherit                 {bootloadersettings}
recoverysequence        {c31aebc8-d612-11ee-8a5b-c50f92a13b0f}
debugtype               Serial
displaymessageoverride  Recovery
recoveryenabled         No
nointegritychecks       Yes
testsigning             Yes
isolatedcontext         No
allowedinmemorysettings 0x15000075
osdevice                partition=C:
systemroot              \Windows
dbgtransport            kdcom.dll
resumeobject            {c31aebc6-d612-11ee-8a5b-c50f92a13b0f}
nx                      OptIn
bootmenupolicy          Legacy
custom:26000027         Yes
debug                   Yes

C:\Windows\System32>bcdedit /dbgsettings
busparams               3.0.0
key                     16620v4r4t45e.3e2rk00bc8nt9.85m2c5ekwf7v.38k7m47lqaxt7
debugtype               NET
hostip                  192.168.114.1
port                    50001
dhcp                    Yes
The operation completed successfully.

Here is a diff link so you can see the changes a little easier if it interests you.
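
For readers who cannot follow the diff link, the comparison can be approximated with a short script. The sketch below is in Python, which is not used anywhere in the original thread; `changed_settings` is a hypothetical helper name, and the two dictionaries are abbreviated copies of the original Windows 11 entry and the VKD-Redux entry from the second listing above.

```python
# Abbreviated copies of two boot loader entries from the second listing above.
ORIGINAL_ENTRY = {
    "description": "Windows 11",
    "recoveryenabled": "Yes",
    "testsigning": "Yes",
    "bootmenupolicy": "Standard",
    "debug": "No",
}
VKD_REDUX_ENTRY = {
    "description": "Disable Signature Enforcement Manually!!! (Press F8) [VKD-Redux]",
    "debugtype": "Serial",
    "recoveryenabled": "No",
    "nointegritychecks": "Yes",
    "testsigning": "Yes",
    "dbgtransport": "kdcom.dll",
    "bootmenupolicy": "Legacy",
    "custom:26000027": "Yes",
    "debug": "Yes",
}


def changed_settings(before, after):
    """Return {name: (old, new)} for every setting that differs.

    None stands for a setting that is absent from one of the entries.
    """
    names = sorted(set(before) | set(after))
    return {
        name: (before.get(name), after.get(name))
        for name in names
        if before.get(name) != after.get(name)
    }


if __name__ == "__main__":
    diff = changed_settings(ORIGINAL_ENTRY, VKD_REDUX_ENTRY)
    for name, (old, new) in diff.items():
        print(f"{name}: {old} -> {new}")
```

A per-entry comparison like this does not cover the global debugger settings, which are identical in both bcdedit /dbgsettings outputs above.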

After booting into the machine and not receiving a break into the debugger facilitated by VirtualKD-Redux, you can execute the below command to fix the issue created by the previous KDNET debugger configuration.

bcdedit -deletevalue {dbgsettings} busparams
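
Before running the command above, it may help to confirm that a leftover busparams value is actually present. The following is a minimal Python sketch (Python is an assumption here; the thread itself only uses bcdedit) that parses saved bcdedit /dbgsettings output. `parse_dbgsettings` and `has_stale_busparams` are hypothetical helper names, and the sample text is copied from the output shown earlier in this comment.

```python
# Output copied from the thread; a raw string keeps the backslashes literal.
SAMPLE_OUTPUT = r"""
C:\Windows\System32>bcdedit /dbgsettings
busparams               3.0.0
key                     16620v4r4t45e.3e2rk00bc8nt9.85m2c5ekwf7v.38k7m47lqaxt7
debugtype               NET
hostip                  192.168.114.1
port                    50001
dhcp                    Yes
The operation completed successfully.
"""


def parse_dbgsettings(text):
    """Parse `bcdedit /dbgsettings` output into a name -> value mapping."""
    settings = {}
    for line in text.splitlines():
        parts = line.split(None, 1)
        # Keep only "name value" rows whose name is one lowercase word.
        # This skips the prompt line and the trailing status message.
        if len(parts) == 2 and parts[0].isalpha() and parts[0].islower():
            settings[parts[0]] = parts[1].strip()
    return settings


def has_stale_busparams(settings):
    """Report whether a busparams value is still set in the global settings."""
    return "busparams" in settings


if __name__ == "__main__":
    parsed = parse_dbgsettings(SAMPLE_OUTPUT)
    print("debugtype:", parsed.get("debugtype"))
    print("busparams present:", has_stale_busparams(parsed))
```

The check only inspects text that has already been captured; it does not invoke bcdedit or modify the boot configuration.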

I'm not sure if this helps, but I do hope you may be able to get something out of it if you end up pursuing the matter any further.

> That's exactly right -- Right now, it's a "wait and see" situation with 24H2. Ideally, between now and the 24H2 release, Microsoft will fix the issue so we don't have to patch winload. If Microsoft does not fix the issue in the next few months, then I will officially release an update so that VKD-Redux works on 24H2.

That's very kind of you! Thank you for being so generous with your time :) We appreciate you!

4d61726b commented 2 weeks ago

Perfect, I greatly appreciate the writeup as that helped me reproduce and understand the problem.

The issue was with the "Debugger Bus Parameters" that you mentioned. I went ahead and added a fix so that VKD-Redux overrides them when its boot entry is selected. This should result in the KDNET settings staying intact unless VKD-Redux is manually selected at boot.

I went ahead and made this fix to 2024.1. Thanks again for all the information!