jonomango / hv

Lightweight Intel VT-x Hypervisor.
MIT License

win10 19044(21h2) BSOD #7

Closed yourapple closed 1 year ago

yourapple commented 1 year ago

As the title says.

jonomango commented 1 year ago

Right... is there any more information you can provide to help me fix this issue?

yourapple commented 1 year ago

I didn't test it carefully; I was only trying it out. I have tested 19041 and it works fine.

yourapple commented 1 year ago

When I disable EPT, everything works fine:

proc_based_ctrl2.enable_ept = 0;

I suspect something is misconfigured in the EPT setup.

jonomango commented 1 year ago

Just curious, how much RAM do you have? Perhaps EPT isn’t covering it all?
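To make the coverage question concrete, here is a minimal sketch of the arithmetic, assuming an identity-mapped EPT built from 2 MiB pages where each page directory maps 1 GiB. The names `kBytesPerPd`, `ept_covered_bytes`, and `ept_covers` are hypothetical and are not taken from the hv source.

```cpp
#include <cstdint>

// Assumption: each EPT page directory holds 512 entries of 2 MiB,
// so one page directory identity-maps exactly 1 GiB.
constexpr std::uint64_t kBytesPerPd = 512ull * (2ull << 20);

// Total number of guest-physical bytes covered by `pd_count` page directories.
constexpr std::uint64_t ept_covered_bytes(std::uint64_t pd_count) {
  return pd_count * kBytesPerPd;
}

// True if the highest physical address in use falls inside the mapped range.
constexpr bool ept_covers(std::uint64_t pd_count,
                          std::uint64_t highest_phys_addr) {
  return highest_phys_addr < ept_covered_bytes(pd_count);
}
```

Note that the highest physical address is usually larger than the amount of installed RAM, because part of the RAM is typically remapped above the MMIO hole below 4 GiB. A machine with 32 GiB of RAM therefore needs somewhat more than 32 page directories under these assumptions.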

jonomango commented 1 year ago

Also, try removing everything after this line when testing, if you haven’t already.

yourapple commented 1 year ago

In VMware, the VM has 2 GB of memory. The physical machine has 32 GB. The result is the same on both: a blue screen.

yourapple commented 1 year ago

With EPT turned off it still blue-screens, but only after running for a long time. With EPT enabled, the blue screen is immediate. The following is the dump with EPT turned off:

BugCheck 80, {4f4454, 0, 0, 0}

Probably caused by : memory_corruption

Followup: memory_corruption

nt!DbgBreakPointWithStatus:
fffff800`5b00d050 cc int 3
1: kd> !analyze -v

NMI_HARDWARE_FAILURE (80)
This is typically due to a hardware malfunction. The hardware supplier should be called.
Arguments:
Arg1: 00000000004f4454
Arg2: 0000000000000000
Arg3: 0000000000000000
Arg4: 0000000000000000

Debugging Details:

DEFAULT_BUCKET_ID: CODE_CORRUPTION

BUGCHECK_STR: 0x80

PROCESS_NAME: System

CURRENT_IRQL: f

LAST_CONTROL_TRANSFER: from fffff8005b120b12 to fffff8005b00d050

STACK_TEXT:
ffffc280b31623a8 fffff8005b120b12 : 0000000000000023 fffffc87d602cbdf 0000000000000000 fffff8005b0bedc1 : nt!DbgBreakPointWithStatus
ffffc280b31623b0 fffff8005b1200f6 : 0000000000000003 ffffc280b3162510 0000000000000000 0000000000000000 : nt!KiBugCheckDebugBreak+0x12
ffffc280b3162410 fffff8005b0052b7 : ffff8903e67ce000 ffff8903e8829000 0000000000000001 ffff8903e8829720 : nt!KeBugCheck2+0x946
ffffc280b3162b20 fffff8005b0c243a : 0000000000000080 00000000004f4454 0000000000000000 0000000000000000 : nt!KeBugCheckEx+0x107
ffffc280b3162b60 fffff8005aaa15b0 : 0000000000000000 ffff8903e8829748 fffff8005b85e720 ffff8903e8829748 : nt!HalBugCheckSystem+0x7a
ffffc280b3162ba0 fffff8005b1c412e : 0000000000000000 ffffc280b3162c49 ffff8903e8829748 fffff8005b85e720 : PSHED!PshedBugCheckSystem+0x10
ffffc280b3162bd0 fffff8005b0c6af2 : 0000000000000010 0000000000000010 fffff8005b85e720 000000000000005c : nt!WheaReportHwError+0x46e
ffffc280b3162cb0 fffff8005b11b882 : 0000000000000001 ffffc280b3162d30 0000000000000000 fffff8005b12e130 : nt!HalHandleNMI+0x142
ffffc280b3162ce0 fffff8005b010882 : 0000000000000001 ffffc280b3162ef0 0000000000000000 0000000000000000 : nt!KiProcessNMI+0x132
ffffc280b3162d30 fffff8005b010652 : 0000000000000001 0000000000000000 0000000000000000 0000000000000000 : nt!KxNmiInterrupt+0x82
ffffc280b3162e70 fffff8005af9bc6d : 00000001027601ef ffffc280b2fea180 0000000000000000 00000106c0ce201e : nt!KiNmiInterruptStart+0x212
fffff90e1c62f070 fffff8005ae26346 : 0000000000000146 0000000000000000 0000000000000000 0000000000000000 : nt!PpmIdleGuestExecute+0x1d
fffff90e1c62f0b0 fffff8005ae25104 : 0000000000000000 00001f8000000000 0000000000000003 0000000000000002 : nt!PpmIdleExecuteTransition+0x10c6
fffff90e1c62f4b0 fffff8005b008cd4 : 0000000000000000 ffffc280b2ff5140 ffff8903e6a5a040 0000000000000311 : nt!PoIdle+0x374
fffff90e1c62f620 0000000000000000 : fffff90e1c630000 fffff90e1c629000 0000000000000000 0000000000000000 : nt!KiIdleLoop+0x54

MODULE_NAME: memory_corruption

IMAGE_NAME: memory_corruption

FOLLOWUP_NAME: memory_corruption

DEBUG_FLR_IMAGE_TIMESTAMP: 0

MEMORY_CORRUPTOR: LARGE

FAILURE_BUCKET_ID: X64_MEMORY_CORRUPTION_LARGE

BUCKET_ID: X64_MEMORY_CORRUPTION_LARGE

yourapple commented 1 year ago

// 3.24.6.2
ia32_vmx_procbased_ctls2_register proc_based_ctrl2;
proc_based_ctrl2.flags = 0;
proc_based_ctrl2.enable_ept = 1;
//proc_based_ctrl2.enable_rdtscp = 1;
//proc_based_ctrl2.enable_vpid = 1;
//proc_based_ctrl2.enable_invpcid = 1;
//proc_based_ctrl2.enable_xsaves = 1;
//proc_based_ctrl2.enable_user_wait_pause = 1;
//proc_based_ctrl2.conceal_vmx_from_pt = 1;
//proc_based_ctrl2.pt_uses_guest_physical_addresses = 1;

After this modification, the system hangs. The WinDbg output stops here:

[hv] Driver loaded.
[hv] Allocated 4 VCPUs (0x168000 bytes).
[hv] System EPROCESS = 0xFFFF8903E6078080.
[hv] EPROCESS::UniqueProcessId offset = 0x440.
[hv] System CR3 = 0x1AD000.
[hv] Mapped all of physical memory to address 0x7F8000000000.
[hv] Cached VCPU data.
[hv] Enabled VMX operation.
[hv] Entered VMX operation.
[hv] Loaded VMCS pointer.
[hv] Initialized external structures.
[hv] Wrote VMCS fields.

jonomango commented 1 year ago

That is because you have those lines commented out. There are no exit handlers set up for those instructions.
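For context, the Intel SDM states that when the "enable RDTSCP", "enable INVPCID", or "enable XSAVES/XRSTORS" control is 0, the corresponding instruction raises #UD in the guest, which a guest that relies on those instructions will not survive. Whether a given secondary control may be set at all is reported by the IA32_VMX_PROCBASED_CTLS2 capability MSR. Below is a minimal sketch of the usual adjustment against that MSR. The function name `adjust_controls` and the bit constants are illustrative, not taken from the hv source, and the bit positions are quoted from memory of the SDM's secondary-controls table, so verify them before relying on them.

```cpp
#include <cstdint>

// Illustrative bit positions in the secondary processor-based controls
// (assumption: check against the SDM before use).
constexpr std::uint32_t kEnableEpt     = 1u << 1;
constexpr std::uint32_t kEnableRdtscp  = 1u << 3;
constexpr std::uint32_t kEnableInvpcid = 1u << 12;
constexpr std::uint32_t kEnableXsaves  = 1u << 20;
constexpr std::uint32_t kPtUsesGpa     = 1u << 24;

// Clamp a desired control value to what the capability MSR allows.
// Low 32 bits of the MSR: bits that must be 1 (allowed 0-settings).
// High 32 bits of the MSR: bits that may be 1 (allowed 1-settings).
constexpr std::uint32_t adjust_controls(std::uint32_t desired,
                                        std::uint64_t capability_msr) {
  return (desired | static_cast<std::uint32_t>(capability_msr)) &
         static_cast<std::uint32_t>(capability_msr >> 32);
}
```

With this adjustment, a control the CPU (or a nested environment such as VMware) does not support is silently dropped instead of causing a VM-entry failure. Comparing the adjusted value against the desired one shows which features were unavailable.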

yourapple commented 1 year ago

ia32_vmx_procbased_ctls2_register proc_based_ctrl2;
proc_based_ctrl2.flags = 0;
proc_based_ctrl2.enable_ept = 1;
proc_based_ctrl2.enable_rdtscp = 1;
proc_based_ctrl2.enable_vpid = 1;
proc_based_ctrl2.enable_invpcid = 1;
proc_based_ctrl2.enable_xsaves = 1;
proc_based_ctrl2.enable_user_wait_pause = 1;
proc_based_ctrl2.conceal_vmx_from_pt = 1;
//proc_based_ctrl2.pt_uses_guest_physical_addresses = 1;

After this modification, there is no blue screen when EPT is turned on. I will leave it running for a while to see whether it blue-screens later.

But why does it only work without pt_uses_guest_physical_addresses?

jonomango commented 1 year ago

Is there a blue screen when you don't comment that line out as well? If so, then that is very strange.

yourapple commented 1 year ago

Yes. Once proc_based_ctrl2.pt_uses_guest_physical_addresses = 1, the blue screen always occurs.

yourapple commented 1 year ago

It still blue-screens, just after a while. The bug check code is still 0x80.

jonomango commented 1 year ago

Interesting. I’ll look into this more.

yourapple commented 1 year ago

It seems to be related to NMI handling.

STACK_TEXT:
ffffc280b348d3a8 fffff8005b120b12 : 0000000000000023 fffffc87d65c3bdf 0000000000000000 fffff8005b0bedc1 : nt!DbgBreakPointWithStatus
ffffc280b348d3b0 fffff8005b1200f6 : 0000000000000003 ffffc280b348d510 0000000000000000 0000000000000000 : nt!KiBugCheckDebugBreak+0x12
ffffc280b348d410 fffff8005b0052b7 : ffff8903e67ce000 ffff8903e8829000 0000000000000001 ffff8903e8829720 : nt!KeBugCheck2+0x946
ffffc280b348db20 fffff8005b0c243a : 0000000000000080 00000000004f4454 0000000000000000 0000000000000000 : nt!KeBugCheckEx+0x107
ffffc280b348db60 fffff8005aaa15b0 : 0000000000000000 ffff8903e8829748 fffff8005b85e720 ffff8903e8829748 : nt!HalBugCheckSystem+0x7a
ffffc280b348dba0 fffff8005b1c412e : 0000000000000000 ffffc280b348dc49 ffff8903e8829748 fffff8005b85e720 : PSHED!PshedBugCheckSystem+0x10
ffffc280b348dbd0 fffff8005b0c6af2 : 0000000000000010 0000000000000010 fffff8005b85e720 000000000000005c : nt!WheaReportHwError+0x46e
ffffc280b348dcb0 fffff8005b11b882 : 0000000000000001 ffffc280b348dd30 0000000000000000 fffff8005b12e130 : nt!HalHandleNMI+0x142
ffffc280b348dce0 fffff8005b010882 : 0000000000000001 ffffc280b348def0 0000000000000000 0000000000000000 : nt!KiProcessNMI+0x132
ffffc280b348dd30 fffff8005b010652 : 0000000000000001 0000000000000000 0000000000000000 0000000000000000 : nt!KxNmiInterrupt+0x82
ffffc280b348de70 fffff8005af9bc6d : 00000000eb348cbe ffffc280b33d0180 0000000000000000 000000f0142e13f5 : nt!KiNmiInterruptStart+0x212
fffff90e1c64f070 fffff8005ae26346 : 00000000000002e7 0000000000000000 0000000000000000 0000000000000000 : nt!PpmIdleGuestExecute+0x1d
fffff90e1c64f0b0 fffff8005ae25104 : 0000000000000001 00001f8000000000 0000000000000003 0000000000000002 : nt!PpmIdleExecuteTransition+0x10c6
fffff90e1c64f4b0 fffff8005b008cd4 : 0000000000000000 ffffc280b33db140 ffff8903ec2f5080 00000000000002b0 : nt!PoIdle+0x374
fffff90e1c64f620 0000000000000000 : fffff90e1c650000 fffff90e1c649000 0000000000000000 0000000000000000 : nt!KiIdleLoop+0x54
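The stack shows an NMI arriving while the processor is idle (nt!PpmIdleGuestExecute into nt!KiNmiInterruptStart) and Windows then reporting it as a hardware error. When a hypervisor forwards NMIs to the guest, it does so by writing the VM-entry interruption-information field. A minimal sketch of how that field is encoded for an NMI follows; the helper names are hypothetical and this is not claimed to be how hv handles NMIs or what any fix in the project changes.

```cpp
#include <cstdint>

// Layout of the VM-entry interruption-information field (Intel SDM):
//   bits 7:0   vector
//   bits 10:8  interruption type (2 = NMI)
//   bit  11    deliver error code
//   bit  31    valid
constexpr std::uint8_t kNmiVector = 2;
constexpr std::uint8_t kInterruptionTypeNmi = 2;

constexpr std::uint32_t make_interruption_info(std::uint8_t vector,
                                               std::uint8_t type,
                                               bool deliver_error_code) {
  return static_cast<std::uint32_t>(vector) |
         (static_cast<std::uint32_t>(type & 0x7u) << 8) |
         (deliver_error_code ? (1u << 11) : 0u) |
         (1u << 31);
}

// Value to write when injecting a single NMI on the next VM entry.
constexpr std::uint32_t nmi_injection_info() {
  return make_interruption_info(kNmiVector, kInterruptionTypeNmi, false);
}
```

A hypervisor that injects an NMI the guest did not expect, or injects one more often than it actually occurred, could plausibly produce a bug check like the one above, since Windows cannot attribute the NMI to any source.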

yourapple commented 1 year ago

ok. thank you

jonomango commented 1 year ago

Hi, I just pushed https://github.com/jonomango/hv/commit/20ef8451d3bd8575c05e75d6f8e535fedd88cfa2 which might be related to your issue. See if that fixes the BSOD. Thank you.

jonomango commented 1 year ago

Stale.