Shadowairing closed this issue 1 year ago
Hard to say. What's your CPU? How many cores? One way of debugging would be to load hv.sys
with something like OSR Loader (not manually mapped), wait for the blue screen, then send the MEMORY.DMP file that Windows writes out.
My CPU is an i9-11980HK with 16 logical processors. I signed and loaded the driver normally.
*******************************************************************************
* *
* Bugcheck Analysis *
* *
*******************************************************************************
CLOCK_WATCHDOG_TIMEOUT (101)
An expected clock interrupt was not received on a secondary processor in an
MP system within the allocated interval. This indicates that the specified
processor is hung and not processing interrupts.
Arguments:
Arg1: 000000000000000c, Clock interrupt time out interval in nominal clock ticks.
Arg2: 0000000000000000, 0.
Arg3: fffff800529bd180, The PRCB address of the hung processor.
Arg4: 0000000000000000, The index of the hung processor.
Debugging Details:
------------------
KEY_VALUES_STRING: 1
Key : Analysis.CPU.mSec
Value: 3452
Key : Analysis.DebugAnalysisManager
Value: Create
Key : Analysis.Elapsed.mSec
Value: 18590
Key : Analysis.Init.CPU.mSec
Value: 2140
Key : Analysis.Init.Elapsed.mSec
Value: 27430
Key : Analysis.Memory.CommitPeak.Mb
Value: 115
Key : WER.OS.Branch
Value: vb_release
Key : WER.OS.Timestamp
Value: 2019-12-06T14:06:00Z
Key : WER.OS.Version
Value: 10.0.19041.1
FILE_IN_CAB: 061723-8765-01.dmp
BUGCHECK_CODE: 101
BUGCHECK_P1: c
BUGCHECK_P2: 0
BUGCHECK_P3: fffff800529bd180
BUGCHECK_P4: 0
FAULTING_PROCESSOR: 0
BLACKBOXBSD: 1 (!blackboxbsd)
BLACKBOXNTFS: 1 (!blackboxntfs)
BLACKBOXPNP: 1 (!blackboxpnp)
BLACKBOXWINLOGON: 1
CUSTOMER_CRASH_COUNT: 1
PROCESS_NAME: svchost.exe
STACK_TEXT:
ffffad81`afc85c88 fffff800`56c3ad32 : 00000000`00000101 00000000`0000000c 00000000`00000000 fffff800`529bd180 : nt!KeBugCheckEx
ffffad81`afc85c90 fffff800`56a7541d : 00000000`00000000 ffffad81`afc33180 00000000`00000246 00000000`00001329 : nt!KeAccumulateTicks+0x1c8b32
ffffad81`afc85cf0 fffff800`56a759c1 : 00000000`00001100 00000000`00000b6b ffffad81`afc33180 00000000`00000001 : nt!KiUpdateRunTime+0x5d
ffffad81`afc85d40 fffff800`56a6f833 : ffffad81`afc33180 00000000`00000000 fffff800`574319d8 00000000`00000000 : nt!KiUpdateTime+0x4a1
ffffad81`afc85e80 fffff800`56a781f2 : ffffac85`499ae7f0 ffffac85`499ae870 ffffac85`499ae800 00000000`0000000c : nt!KeClockInterruptNotify+0x2e3
ffffad81`afc85f30 fffff800`56b27f55 : 00000000`2dc56ad8 ffffd803`159235a0 ffffd803`15923650 00000000`00000000 : nt!HalpTimerClockInterrupt+0xe2
ffffad81`afc85f60 fffff800`56bf78ea : ffffac85`499ae870 ffffd803`159235a0 00000000`00000001 00000000`00000000 : nt!KiCallInterruptServiceRoutine+0xa5
ffffad81`afc85fb0 fffff800`56bf7e57 : 00000000`0b47773d ffffad81`afc33180 00000000`00000002 ffffad81`afc36130 : nt!KiInterruptSubDispatchNoLockNoEtw+0xfa
ffffac85`499ae7f0 fffff800`56a93680 : 00000000`00000000 00000000`00000000 00000000`00000002 ffffd803`1b810000 : nt!KiInterruptDispatchNoLockNoEtw+0x37
ffffac85`499ae980 fffff800`56a93498 : 00000000`00000000 fffff97c`00000000 ffffd803`4690d400 fffff800`56a2f08a : nt!KeFlushMultipleRangeTb+0x160
ffffac85`499aea20 fffff800`56abc25e : ffffa800`2fa79e00 8100000f`e28a0921 00000000`00000000 00000000`00000004 : nt!MiFlushTbList+0x88
ffffac85`499aea50 00000000`00000000 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : nt!MmSetAddressRangeModifiedEx+0x2ae
SYMBOL_NAME: nt!KeAccumulateTicks+1c8b32
MODULE_NAME: nt
IMAGE_NAME: ntkrnlmp.exe
IMAGE_VERSION: 10.0.19041.928
STACK_COMMAND: .cxr; .ecxr ; kb
BUCKET_ID_FUNC_OFFSET: 1c8b32
FAILURE_BUCKET_ID: CLOCK_WATCHDOG_TIMEOUT_INVALID_CONTEXT_nt!KeAccumulateTicks
OS_VERSION: 10.0.19041.1
BUILDLAB_STR: vb_release
OSPLATFORM_TYPE: x64
OSNAME: Windows 10
FAILURE_ID_HASH: {95498f51-33a9-903b-59e5-d236937d8ecf}
Followup: MachineOwner
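For anyone reading the dump above, here is a minimal sketch of how the four bugcheck parameters can be interpreted, based only on the argument descriptions WinDbg printed. The struct and function names (`ClockWatchdogInfo`, `decode_clock_watchdog`, `describe`) are made up for illustration and are not part of hv or any Windows API.

```cpp
#include <cstdint>
#include <string>

// Bugcheck code reported in the analysis (BUGCHECK_CODE: 101, hexadecimal).
constexpr std::uint32_t kClockWatchdogTimeout = 0x101;

// Hypothetical container for the four arguments, named after the
// descriptions WinDbg printed next to Arg1..Arg4.
struct ClockWatchdogInfo {
  std::uint64_t timeout_ticks;   // Arg1: timeout interval in nominal clock ticks
  std::uint64_t reserved;        // Arg2: reported as 0
  std::uint64_t hung_prcb;       // Arg3: PRCB address of the hung processor
  std::uint64_t hung_cpu_index;  // Arg4: index of the hung processor
};

// Fills `out` and returns true only if `code` is CLOCK_WATCHDOG_TIMEOUT.
inline bool decode_clock_watchdog(std::uint32_t code,
                                  const std::uint64_t (&args)[4],
                                  ClockWatchdogInfo& out) {
  if (code != kClockWatchdogTimeout) return false;
  out = ClockWatchdogInfo{args[0], args[1], args[2], args[3]};
  return true;
}

// One-line summary of which processor stopped servicing clock interrupts.
inline std::string describe(const ClockWatchdogInfo& info) {
  return "processor " + std::to_string(info.hung_cpu_index) +
         " missed its clock interrupt for " +
         std::to_string(info.timeout_ticks) + " ticks";
}
```

With the values from this dump (`0xc`, `0`, `0xfffff800529bd180`, `0`), `describe` reports that processor 0 was the one that hung, for 12 ticks. The call stack shown belongs to the processor that noticed the timeout, not necessarily to the hung one.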
I'm really not sure... The fact that it bluescreens occasionally even without the hypervisor loaded, and the fact that it works fine on your other physical machine, leads me to believe that it is probably something else that is the cause of this. Maybe a faulty driver? Again, I'm not sure.
Although, if we do assume that hv is causing the blue screen, then I would probably think that it's something to do with the TSC hiding code.
I fixed it: I reduced the RAM and it eventually worked fine.
The computer needs to have less than 64GB of RAM.
When I increased the RAM of the previously working physical machine to 64GB, the same BSOD (CLOCK_WATCHDOG_TIMEOUT) happened.
Ah okay. Maybe increase this?
I increased that and the problem is solved now. Thank you so much.
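One plausible reading of why the RAM size mattered, sketched as arithmetic. This assumes the EPT identity map is built from 2 MiB large pages, so that each EPT page directory (512 entries) spans 1 GiB of guest-physical address space. That layout is an assumption, not something confirmed in this thread, and the helper names below are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: 2 MiB large pages, 512 entries per page directory.
constexpr std::uint64_t kEntriesPerTable = 512;
constexpr std::uint64_t kLargePageSize   = 2ull << 20;  // 2 MiB
constexpr std::uint64_t kGiB             = 1ull << 30;

// Bytes of guest-physical address space mapped by `pd_count` page directories.
constexpr std::uint64_t ept_covered_bytes(std::uint64_t pd_count) {
  return pd_count * kEntriesPerTable * kLargePageSize;
}

// True if the physical address `highest_pa` falls inside the mapped range.
constexpr bool ept_covers(std::uint64_t pd_count, std::uint64_t highest_pa) {
  return highest_pa < ept_covered_bytes(pd_count);
}

// Smallest page-directory count whose mapping reaches `highest_pa`.
inline std::uint64_t min_pd_count(std::uint64_t highest_pa) {
  const std::uint64_t count = highest_pa / kGiB + 1;
  assert(ept_covers(count, highest_pa));
  return count;
}

static_assert(ept_covered_bytes(1) == kGiB, "one page directory spans 1 GiB");
static_assert(ept_covered_bytes(512) == 512 * kGiB, "512 directories span 512 GiB");
```

Under these assumptions, a count of 64 (an illustrative value, picked because the thread puts the threshold at 64GB) maps exactly 64 GiB. A machine with 64GB installed usually has some RAM relocated above the 64 GiB mark to make room for MMIO, so its highest physical address would fall outside the map, while `ept_pd_count = 512` covers up to 512 GiB.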
I did not make any code changes except for setting ept_pd_count = 512.
The code works fine on a virtual machine and on one physical machine.
On another physical machine, however, calling "hv::for_each_cpu([]() {hv::test();});" from um.exe causes a BSOD.
Both machines run the same Windows version, 20H2.
I think there might be an issue with my CPU, because CLOCK_WATCHDOG_TIMEOUT also happened twice when hv was not loaded.
Could hv have executed some instructions that made the problem recur, or are there bugs in hv? I have no idea.