Velocidex / WinPmem

The multi-platform memory acquisition tool.
Apache License 2.0

BSOD (SYSTEM_THREAD_EXCEPTION_NOT_HANDLED) on windows 10 #47

Open xalicex opened 2 years ago

xalicex commented 2 years ago

Hello,

I get a BSOD immediately after starting a memory dump on my test machine (Windows 10.0.19044). The BSOD error is SYSTEM_THREAD_EXCEPTION_NOT_HANDLED.

This is new. My last tests were in March and I didn't have any issues. I am using the latest release (4.0 RC2).

Are you aware of this issue? It seems to be related to a Windows 10 update.

Thank you :)

scudette commented 2 years ago

Are you able to get a crash dump?

vivianezw commented 2 years ago

I ran several tests on multiple machines and builds, including a bare-metal 22H2 machine with core isolation and 16 GB of RAM, as well as VMs, and it just worked. I could not reproduce this.

At least the !analyze -v output would be appreciated.

yuvaleldad commented 2 years ago

Hi, I'm not related to xalicex, but I am also experiencing this BSOD when trying to load the winpmem driver.

0: kd> !analyze -v
*******************************************************************************
*                                                                             *
*                        Bugcheck Analysis                                    *
*                                                                             *
*******************************************************************************

SYSTEM_THREAD_EXCEPTION_NOT_HANDLED (7e)
This is a very common BugCheck.  Usually the exception address pinpoints
the driver/function that caused the problem.  Always note this address
as well as the link date of the driver/image that contains this address.
Arguments:
Arg1: ffffffffc0000005, The exception code that was not handled
Arg2: fffff80437e215a6, The address that the exception occurred at
Arg3: ffffd507bcb1ef68, Exception Record Address
Arg4: ffffd507bcb1e7a0, Context Record Address

Debugging Details:
------------------

KEY_VALUES_STRING: 1

    Key  : AV.Fault
    Value: Read

    Key  : Analysis.CPU.mSec
    Value: 874

    Key  : Analysis.DebugAnalysisManager
    Value: Create

    Key  : Analysis.Elapsed.mSec
    Value: 6340

    Key  : Analysis.IO.Other.Mb
    Value: 0

    Key  : Analysis.IO.Read.Mb
    Value: 0

    Key  : Analysis.IO.Write.Mb
    Value: 0

    Key  : Analysis.Init.CPU.mSec
    Value: 359

    Key  : Analysis.Init.Elapsed.mSec
    Value: 11328

    Key  : Analysis.Memory.CommitPeak.Mb
    Value: 100

    Key  : Bugcheck.Code.DumpHeader
    Value: 0x7e

    Key  : Bugcheck.Code.KiBugCheckData
    Value: 0x7e

    Key  : Bugcheck.Code.Register
    Value: 0x7e

    Key  : WER.OS.Branch
    Value: vb_release

    Key  : WER.OS.Timestamp
    Value: 2019-12-06T14:06:00Z

    Key  : WER.OS.Version
    Value: 10.0.19041.1

FILE_IN_CAB:  MEMORY - Copy.DMP

VIRTUAL_MACHINE:  VMware

BUGCHECK_CODE:  7e

BUGCHECK_P1: ffffffffc0000005

BUGCHECK_P2: fffff80437e215a6

BUGCHECK_P3: ffffd507bcb1ef68

BUGCHECK_P4: ffffd507bcb1e7a0

EXCEPTION_RECORD:  ffffd507bcb1ef68 -- (.exr 0xffffd507bcb1ef68)
ExceptionAddress: fffff80437e215a6 (winpmem_x64+0x00000000000015a6)
   ExceptionCode: c0000005 (Access violation)
  ExceptionFlags: 00000000
NumberParameters: 2
   Parameter[0]: 0000000000000000
   Parameter[1]: 0000000000000d68
Attempt to read from address 0000000000000d68

CONTEXT:  ffffd507bcb1e7a0 -- (.cxr 0xffffd507bcb1e7a0)
rax=0000000000000000 rbx=0000000000000d68 rcx=9f4ade56a5b90000
rdx=0000000000000001 rsi=ffffd681206ab000 rdi=ffffa70ababdef90
rip=fffff80437e215a6 rsp=ffffd507bcb1f1a0 rbp=00000000000001ad
 r8=0000000000000008  r9=0000000000000065 r10=fffff80420e0edb0
r11=ffffd507bcb1f198 r12=ffff8283ca88b4f0 r13=ffffffff800039bc
r14=ffffa70ababdefe0 r15=000ffffffffff000
iopl=0         nv up ei ng nz na pe nc
cs=0010  ss=0018  ds=002b  es=002b  fs=0053  gs=002b             efl=00050282
winpmem_x64+0x15a6:
fffff804`37e215a6 f60301          test    byte ptr [rbx],1 ds:002b:00000000`00000d68=??
Resetting default scope

BLACKBOXBSD: 1 (!blackboxbsd)

BLACKBOXNTFS: 1 (!blackboxntfs)

BLACKBOXWINLOGON: 1

PROCESS_NAME:  System

READ_ADDRESS:  0000000000000d68 

ERROR_CODE: (NTSTATUS) 0xc0000005 - The instruction at 0x%p referenced memory at 0x%p. The memory could not be %s.

EXCEPTION_CODE_STR:  c0000005

EXCEPTION_PARAMETER1:  0000000000000000

EXCEPTION_PARAMETER2:  0000000000000d68

EXCEPTION_STR:  0xc0000005

STACK_TEXT:  
ffffd507`bcb1f1a0 fffff804`37e21950     : ffffa70a`babdef90 00000000`00000000 00000000`00000000 00000000`00000000 : winpmem_x64+0x15a6
ffffd507`bcb1f1e0 fffff804`37e2e1e4     : 00000000`00000000 ffffa70a`bbdaa750 00000000`00000000 00000000`00000000 : winpmem_x64+0x1950
ffffd507`bcb1f210 fffff804`37e21119     : 00000000`00000000 ffffa70a`bd7ec000 ffffa70a`baae4d70 ffffa70a`bbda9700 : winpmem_x64+0xe1e4
ffffd507`bcb1f290 fffff804`21363a4c     : ffffa70a`bd7ec000 ffffd507`bcb1f420 ffffa70a`bd1eac10 00000000`00000000 : winpmem_x64+0x1119
ffffd507`bcb1f2c0 fffff804`2132f1dd     : 00000000`0000000a 00000000`00000000 00000000`00000000 00000000`00001000 : nt!PnpCallDriverEntry+0x4c
ffffd507`bcb1f320 fffff804`213744e7     : 00000000`00000000 00000000`00000000 fffff804`21925440 00000000`00000000 : nt!IopLoadDriver+0x4e5
ffffd507`bcb1f4f0 fffff804`20e52b65     : ffffa70a`00000000 ffffffff`800039bc ffffa70a`b8d9f040 00000000`00000000 : nt!IopLoadUnloadDriver+0x57
ffffd507`bcb1f530 fffff804`20e71d25     : ffffa70a`b8d9f040 00000000`00000080 ffffa70a`b28a0040 001fa47f`b19bbdff : nt!ExpWorkerThread+0x105
ffffd507`bcb1f5d0 fffff804`21000778     : fffff804`1cd37180 ffffa70a`b8d9f040 fffff804`20e71cd0 d5f61afa`9bd7c613 : nt!PspSystemThreadStartup+0x55
ffffd507`bcb1f620 00000000`00000000     : ffffd507`bcb20000 ffffd507`bcb19000 00000000`00000000 00000000`00000000 : nt!KiStartSystemThread+0x28

SYMBOL_NAME:  winpmem_x64+15a6

MODULE_NAME: winpmem_x64

IMAGE_NAME:  winpmem_x64.sys

STACK_COMMAND:  .cxr 0xffffd507bcb1e7a0 ; kb

BUCKET_ID_FUNC_OFFSET:  15a6

FAILURE_BUCKET_ID:  AV_winpmem_x64!unknown_function

OS_VERSION:  10.0.19041.1

BUILDLAB_STR:  vb_release

OSPLATFORM_TYPE:  x64

OSNAME:  Windows 10

FAILURE_ID_HASH:  {7ae0da30-2181-47c4-e98b-4dfd5f0f47d3}

Followup:     MachineOwner
---------

vivianezw commented 2 years ago

Great, a read at 0xd68. At least it should be easy to find the exact line where this happened. This is clearly a WinPmem fault.

In the meantime, feel free to try the new version 2.0.2 in the dev branch; many bugs have been fixed. You will probably need to turn test signing on. (Note: the two-year-old drivers in the 'binaries' folder must be replaced with the freshly compiled drivers before compiling the mini tool exe.)

vivianezw commented 2 years ago

You were not using the mini tool; you loaded the driver yourself. The crash happened right in the middle of the old (unsafe) PTE initialization routine, which, at least in the current master branch, is a bad place to crash.

It was at line https://github.com/Velocidex/WinPmem/blob/master/kernel/pte_mmap.c#L141 . Your CR3 read may have been wrong to begin with, although it is a mystery why. This old routine lacked all sanity checks (probably on the assumption that the PML4 taken from CR3 could never be zero; on bare metal it certainly is not). Knowing about all kinds of VM manipulation, I reworked it in the dev branch, and you would not crash there. I presume that 0xd68 was the PML4 index offset to add, and that the PML4 pointer (from the CR3 read) was null. (This is only an assumption.) You would have been able to tell, because you had DbgPrint on! The access through the resulting PML4E pointer (trying to check the present bit) is what crashed the machine. In my dev branch, this would have been prevented. This was exactly one of the major things I reworked.
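The null-PML4 guess can be cross-checked against the register dump with a little arithmetic. The following is a minimal sketch, not WinPmem code: the function names are hypothetical, and it assumes (the dump has no symbols, so this is only an assumption) that rsi=ffffd681206ab000 was the virtual address being walked. Under that assumption, the PML4 index of that address is 0x1ad, which matches rbp, and adding the entry offset to a null table base gives 0xd68, which matches rbx and the faulting read address. This is consistent with the guess but does not prove it.

```c
#include <stdint.h>

/* Each x64 paging table holds 512 entries of 8 bytes each. */
#define ENTRIES_PER_TABLE 512u
#define ENTRY_SIZE        8u

/* Bits 47..39 of a virtual address select the PML4 entry.
 * (Hypothetical helper, not taken from WinPmem.) */
static unsigned pml4_index(uint64_t virtual_address)
{
    return (unsigned)((virtual_address >> 39) & (ENTRIES_PER_TABLE - 1));
}

/* Address of the PML4 entry, given the mapped base of the PML4 table.
 * With a null base, the result is just the byte offset of the entry,
 * which is a tiny, unmapped address. */
static uint64_t pml4e_address(uint64_t pml4_base, uint64_t virtual_address)
{
    return pml4_base + (uint64_t)pml4_index(virtual_address) * ENTRY_SIZE;
}
```

Calling `pml4_index(0xffffd681206ab000ULL)` and `pml4e_address(0, 0xffffd681206ab000ULL)` reproduces the two values from the dump.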

Here are the sanity checks now: https://github.com/Velocidex/WinPmem/blob/f964172738fc6b8949a463412b0c601e3d33b580/kernel/pte_mmap.c#L182 For safety, every tier step is now checked thoroughly and handled gracefully, and the PTE method is disabled in DriverEntry on error. (The method needs to be set to true, and this only happens if all tier steps return a success status in DriverEntry.) If some weird VM hypervisor decides to manipulate things, then WinPmem can recognize it and avoid walking into a trap.
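To illustrate the idea of checking every tier step, here is a minimal user-mode sketch. It is not the dev-branch code: all names (`read_entry_fn`, `next_table`, `walk_is_sane`, `toy_reader`) are hypothetical, the physical-memory reader is a stand-in callback, large pages are ignored for brevity, and treating a zero table base as invalid is an assumption made for this example.

```c
#include <stdbool.h>
#include <stdint.h>

#define PRESENT_BIT  0x1ull
#define ADDRESS_MASK 0x000ffffffffff000ull  /* bits 51..12 of an entry */

/* Hypothetical callback: reads one 8-byte table entry at a physical
 * address. Returns false if the read cannot be performed. */
typedef bool (*read_entry_fn)(uint64_t physical_address, uint64_t *entry_out);

/* One tier step: reject a null table base, read the entry, and require
 * the present bit before handing back the next table's base. */
static bool next_table(read_entry_fn read_entry, uint64_t table_base,
                       unsigned index, uint64_t *next_base_out)
{
    uint64_t entry = 0;

    if (table_base == 0) return false;  /* e.g. a CR3 read that yielded zero */
    if (!read_entry(table_base + (uint64_t)index * 8, &entry)) return false;
    if ((entry & PRESENT_BIT) == 0) return false;

    *next_base_out = entry & ADDRESS_MASK;
    return true;
}

/* Walks PML4 -> PDPT -> PD -> PT for one virtual address.
 * Returns true only if all four tier steps succeed. */
static bool walk_is_sane(read_entry_fn read_entry, uint64_t cr3, uint64_t va)
{
    uint64_t base = cr3 & ADDRESS_MASK;

    for (int shift = 39; shift >= 12; shift -= 9) {
        unsigned index = (unsigned)((va >> shift) & 0x1ff);
        if (!next_table(read_entry, base, index, &base)) return false;
    }
    return true;
}

/* Toy reader for demonstration only: every entry is present and
 * points at the table at 0x1000. */
static bool toy_reader(uint64_t physical_address, uint64_t *entry_out)
{
    (void)physical_address;
    *entry_out = 0x1000ull | PRESENT_BIT;
    return true;
}
```

A driver-style initialization could then enable the PTE method only when something like `walk_is_sane(reader, cr3, test_address)` returns true, and fall back to another acquisition method otherwise, instead of dereferencing an unchecked pointer.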

Definitely try the dev branch, and try to look at the DbgPrint output. I would like to know whether the CR3 read really returned a zero PML4. This is just an assumption, because I am missing the DbgPrint output, but it would be amazing. The testing app in 'testing' should suffice; with the method set to PTE, it will do one read. Don't forget to turn on the DbgPrint output (you can use dbgview.exe, run as admin, with verbose kernel DbgPrint output enabled). Maybe VMware does some voodoo when drivers read CR3. This is my best guess.