microsoft / WSL

Issues found on WSL
https://docs.microsoft.com/windows/wsl
MIT License

Deleting large amount of data (400 GB) causes blue screen with KERNEL_SECURITY_CHECK_FAILURE in vmbkmclr!InCompletePacket #7335

Open surban opened 3 years ago

surban commented 3 years ago

Windows Build Number

Microsoft Windows [Version 10.0.22000.160]

WSL Version

Kernel Version

5.10.43.3-microsoft-standard-WSL2

Distro Version

Ubuntu 20.04

Other Software

No response

Repro Steps

  1. Create 400 GB of data in WSL2. (Average file size was 100 MB)
  2. rm -rf the directory containing the data.
  3. Crash.

This was on a system with a fast NVMe SSD.
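
For anyone who wants to script the repro steps above, here is a minimal sketch, not the reporter's actual procedure. The function names, the file naming scheme and the use of random data are assumptions made for illustration. The defaults are scaled down to a few MiB; the report involved roughly 400 GB at about 100 MB per file, and running at that scale inside WSL2 may crash the host as described.

```python
import os
import shutil
import tempfile


def create_test_data(root, file_count, file_size, chunk_size=1024 * 1024):
    """Write file_count files of file_size bytes under root; return total bytes written."""
    os.makedirs(root, exist_ok=True)
    total = 0
    for index in range(file_count):
        path = os.path.join(root, f"blob_{index:05d}.bin")  # hypothetical naming scheme
        remaining = file_size
        with open(path, "wb") as handle:
            while remaining > 0:
                step = min(chunk_size, remaining)
                # Random bytes rather than zeros (an assumption, in case zero-filled
                # data is handled differently by the storage stack).
                handle.write(os.urandom(step))
                remaining -= step
        total += file_size
    return total


def remove_test_data(root):
    """Remove the whole tree, like `rm -rf` on the directory."""
    shutil.rmtree(root)


if __name__ == "__main__":
    # Scaled-down demo: 4 files of 1 MiB each in a temporary directory.
    base = tempfile.mkdtemp(prefix="wsl7335_")
    data_dir = os.path.join(base, "data")
    written = create_test_data(data_dir, file_count=4, file_size=1024 * 1024)
    print(f"wrote {written} bytes to {data_dir}")
    remove_test_data(data_dir)
    shutil.rmtree(base)
```

Raising `file_count` and `file_size` toward the reported values is what would approximate the original scenario; only do that on a machine where a blue screen and possible VHD corruption are acceptable.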

Expected Behavior

No crash.

Actual Behavior

System bluescreen.

Diagnostic Logs

WinDbg crash analysis:

crash.txt

benhillis commented 3 years ago

This sounds familiar, I've started a thread with the storage folks (a peer team of mine).

surban commented 2 years ago

This is still happening with build 22504.1000.

*******************************************************************************
*                                                                             *
*                        Bugcheck Analysis                                    *
*                                                                             *
*******************************************************************************

SYSTEM_THREAD_EXCEPTION_NOT_HANDLED (7e)
This is a very common BugCheck.  Usually the exception address pinpoints
the driver/function that caused the problem.  Always note this address
as well as the link date of the driver/image that contains this address.
Arguments:
Arg1: ffffffffc0000005, The exception code that was not handled
Arg2: fffff80334162cb3, The address that the exception occurred at
Arg3: ffffa88cfa47f208, Exception Record Address
Arg4: ffffa88cfa47ea20, Context Record Address

Debugging Details:
------------------

Page ffc9f0 not present in the dump file. Type ".hh dbgerr004" for details

KEY_VALUES_STRING: 1

    Key  : AV.Dereference
    Value: String

    Key  : AV.Fault
    Value: Read

    Key  : Analysis.CPU.mSec
    Value: 1671

    Key  : Analysis.DebugAnalysisManager
    Value: Create

    Key  : Analysis.Elapsed.mSec
    Value: 5961

    Key  : Analysis.Init.CPU.mSec
    Value: 530

    Key  : Analysis.Init.Elapsed.mSec
    Value: 35710

    Key  : Analysis.Memory.CommitPeak.Mb
    Value: 105

    Key  : WER.OS.Branch
    Value: rs_prerelease

    Key  : WER.OS.Timestamp
    Value: 2021-11-12T16:50:00Z

    Key  : WER.OS.Version
    Value: 10.0.22504.1000

FILE_IN_CAB:  MEMORY.DMP

TAG_NOT_DEFINED_202b:  *** Unknown TAG in analysis list 202b

DUMP_FILE_ATTRIBUTES: 0x1800

BUGCHECK_CODE:  7e

BUGCHECK_P1: ffffffffc0000005

BUGCHECK_P2: fffff80334162cb3

BUGCHECK_P3: ffffa88cfa47f208

BUGCHECK_P4: ffffa88cfa47ea20

EXCEPTION_RECORD:  ffffa88cfa47f208 -- (.exr 0xffffa88cfa47f208)
ExceptionAddress: fffff80334162cb3 (vmbkmclr!InCompletePacket+0x00000000000004a3)
   ExceptionCode: c0000005 (Access violation)
  ExceptionFlags: 00000000
NumberParameters: 2
   Parameter[0]: 0000000000000000
   Parameter[1]: 0000004763736564
Attempt to read from address 0000004763736564

CONTEXT:  ffffa88cfa47ea20 -- (.cxr 0xffffa88cfa47ea20)
rax=0000000041f85ed3 rbx=ffffd20d72dfa000 rcx=0000007f63736564
rdx=ffffd20d72dfa0a0 rsi=0000000000000000 rdi=0000004763736564
rip=fffff80334162cb3 rsp=ffffa88cfa47f440 rbp=0000000000000040
 r8=ffffd20d72dfa3bc  r9=0000000000000040 r10=fffff803341627e0
r11=ffff857c84e00000 r12=ffffd20d72dfa3bc r13=0000000000000004
r14=ffffd20d7cc3b060 r15=ffffd20d72dfa610
iopl=0         nv up ei pl nz na po nc
cs=0010  ss=0018  ds=002b  es=002b  fs=0053  gs=002b             efl=00050206
vmbkmclr!InCompletePacket+0x4a3:
fffff803`34162cb3 488b0f          mov     rcx,qword ptr [rdi] ds:002b:00000047`63736564=????????????????
Resetting default scope

BLACKBOXBSD: 1 (!blackboxbsd)

BLACKBOXNTFS: 1 (!blackboxntfs)

BLACKBOXPNP: 1 (!blackboxpnp)

BLACKBOXWINLOGON: 1

PROCESS_NAME:  System

READ_ADDRESS:  0000004763736564 

ERROR_CODE: (NTSTATUS) 0xc0000005 - The instruction at 0x%p referenced memory at 0x%p. The memory could not be %s.

EXCEPTION_CODE_STR:  c0000005

EXCEPTION_PARAMETER1:  0000000000000000

EXCEPTION_PARAMETER2:  0000004763736564

EXCEPTION_STR:  0xc0000005

STACK_TEXT:  
ffffa88c`fa47f440 fffff803`341627f5     : 00000000`00000004 ffffd20d`72dfa5b0 ffffd20d`72dfa3bc ffffa88c`00000040 : vmbkmclr!InCompletePacket+0x4a3
ffffa88c`fa47f4f0 fffff803`4599262f     : 00000000`00000000 00000000`236e40a6 ffffd20d`7c90ac80 fffff803`2d126d5b : vmbkmclr!VmbChannelPacketComplete+0x15
ffffa88c`fa47f520 fffff803`45992309     : ffffa88c`00000001 00000000`00000000 00000000`00000000 00000000`00000018 : storvsp!VspRequestComplete+0x30f
ffffa88c`fa47f620 fffff803`34dc133e     : ffffa88c`fa47f6b8 00000000`00000004 ffffd20d`7dc2b5d0 ffffa88c`fa47f729 : storvsp!VstorCompleteScsiRequest+0x9
ffffa88c`fa47f650 fffff803`32e46a33     : ffffd20d`72dfa490 ffffa88c`fa47f729 ffffd20d`72dfa420 ffffd20d`7e446080 : vhdparser!NVhdIoParserEndIo+0x7e
ffffa88c`fa47f680 fffff803`32e46535     : ffffd20d`7e404300 ffffd20d`7c818500 ffffa88c`fa47f8f8 00000000`00000000 : vhdmp!VhdmpiCompleteParserRequest+0x4e3
ffffa88c`fa47f790 fffff803`32e4648d     : ffffd20d`7e4038c0 ffffd20d`72dfa680 ffffd20d`7e4043d0 ffffd20d`7c818570 : vhdmp!VhdmpiDecrementIoRefCountSrbExtension+0x25
ffffa88c`fa47f7c0 fffff803`32e46386     : ffffd20d`7e4038c0 00000000`00000000 ffffa88c`fa47f8f8 ffffd20d`7e4038c0 : vhdmp!VhdmpiSrbPartContinueComplete+0xad
ffffa88c`fa47f800 fffff803`32e4c2dd     : ffffa88c`fa47f8f8 ffffa88c`fa47f930 ffffd20d`7e4038c0 ffffd20d`72dfa6f0 : vhdmp!VhdmpiVhd2SrbRangeComplete+0x96
ffffa88c`fa47f840 fffff803`32e4b919     : ffffa88c`fa47f8f8 ffffd20d`7e4038c0 ffffd20d`7e4038c0 ffffd20d`7c772040 : vhdmp!Vhd2iAeCompletionTrampoline+0xd
ffffa88c`fa47f870 fffff803`32e4b7ac     : ffffd20d`7e4038c0 00000000`00000000 ffffd20d`7c772180 00000000`00000000 : vhdmp!AeProcessTodo+0x79
ffffa88c`fa47f8c0 fffff803`32e4bc7a     : ffffd20d`7c772040 ffffd20d`4dcbec90 fffff803`2db49640 00000000`00000100 : vhdmp!AeiDelayedCompletionWorkerRoutine+0xbc
ffffa88c`fa47f950 fffff803`32e4d8da     : 00000000`00000100 fffff803`2db49640 ffffd20d`4dcbec90 fffff803`319e6328 : vhdmp!VhdmpiVhd2AeWorkItemCallback+0x2a
ffffa88c`fa47f990 fffff803`2d09992f     : ffffd20d`4dcbec90 ffffd20d`7c772000 ffffd20d`7d048100 ffffd20d`00000000 : vhdmp!VhdmpiAeWorkerRoutine+0x2a
ffffa88c`fa47f9c0 fffff803`2d00e035     : ffffd20d`7c772040 ffff9580`6ada6000 ffffd20d`7c772040 004fe067`b8bbbdff : nt!ExpWorkerThread+0x14f
ffffa88c`fa47fbb0 fffff803`2d22a464     : ffff9580`6ad97180 ffffd20d`7c772040 fffff803`2d00dfe0 ff242222`ff242222 : nt!PspSystemThreadStartup+0x55
ffffa88c`fa47fc00 00000000`00000000     : ffffa88c`fa480000 ffffa88c`fa479000 00000000`00000000 00000000`00000000 : nt!KiStartSystemThread+0x34

SYMBOL_NAME:  vmbkmclr!InCompletePacket+4a3

MODULE_NAME: vmbkmclr

IMAGE_NAME:  vmbkmclr.sys

STACK_COMMAND:  .cxr 0xffffa88cfa47ea20 ; kb

BUCKET_ID_FUNC_OFFSET:  4a3

FAILURE_BUCKET_ID:  AV_vmbkmclr!InCompletePacket

OS_VERSION:  10.0.22504.1000

BUILDLAB_STR:  rs_prerelease

OSPLATFORM_TYPE:  x64

OSNAME:  Windows 10

FAILURE_ID_HASH:  {46630530-b660-f529-786c-319458aefd15}

Followup:     MachineOwner
---------
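
The analysis above reports `AV.Dereference: String`, and the faulting address 0000004763736564 does look like text rather than a pointer. A small sketch that shows this by reading the address as little-endian bytes (the helper name is made up for illustration):

```python
def address_as_ascii(address, width=8):
    """Render an integer as little-endian bytes, showing printable ASCII and '.' otherwise."""
    raw = address.to_bytes(width, "little")
    return "".join(chr(b) if 0x20 <= b < 0x7F else "." for b in raw)


faulting_address = 0x0000004763736564  # READ_ADDRESS / rdi from the dump above
print(address_as_ascii(faulting_address))  # prints "descG..."
```

This only shows that the low bytes of the pointer are printable ASCII, which would be consistent with the pointer field having been overwritten by other data. It does not say where that data came from.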
DanPinGF commented 2 years ago

It appears you're experiencing two different stop codes: 0x139 for a critical data structure corruption and 0x7E for an unhandled system thread exception. It may help if you boot into Safe Mode.

surban commented 2 years ago

> It appears you're experiencing two different stop codes: 0x139 for a critical data structure corruption and 0x7E for an unhandled system thread exception. It may help if you boot into Safe Mode.

It's clearly not a driver problem but a bug in VHD handling. How is safe mode supposed to help?

DanPinGF commented 2 years ago

> It's clearly not a driver problem but a bug in VHD handling. How is safe mode supposed to help?

Safe Mode only loads a limited set of drivers and components, particularly Windows-default drivers, so no third-party driver will be loaded when booting in that Mode.

surban commented 2 years ago

> > It's clearly not a driver problem but a bug in VHD handling. How is safe mode supposed to help?
>
> Safe Mode only loads a limited set of drivers and components, particularly Windows-default drivers, so no third-party driver will be loaded when booting in that Mode.

Yes, I know that. Are you from Microsoft?

DanPinGF commented 2 years ago

No, but I do know a bit about these features. You could use it to enhance the troubleshooting process. But if that doesn't help, then I'd probably boot into WinRE and use the command prompt from there.

surban commented 2 years ago

This is still not fixed in build 22593.1.

*******************************************************************************
*                                                                             *
*                        Bugcheck Analysis                                    *
*                                                                             *
*******************************************************************************

SYSTEM_THREAD_EXCEPTION_NOT_HANDLED (7e)
This is a very common BugCheck.  Usually the exception address pinpoints
the driver/function that caused the problem.  Always note this address
as well as the link date of the driver/image that contains this address.
Arguments:
Arg1: ffffffffc0000005, The exception code that was not handled
Arg2: fffff8048663a57c, The address that the exception occurred at
Arg3: ffff808658167228, Exception Record Address
Arg4: ffff990169fc6900, Context Record Address

Debugging Details:
------------------

KEY_VALUES_STRING: 1

    Key  : AV.Fault
    Value: Read

    Key  : Analysis.CPU.mSec
    Value: 1327

    Key  : Analysis.DebugAnalysisManager
    Value: Create

    Key  : Analysis.Elapsed.mSec
    Value: 6070

    Key  : Analysis.Init.CPU.mSec
    Value: 186

    Key  : Analysis.Init.Elapsed.mSec
    Value: 29987

    Key  : Analysis.Memory.CommitPeak.Mb
    Value: 117

    Key  : WER.OS.Branch
    Value: ni_release

    Key  : WER.OS.Timestamp
    Value: 2022-04-02T11:00:00Z

    Key  : WER.OS.Version
    Value: 10.0.22593.1

FILE_IN_CAB:  MEMORY.DMP

TAG_NOT_DEFINED_202b:  *** Unknown TAG in analysis list 202b

DUMP_FILE_ATTRIBUTES: 0x1800

BUGCHECK_CODE:  7e

BUGCHECK_P1: ffffffffc0000005

BUGCHECK_P2: fffff8048663a57c

BUGCHECK_P3: ffff808658167228

BUGCHECK_P4: ffff990169fc6900

EXCEPTION_RECORD:  ffff808658167228 -- (.exr 0xffff808658167228)
ExceptionAddress: fffff8048663a57c (nt!RtlpHpSegPageRangeComputeLargePageCost+0x000000000000004c)
   ExceptionCode: c0000005 (Access violation)
  ExceptionFlags: 00000000
NumberParameters: 2
   Parameter[0]: 0000000000000000
   Parameter[1]: ffffffffffffffff
Attempt to read from address ffffffffffffffff

CONTEXT:  ffff990169fc6900 -- (.cxr 0xffff990169fc6900)
rax=66a788df7de66b16 rbx=0000000000000000 rcx=0000000000000000
rdx=0000000000000001 rsi=0000000000000001 rdi=00000000000007ff
rip=fffff8048663a57c rsp=ffff808658167460 rbp=ffffe708c8010140
 r8=0000000000004000  r9=0000000000000000 r10=66a788df7de66b16
r11=0000000000004000 r12=ffffe7091cc00dc0 r13=000000000000006c
r14=ffffe7091cc00040 r15=0000000000000002
iopl=0         nv up ei pl zr na po nc
cs=0010  ss=0018  ds=002b  es=002b  fs=0053  gs=002b             efl=00050246
nt!RtlpHpSegPageRangeComputeLargePageCost+0x4c:
fffff804`8663a57c 440fb700        movzx   r8d,word ptr [rax] ds:002b:66a788df`7de66b16=????
Resetting default scope

BLACKBOXBSD: 1 (!blackboxbsd)

BLACKBOXNTFS: 1 (!blackboxntfs)

BLACKBOXPNP: 1 (!blackboxpnp)

BLACKBOXWINLOGON: 1

PROCESS_NAME:  System

READ_ADDRESS:  ffffffffffffffff 

ERROR_CODE: (NTSTATUS) 0xc0000005 - The instruction at 0x%p referenced memory at 0x%p. The memory could not be %s.

EXCEPTION_CODE_STR:  c0000005

EXCEPTION_PARAMETER1:  0000000000000000

EXCEPTION_PARAMETER2:  ffffffffffffffff

EXCEPTION_STR:  0xc0000005

STACK_TEXT:  
ffff8086`58167460 fffff804`86634eea     : ffffe709`1cc00dc0 ffffe709`1cc00040 00000000`0000006c ffffffff`ffffffff : nt!RtlpHpSegPageRangeComputeLargePageCost+0x4c
ffff8086`58167470 fffff804`86635e6a     : 00000000`0000006b ffffe708`c8010140 00000000`00000001 00000000`00000000 : nt!RtlpHpSegFreeRangeInsert+0x6a
ffff8086`581674a0 fffff804`86635a5a     : 00000000`00000000 00000000`00000000 ffffe709`00000000 00000000`00000002 : nt!RtlpHpSegPageRangeAllocate+0x2ca
ffff8086`58167500 fffff804`8665ed0a     : 00000000`00000002 00000000`0006c000 ffffe709`18ad83b0 ffffffff`ffffffff : nt!RtlpHpSegAlloc+0x5a
ffff8086`58167560 fffff804`8665e3ef     : 00000000`00000e11 00000000`00000000 00000000`676c3256 fffff804`867bdc47 : nt!ExAllocateHeapPool+0x8ca
ffff8086`58167690 fffff804`86e9047d     : 00000000`00000040 00000000`00000001 00000000`00000600 ffffe709`00000000 : nt!ExpAllocatePoolWithTagFromNode+0x5f
ffff8086`581676e0 fffff805`127636b7     : 00000000`0006d000 ffffe709`0f045868 00000006`0155c000 00000000`00000000 : nt!ExAllocatePool2+0xdd
ffff8086`58167790 fffff805`12762ea6     : ffffe709`1d35d000 ffffe709`0f045868 00000000`00001000 fffff805`00000001 : vhdmp!AeSupportAllocatePages+0x2b
ffff8086`581677c0 fffff805`1276d12d     : ffffe708`f2fef8c0 ffffe708`1b9276a0 ffff8086`581678f8 00000000`0000007f : vhdmp!Vhd2iWriteLogEntry+0x56
ffff8086`58167840 fffff805`12766649     : ffff8086`581678f8 fffff804`866c96d7 00000000`00000000 ffffe708`f4ad9040 : vhdmp!Vhd2iAeCompletionTrampoline+0xd
ffff8086`58167870 fffff805`12763819     : ffffe708`f2fef8c0 ffffe708`f2fef8d0 00000000`00000000 ffffe708`d61feb50 : vhdmp!AeProcessTodo+0x79
ffff8086`581678c0 fffff805`1276cd0a     : ffffe708`f4ad9040 ffffe708`c7ebfca0 fffff804`87149ac0 00000000`00000000 : vhdmp!AeiDelayedCompletionWorkerRoutine+0xb9
ffff8086`58167950 fffff805`1276e74a     : 00000000`00000000 fffff804`87149ac0 ffffe708`c7ebfca0 fffff804`884b72f0 : vhdmp!VhdmpiVhd2AeWorkItemCallback+0x2a
ffff8086`58167990 fffff804`866ffa85     : ffffe708`c7ebfca0 ffffe708`00000000 ffffe708`00000000 fffff804`00000000 : vhdmp!VhdmpiAeWorkerRoutine+0x2a
ffff8086`581679c0 fffff804`86768687     : ffffe708`f4ad9040 fffff804`89398000 ffffe708`f4ad9040 004fe07f`b8bbbdff : nt!ExpWorkerThread+0x155
ffff8086`58167bb0 fffff804`86824ee4     : fffff804`8177b180 ffffe708`f4ad9040 fffff804`86768630 ff242222`ff242222 : nt!PspSystemThreadStartup+0x57
ffff8086`58167c00 00000000`00000000     : ffff8086`58168000 ffff8086`58161000 00000000`00000000 00000000`00000000 : nt!KiStartSystemThread+0x34

SYMBOL_NAME:  vhdmp!AeSupportAllocatePages+2b

MODULE_NAME: vhdmp

IMAGE_NAME:  vhdmp.sys

STACK_COMMAND:  .cxr 0xffff990169fc6900 ; kb

BUCKET_ID_FUNC_OFFSET:  2b

FAILURE_BUCKET_ID:  AV_vhdmp!AeSupportAllocatePages

OS_VERSION:  10.0.22593.1

BUILDLAB_STR:  ni_release

OSPLATFORM_TYPE:  x64

OSNAME:  Windows 10

FAILURE_ID_HASH:  {7930ece7-9840-eff2-1f65-52152d62c2c1}

Followup:     MachineOwner
---------
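
In this second dump the faulting instruction reads through rax=66a788df7de66b16, yet READ_ADDRESS is ffffffffffffffff. That pattern is what one would expect when the pointer is a non-canonical x64 address: the CPU raises a general-protection fault without a usable fault address, and the access violation is reported with -1. A rough check, assuming 48-bit virtual addressing (the helper name is made up for illustration):

```python
def is_canonical(address, virtual_bits=48):
    """Return True if bits 63 down to (virtual_bits - 1) are all zeros or all ones."""
    upper = address >> (virtual_bits - 1)
    all_ones = (1 << (64 - virtual_bits + 1)) - 1
    return upper == 0 or upper == all_ones


print(is_canonical(0x66A788DF7DE66B16))  # rax at the fault: False
print(is_canonical(0xFFFFF8048663A57C))  # rip, a normal kernel address: True
```

A non-canonical value in a heap bookkeeping path would fit the idea that the heap metadata was already corrupted before this allocation, but the dump alone does not prove that.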
surban commented 2 years ago

@benhillis Ping.

wrwg commented 2 years ago

This has happened to me at least three times in the last half year (always on the newest W11 Insider beta build), most recently a few days ago. One time it not only caused a BSOD but also corrupted the Ubuntu file system. Very annoying, because I had to reinstall and reconfigure the whole distro after that. Luckily, I had no unpushed commits.

I'm using Rust, and one way to almost surely hit it is cargo clean on a larger project :-/

Please take this seriously. It is so frustrating that I'm considering walking away from WSL and going back to running Ubuntu directly, or just ditching the PC and buying a Mac desktop.

benhillis commented 2 years ago

Thanks - I've followed up with the storage folks to see if this is a known issue.

benhillis commented 2 years ago

@surban - would it be possible to enable driver verifier and share the crashdump? Once you have the .dmp file you can send it to secure@microsoft.com and ask them to forward it to me.

verifier.exe /volatile /adddriver vhdmp.sys /flags 0x2b
<repro>
<share crashdump>
surban commented 2 years ago

Sure, I will do so when I am back at the system.

DanPinGF commented 2 years ago

> @surban - would it be possible to enable driver verifier and share the crashdump? Once you have the .dmp file you can send it to secure@microsoft.com and ask them to forward it to me.
>
> verifier.exe /volatile /adddriver vhdmp.sys /flags 0x2b
> <repro>
> <share crashdump>

Well, before doing that, I would highly suggest reading the documentation provided here: https://docs.microsoft.com/en-us/windows-hardware/drivers/devtest/driver-verifier

Quarky93 commented 2 years ago

I am also having this issue: a blue screen when removing a large amount of data. I'm on Windows 11.

wynnw commented 2 years ago

This has happened consistently for me the whole time I've used wsl2 (1 year+). I reproduce by just downloading db backups over a certain size. The blue screen happens either by deleting the backup, or deleting loaded database files. Happens to co-workers too. Should be really easy to reproduce - my system specs match the original report pretty closely.

I resized my wsl2 filesystem thinking that might prevent the problem, but it hasn't. I have plenty of free space and it doesn't matter.

surban commented 2 years ago

> @surban - would it be possible to enable driver verifier and share the crashdump? Once you have the .dmp file you can send it to secure@microsoft.com and ask them to forward it to me.
>
> verifier.exe /volatile /adddriver vhdmp.sys /flags 0x2b
> <repro>
> <share crashdump>

C:\>verifier.exe /volatile /adddriver vhdmp.sys /flags 0x2b
The specified flags 0x0000002b are not supported in volatile mode.
Run "verifier /?" for command line assistance. See the "/dif /now" syntax for
enabling most flags without rebooting.

@benhillis Driver verifier does not accept the flags.

surban commented 2 years ago

@benhillis

I've reproduced the crash using Driver Verifier with the following command line.

verifier.exe /flags 0x2b /driver vhdmp.sys

I've sent the crash dump to secure@microsoft.com.

Quarky93 commented 2 years ago

Any update on this? Basically makes WSL unusable for me.

Thernn88 commented 2 years ago

I get this error as well when deleting large amounts of data.

Let me know if there is something I can add.

surban commented 2 years ago

> I get this error as well when deleting large amounts of data.
>
> Let me know if there is something I can add.

Can you check on Insider Build 25136? I haven't encountered this problem for a while anymore on recent builds, but perhaps this is just luck.

blackcon commented 2 years ago

Does the crash occur only on SSDs? I couldn't reproduce the crash :( (I'm using an HDD)

Thernn88 commented 2 years ago

> > I get this error as well when deleting large amounts of data. Let me know if there is something I can add.
>
> Can you check on Insider Build 25136? I haven't encountered this problem for a while anymore on recent builds, but perhaps this is just luck.

I don't use the insider builds because this is a production environment with looming deadlines. I only want to upgrade if it's certain to fix a bug. Post-deadline, sure, I'll give the 25xxx insider builds a whirl and see if I can't break it like a twig.

I've encountered this both on the current public release and on 22621.169. I can replicate it on multiple computers using clean builds. Using explorer.exe . and/or console commands to delete ~12 SQLite files (100GB total) in sequence inside WSL will cause a BSOD. If I space small files in between the SQLite files then no blue screen occurs. Deleting many large files of different filetype does not cause BSOD.

A BSOD finally corrupted my WSL image this morning. Thankfully, I make frequent backups and have cloned the environment to other computers.

lx30011 commented 2 years ago

Had this happen too when deleting a 400 GB directory. During deletion (rm -rf) my PC blue screened with PAGE_FAULT_IN_NONPAGED_AREA. After rebooting I ran rm -rf on the directory again which completed this time, but it blue screened shortly after with KERNEL_MODE_HEAP_CORRUPTION. After rebooting I found my Debian corrupted. I have both .dmp files from C:\Windows\Minidump and a MEMORY.dmp from C:\Windows (same creation time as the second crash) which I could provide if they're of use.

Kangaxx-0 commented 2 years ago

I experienced the same issue when deleting 30 GB of files with rm -rf. I had to move my Rust dev environment entirely to an Azure VM. This is really annoying.

tony commented 2 years ago

This bit me randomly just now. I was using du to clean up junk, and Rust took up a ton of space.

I did rm -rf ~/projects/rustlang to just knock it out: blue screen. The build artifacts seem to take up quite a bit of space over time, e.g. if you have a larger open source Rust project checked out.

Vorta commented 2 years ago

This happens to me every time I have to re-import a large database (100+ GB). The only solution I have is to drop tables one-by-one with a few seconds in between drops. I'm forced to do that because my WSL2 virtual disk got corrupted after a BSOD caused by deleting large files.
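
The same "one at a time, with a pause" idea can be applied to plain files. Below is a minimal sketch; the function name and the default pause are made up for illustration, and nobody in this thread has confirmed that pacing deletions reliably avoids the crash.

```python
import os
import time


def delete_files_throttled(root, pause_seconds=2.0, sleep=time.sleep):
    """Delete everything under root one file at a time, pausing after each file.

    Returns the number of files deleted. `sleep` is injectable so the pause
    can be stubbed out when testing.
    """
    deleted = 0
    # Walk bottom-up so each directory is empty by the time it is removed.
    for dirpath, dirnames, filenames in os.walk(root, topdown=False):
        for name in filenames:
            os.remove(os.path.join(dirpath, name))
            deleted += 1
            sleep(pause_seconds)
        for name in dirnames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                os.remove(path)  # symlink to a directory: remove the link only
            else:
                os.rmdir(path)
    os.rmdir(root)
    return deleted
```

Calling `delete_files_throttled("/path/to/big/dir", pause_seconds=5)` trades speed for a gentler stream of deletions; whether that matters to the underlying bug is an open question.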

Is this something that is being worked on, as it will soon be a year since this issue was reported?

danieljose commented 2 years ago

It happened to me too, when removing bloom (bigscience). The WSL install is a few days old.

ariccio commented 2 years ago

I don't use WSL, but I think this must be a very buggy function. I've seen a few crashes in this driver lately, and the most recent one I got a chance to look at was in InCompletePacket+0x1e6, where it first calls memcpy, and then the crash hits in memcpy itself. It's memcpying something from the third parameter of InCompletePacket to some offset in the first param. It's all heavily optimized, and the memcpy implementation is one of the new fancypants hand-rolled assembly versions that dispatches based on type, and I don't have the patience to try and work out the args from this deep in the function. I can't really keep using driver verifier: none of the crashes I have are caught by it, and it's insanely slow.

stevebarham commented 2 years ago

I have also experienced this issue, when dropping around 20k files summing to around 450GB of data. Intensely frustrating that this has been extant for so long.

kostiantyn-povnych commented 2 years ago

I'm constantly getting this when deleting large data inside WSL2 Ubuntu on Win11. This is extremely annoying.

Thernn88 commented 2 years ago

On my end I've not experienced this issue since updating to the latest versions of WSL based on the 5.15 kernel and Ubuntu 22.04.

wynnw commented 2 years ago

@Thernn88 - what does wsl.exe -v show for you? Just curious as I continue to hit the crash and I think I'm up to date. My version is


Kernel version: 5.15.68.1
WSLg version: 1.0.45
MSRDC version: 1.2.3575
Direct3D version: 1.606.4
DXCore version: 10.0.25131.1002-220531-1700.rs-onecore-base2-hyp
Windows version: 10.0.22621.755

odysa commented 1 year ago

Same issue here. It usually happens when I delete files with cargo clean.

hzxa21 commented 1 year ago

Same here with cargo clean on WSL2. Very annoying, since I had to reinstall the Ubuntu distro and use a disk recovery tool to read back data from the vhdx file... Please prioritize the investigation and the fix.

elektracodes commented 1 year ago

Same issue here

Licenser commented 1 year ago

Same for me, constantly about once every other day :(

L1ghtman2k commented 1 year ago

Makes rust development on WSL pretty much impossible :/

P.S. For anyone who needs to recover files from a corrupted vhdx (since I couldn't really find any info on how to repair an existing vhdx), I used this utility for vhdx scanning and file recovery: https://www.quetek.com/prod02.htm

yonillasky commented 1 year ago

The same happened to me after deleting 150 GB worth of junk (a debug mode build of LLVM/clang). Not straight away: I ran a few more commands in the terminal and was then struck with an IRQL_NOT_LESS_OR_EQUAL blue screen and data corruption on my WSL image :(

I actually had code below the WSL home dir that I don't want to lose...

hyjforesight commented 1 year ago

Same BUG here! Microsoft, start to work!

VS Code Version:
Version: 1.75.1 (user setup)
Commit: https://github.com/microsoft/vscode/commit/441438abd1ac652551dbe4d408dfcec8a499b8bf
Date: 2023-02-08T21:32:34.589Z
Electron: 19.1.9
Chromium: 102.0.5005.194
Node.js: 16.14.2
V8: 10.2.154.23-electron.0
OS: Windows_NT x64 10.0.22621
Sandboxed: No

OS Version: WIN11 22H2 X64, WSL2 (Ubuntu 22.04)

Benhawkins18 commented 1 year ago

Same bug. WSL reports "The file or directory is corrupted and unreadable." right after I deleted a large directory (~200 GB).

danieledagnelli commented 1 year ago

Exact same behaviour here right after deleting a large directory (200GB). Got BSOD, and now everything is dead.

kostiantyn-povnych commented 1 year ago

I had been encountering and putting up with this issue for many months until it finally corrupted the VHD disk. I created a new WSL instance with a different configuration, attaching/mounting the entire physical disk and so avoiding the VHD driver. It seems to have helped, and I don't experience this issue anymore.

thecozies commented 12 months ago

i found another workaround if this issue happens to anybody else.

i really wish i had found this earlier after hours of panic and debugging. please fix this issue!