john-peterson / windows


Allow cancellation in `\Driver\Disk` and `\FileSystem\Ntfs` #1

Open john-peterson opened 11 years ago


allow cancellation in \Driver\Disk and \FileSystem\Ntfs

references

my references are

this message is at

do you want to add an option to cancel the NtReadFile request that hangs in \Driver\Disk and \FileSystem\Ntfs in this example?, because

do you want to change the storage driver so that it calls KeBugCheck when it enters the state described in "Symptoms 2013-10-30"?, because

My files about this problem are in

This information is relevant for my problem

Can't kill vds.exe

vds.exe is still in the ProcessHacker.exe process list after this taskkill command

taskkill /f /im vds.exe
SUCCESS: The process "vds.exe" with PID 185176 has been terminated.

My livekd version is

livekd
LiveKd v5.31 - Execute kd/windbg on a live system

vds.exe has one IRP

!process 0 7 vds.exe

        IRP List:
            fffffa801ea1c010: (0006,0988) Flags: 00000884  Mdl: 00000000

livekd doesn't print the vds.exe IRP stack because the IRP reports 31 stacks

!irp fffffa801ea1c010

Irp is active with 31 stacks 30 is current (= 0xfffffa801ea1c908)
 No Mdl: No System Buffer: Thread fffffa8023e79060:  Too many Irp stacks to be believed (>30)!!

Can't kill mpc-hc.exe

mpc-hc.exe is still in the ProcessHacker.exe process list after this taskkill command

taskkill /f /im mpc-hc.exe
SUCCESS: The process "mpc-hc.exe" with PID 180260 has been terminated.

mpc-hc.exe has one IRP

!process 0 7 mpc-hc.exe 

    IRP List:
        fffffa802baee670: (0006,0988) Flags: 00000404  Mdl: 00000000    

livekd doesn't print the mpc-hc.exe IRP stack because the IRP reports 31 stacks

!irp fffffa802baee670

Irp is active with 31 stacks 30 is current (= 0xfffffa802baeef68)
 No Mdl: No System Buffer: Thread fffffa801ec6e270:  Too many Irp stacks to be believed (>30)!!

Disks

I have 6 USB disks (3 Seagate Expansion Desktop USB 3.0 3 TB and 4 TB, 3 WD Elements Desktop USB 2.0 2 TB and 3 TB)

1 is connected directly to the ASMedia xHCI Controller

The other 2 USB 3 disks are connected to a Deltaco UH-723

The other 3 disks are connected to a D-Link DUB-H7

all USB disks physically disconnected

I've disconnected all 6 disks by disconnecting the 1 directly connected disk and the 2 USB hubs

RemoveDrive hangs its conemu window

This command hangs its conemu.exe window indefinitely (because it's caught in the IRP hang)

RemoveDrive h:

DriveCleanup doesn't fix the problem

I've run this command but it didn't solve my problem (the IRP hang remains)

DriveCleanup

Removed 3 USB devices
Removed 0 USB hubs
Removed 3 Disk devices
Removed 0 CDROM devices
Removed 0 Floppy devices
Removed 6 Storage volumes
Removed 0 WPD devices
Removed 49 Keys from registry

ListDosDevices prints all 6 USB devices even though they're physically disconnected

ListDOSdevices

Drv Type       KernelName

A:  ----
B:  ----
C:  FIXED      \Device\HarddiskVolume2
D:  FIXED      \Device\HarddiskVolume4
E:  FIXED      \Device\HarddiskVolume6
F:  ----
G:  FIXED      \Device\HarddiskVolume14
H:  FIXED      \Device\HarddiskVolume12
I:  ----
J:  ----
K:  ----
L:  ----
M:  ----
N:  FIXED      \Device\HarddiskVolume10
O:  FIXED      \Device\HarddiskVolume7
P:  ----
Q:  ----
R:  ----
S:  ----
T:  ----
U:  ----
V:  ----
W:  ----
X:  CDROM      \Device\CdRom0
Y:  ----
Z:  ----

This command works

DeleteDosDevice h:

h: removed

DeleteDosDevice g:

g: removed

After DeleteDosDevice, ListDOSdevices doesn't list G and H anymore

ListDOSdevices

Drv Type       KernelName

A:  ----
B:  ----
C:  FIXED      \Device\HarddiskVolume2
D:  FIXED      \Device\HarddiskVolume4
E:  FIXED      \Device\HarddiskVolume6
F:  ----
G:  ----
H:  ----
I:  ----
J:  ----
K:  ----
L:  ----
M:  ----
N:  FIXED      \Device\HarddiskVolume10
O:  FIXED      \Device\HarddiskVolume7
P:  ----
Q:  ----
R:  ----
S:  ----
T:  ----
U:  ----
V:  ----
W:  ----
X:  CDROM      \Device\CdRom0
Y:  ----
Z:  ----
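The two listings above can be compared mechanically instead of by eye. This is a minimal sketch, assuming the ListDOSdevices output has been saved as text in the three-column layout shown here; `parse_listing` and `removed_letters` are hypothetical helper names, not part of any tool mentioned in this issue.

```python
def parse_listing(text):
    """Map drive letter -> kernel name (None when the letter is unassigned)."""
    drives = {}
    for line in text.splitlines():
        parts = line.split()
        # Data rows start with a token like "C:"; skip the header and blank lines.
        if not parts or len(parts[0]) != 2 or parts[0][1] != ":":
            continue
        letter = parts[0][0]
        # Unassigned letters look like "F:  ----" and have no kernel name column.
        drives[letter] = parts[2] if len(parts) >= 3 else None
    return drives


def removed_letters(before, after):
    """Return the letters that had a kernel name before but not after."""
    b = parse_listing(before)
    a = parse_listing(after)
    return sorted(l for l in b if b[l] is not None and a.get(l) is None)
```

Called with the listing taken before DeleteDosDevice and the one taken after it, `removed_letters` returns the drive letters whose mapping disappeared.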

When selecting G: in explorer.exe (which still lists all 6 disconnected disks) it displays this dialog

[Window Title]
Location is not available

[Content]
G:\ refers to a location that is unavailable. It could be on a hard drive on this computer, or on a network. Check to make sure that the disk is properly inserted, or that you are connected to the Internet or your network, and then try again. If it still cannot be located, the information might have been moved to a different location.

[OK]

file table of physically disconnected disks displayed

When selecting a disconnected disk that hasn't been removed with DeleteDosDevice, explorer.exe displays its file and folder content (apparently Windows has cached the whole file table)

dir also displays the file table for a physically disconnected disk

dir O:
 Volume in drive O is Elements 2
 Volume Serial Number is 0861-0980

 Directory of O:\

2014-02-27  08:36    <DIR>          nature
2013-10-15  14:51    <DIR>          sample
2014-02-02  15:27    <DIR>          video
               0 File(s)              0 bytes
               3 Dir(s)  419 365 875 712 bytes free

But it can't read any file content

copy O:\nature\wms.avi C:\Users\User\Downloads
The system cannot find the file specified.
        0 file(s) copied.

explorer.exe automatic refresh is disabled

When a new file is created in an explorer.exe window, the window isn't automatically refreshed. The file is shown only after a manual refresh

unkillable processes

After Ctrl + A and Delete in ProcessHacker.exe, the unkillable processes (and other processes that weren't killed before processhacker.exe killed itself), sorted by Start Time, are listed under "unkillable processes" in irp_hang.txt

When all processes in processhacker.exe except processhacker.exe itself are selected and killed, Windows bugchecks with the message

Probably caused by : _

CRITICAL_OBJECT_TERMINATION (f4)

Symptoms 2013-10-30

File content replaced with 0

When the power is cut in this example, the data in some open files (for example settings files of running programs, such as %AppData%\Skype\user\config.xml) on drives that have not malfunctioned is sometimes replaced with 0 (null bytes, not the character 0), which is disruptive to the user

Windows is supposed to avoid data loss by calling KeBugCheck before data loss can occur

All file writes in this state write 0 (or appear to write 0 but in fact write nothing and allocate a section of 0 or other data to the file) to both a functioning SATA drive and the reconnected (malfunctioning before reconnection) USB drive

File creation (NTFS index write) is also not committed so that created files are not present after a kernel restart.

Before the kernel is restarted there's no indication that data is not written, apparently because the data returned when reading the file (before the kernel is restarted) comes from the file cache rather than from committed data. After the kernel is restarted, reading the file shows the written (or incorrectly allocated) data, which is 0 (when the file content is displayed in a hex editor or a notepad that displays 0)

The second time this state occurred I made the mistake of (i) closing all non-hung processes (this resulted in the processes writing 0 as data to their settings files) and (ii) making a backup (with copy and robocopy) of all vital files before cutting the power. This effectively destroyed the backup of all vital files, because all those files (374,297 files, 424 MiB) were overwritten with 0

It's possible that the shadow copy function, which stores writes to a file in a temporary file while the file is read, is involved

It appears that the storage driver is unable to commit data to disk because part of the driver is waiting (indefinitely) in at least one process, so the data commit jobs are queued and never complete. The file cache jobs can complete, however, so it can appear that data is written: reading a file whose previous write is cached but not committed returns data from the file cache.
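A backup that was overwritten like this can be checked for zero-filled files. This is a minimal sketch, assuming Python is available; `is_zero_filled` and `find_zero_filled` are hypothetical helper names. Because reads in the hung state appear to come from the file cache, the check is only meaningful after the kernel has been restarted.

```python
import os

CHUNK_SIZE = 1024 * 1024  # read files in 1 MiB chunks


def is_zero_filled(path):
    """Return True if the file is non-empty and every byte is 0x00."""
    seen_data = False
    with open(path, "rb") as f:
        while True:
            chunk = f.read(CHUNK_SIZE)
            if not chunk:
                break
            seen_data = True
            # bytes.count(0) counts the 0x00 bytes in the chunk.
            if chunk.count(0) != len(chunk):
                return False
    return seen_data


def find_zero_filled(root):
    """Walk root and return the sorted paths of zero-filled files."""
    hits = []
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            full = os.path.join(dirpath, name)
            try:
                if is_zero_filled(full):
                    hits.append(full)
            except OSError:
                # Unreadable files are skipped, not reported.
                continue
    return sorted(hits)
```

Empty files are deliberately not reported, since a zero-length file says nothing about whether its content was replaced.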
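The difference between a cached write and a committed write can be made explicit from user mode. This is a minimal sketch, assuming Python; `write_and_commit` is a hypothetical helper name. It asks the OS to flush the file to the device before returning. Whether that call would block or fail in the state described here is an open question that this sketch does not answer; it only shows where the commit request is made.

```python
import os


def write_and_commit(path, data):
    """Write data, then ask the OS to flush it to the storage device."""
    with open(path, "wb") as f:
        f.write(data)
        f.flush()             # Python's buffer -> OS file cache
        os.fsync(f.fileno())  # OS file cache -> device (_commit() on Windows)
    return len(data)
```

Reading the file back right after a plain `write` proves only that the file cache holds the data; the `os.fsync` call is the step that requests the commit.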

Connected disk

The non-malfunctioning drive is connected through SATA and the malfunctioning drive through USB

Disk light

The USB drive signals an error with a steady flashing of the status light. Reconnecting the drive restores its normal operation but the I/O request is not canceled

Reconnecting refers to the USB cable (rather than the power), and from that it can be determined that the fault is a controlled unavailability (rather than a malfunction in the firmware) that's cleared on reconnect. Restarting the kernel without disconnecting the USB cable has the same effect and restores operation (a USB drive firmware reset is not necessary)

System

2014-03-02

My Speccy output is

Operating System
    Windows 7 Ultimate 64-bit SP1
CPU
    Intel Core i7 3770K @ 3.50GHz   49 °C
    Ivy Bridge 22nm Technology
RAM
    32,0GB Dual-Channel DDR3 @ 668MHz (9-9-9-24)
Motherboard
    ASUSTeK COMPUTER INC. P8Z77-M PRO (LGA1155) 37 °C
Graphics
    DELL U2312HM (1920x1080@60Hz)
    BenQG2222HDL (1920x1080@60Hz)
    Intel HD Graphics 4000 (ASUStek Computer Inc)
    1024MB ATI AMD Radeon HD 6800 Series (XFX Pine Group)   86 °C
Hard Drives
    233GB Samsung SSD 840 EVO 250GB ATA Device (SSD)    30 °C
Optical Drives
    QBCNK MRSHA3S5 SCSI CdRom Device
Audio
    Corsair Vengeance 2000 Headset

My software is described in "Software" at https://github.com/mirror/asus/issues/2