dokan-dev / dokany

User mode file system library for windows with FUSE Wrapper
http://dokan-dev.github.io

DOKAN_FILE_INFO struct passed for a different fileHandle #978

Closed DimitarKapashikov closed 3 years ago

DimitarKapashikov commented 3 years ago

Hello, guys.

I would like to ask you whether you have observed the following situation:

CreateFile is called for fileName A.
ReadFile is then called for fileName B, but with the DOKAN_FILE_INFO struct of file A!

The described situation happens when there is a heavy load generated by https://support-desktop.sharegate.com/hc/en-us/articles/360045909372-File-share-property-extraction

Thank you very much for your attention.

Best regards,
Dimitar

Liryna commented 3 years ago

Hi,

It would be surprising to have this issue. I am happy to look into it if you are able to reproduce with any of the samples.

How do you know it is a read of file B if you get the struct of file A? Maybe that's just a read of file A, no?

DimitarKapashikov commented 3 years ago

Hi @Liryna , thank you very much for the quick reply.

In the IDokanFileInfo context I store an object which holds, among other things, the fileName that was passed when CreateFile was invoked.

What I see in the traces is that, sometimes under heavy load, when ReadFile is invoked after CreateFile:

public NtStatus ReadFile(string filename, byte[] buffer, out int readBytes, long offset, IDokanFileInfo info)

the condition filename != info.Context.fileName holds.

So I presume that the IDokanFileInfo passed to ReadFile is not the one that should be associated with the file handle for filename.
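
The check described in this comment can be sketched as follows. This is a minimal, self-contained illustration in C rather than the C# binding used above, and every name in it (HandleContext, context_init, paths_equal, check_read_path) is hypothetical and not part of the Dokan API. It records the path at open time and compares it with the path passed to a later read. The comparison ignores ASCII case, on the assumption that the file system treats paths case-insensitively, so that a difference in case alone is not reported as a mismatch.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical per-handle context: stands in for whatever object the
   file system stores in the Dokan context when CreateFile is called. */
typedef struct {
    char opened_path[260]; /* path recorded at CreateFile time */
} HandleContext;

/* Record the path at open time; snprintf truncates safely if it is too long. */
void context_init(HandleContext *ctx, const char *path) {
    assert(ctx != NULL && path != NULL);
    snprintf(ctx->opened_path, sizeof ctx->opened_path, "%s", path);
}

/* ASCII case-insensitive comparison (assumption: paths ignore case). */
bool paths_equal(const char *a, const char *b) {
    while (*a != '\0' && *b != '\0') {
        if (tolower((unsigned char)*a) != tolower((unsigned char)*b)) {
            return false;
        }
        ++a;
        ++b;
    }
    return *a == *b; /* equal only if both strings ended together */
}

/* Returns true when the path passed to a read matches the stored one.
   Otherwise logs both values so the mismatch shows up in the traces. */
bool check_read_path(const HandleContext *ctx, const char *read_path) {
    assert(ctx != NULL && read_path != NULL);
    if (paths_equal(ctx->opened_path, read_path)) {
        return true;
    }
    fprintf(stderr, "path mismatch: opened \"%s\", read \"%s\"\n",
            ctx->opened_path, read_path);
    return false;
}
```

Calling check_read_path at the top of the read handler and logging both values on failure produces a trace line that shows exactly which two names diverged.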

Liryna commented 3 years ago

Could it be that the file was renamed between the CreateFile and the Read? You can also use the Procmon application to trace such a workflow.

DimitarKapashikov commented 3 years ago

@Liryna Do you mean that there would be a MoveFile request between CreateFile and ReadFile?

Liryna commented 3 years ago

Yes, exactly.
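
If a rename between CreateFile and ReadFile turns out to be the cause, one way to keep the stored name in step is to update it when the move is handled. The following is a minimal sketch in C with hypothetical names (OpenFile, apply_rename) that are not part of the Dokan API. It assumes the file system keeps one record per open handle and matches paths exactly, so renames of a parent directory and case differences are not handled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define MAX_PATH_LEN 260

/* Hypothetical record kept for each open handle. */
typedef struct {
    char path[MAX_PATH_LEN]; /* recorded at CreateFile, updated on rename */
} OpenFile;

/* Apply a rename to every open record that still carries the old path.
   Returns the number of records that were updated. */
size_t apply_rename(OpenFile *files, size_t count,
                    const char *old_path, const char *new_path) {
    assert(files != NULL && old_path != NULL && new_path != NULL);
    size_t updated = 0;
    for (size_t i = 0; i < count; ++i) {
        if (strcmp(files[i].path, old_path) == 0) {
            /* snprintf truncates safely if the new path is too long */
            snprintf(files[i].path, sizeof files[i].path, "%s", new_path);
            ++updated;
        }
    }
    return updated;
}
```

With the records kept current this way, a name mismatch seen later in a read can no longer be explained by a rename that the file system itself processed.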

DimitarKapashikov commented 3 years ago

Hi @Liryna, if I collect debug traces from Dokany, would it be possible to investigate the issue?

Liryna commented 3 years ago

@DimitarKapashikov Yes, you can directly share them here.

DimitarKapashikov commented 3 years ago

@Liryna Additional info: I have found that in the same session there are calls to GetFileInformation, and there the passed fileName is the correct one. Only in ReadFile does the fileName not match the fileName from the context.

Liryna commented 3 years ago

@DimitarKapashikov Any update or steps to reproduce the issue with the official sample? Note that I believe that if such an issue were present, we would have a lot of feedback: since the mirror sample stores the handle in the context, we would be seeing reports of incorrect data reads and corrupted writes.

DimitarKapashikov commented 3 years ago

@Liryna We are trying to reproduce it with kernel logs again. I will update you ASAP.

Liryna commented 3 years ago

@DimitarKapashikov Please try to build at head if that's possible. There are new logs that will be very helpful.

DimitarKapashikov commented 3 years ago

@Liryna We are using the latest dokany driver. Do you mean that I should build it myself from master?

Liryna commented 3 years ago

Yes, if you could retest with a driver and library that you built yourself from master. Or simply use a snapshot from master: https://github.com/dokan-dev/dokany/wiki/Build#user-snapshot

Please also try to reproduce with mirror or memfs to be sure this is not an implementation issue on your side.

DimitarKapashikov commented 3 years ago

@Liryna Thanks. Is the build at https://github.com/dokan-dev/dokany/wiki/Build#user-snapshot built with DEBUG information? Judging by its size, I suppose it is not.

Liryna commented 3 years ago

Hmm, indeed you are right. I forgot that it was a Release build :sad: sorry.

DimitarKapashikov commented 3 years ago

@Liryna Thanks. What we have found in addition is that when we copy the same content from the mounted drive to the file system, the ReadFile issue does not occur. It happens only when the content is copied with the Sharegate tool, which copies it from the mounted drive to SharePoint. I don't know how file handles could get mixed up like this at the Win32 API level!

Liryna commented 3 years ago

@DimitarKapashikov Have you been able to build a debug version, or do you have more information on the issue?

DimitarKapashikov commented 3 years ago

Hi @Liryna, we have managed to build a driver with debug information. Once we have gathered the needed information, I will post it in this issue.

DimitarKapashikov commented 3 years ago

Hi @Liryna, I am attaching a log file with debug info. The issue is not reproduced in it, but I would like you to check whether it contains the information you need from the driver. [Uploading #dbg1.log…]()

Liryna commented 3 years ago

The file does not seem to be available.

DimitarKapashikov commented 3 years ago

@Liryna Something went wrong during the upload, please check again dbg1.log

Liryna commented 3 years ago

@DimitarKapashikov The kernel logs are missing. Please see the wiki for the command that enables them for the driver.

DimitarKapashikov commented 3 years ago

Hi @Liryna, when we try to reproduce the issue, the Windows machine crashes. These are the kernel logs: dbg.log

Could it be because of our custom build driver?

Regards, Dimitar

Liryna commented 3 years ago

I see that the logs now contain the driver logs. Next time, could you disable the userland log capture so that the "Filematch?" part is not in the output?

For the crash, we will need the memory dump analysis, which you can get with WinDbg once the symbol path is configured. The Dokan debug page on the wiki has the steps in its crash report section.

DimitarKapashikov commented 3 years ago

Hi @Liryna , here is the crash report: report.log

Liryna commented 3 years ago

STACK_TEXT:  
ffffcb01`69d21968 fffff801`568cddf2 : 00000000`00000019 00000000`00000020 ffff808a`c4657900 ffff808a`c46579e0 : nt!KeBugCheckEx
ffffcb01`69d21970 fffff801`3c2d8d6d : ffff808a`c4657910 00000000`00000000 ffff808a`c6c4a84a 00000000`0000000e : nt!ExFreePoolWithTag+0x1a62
ffffcb01`69d21a50 fffff801`3c2d91ce : ffff808a`c5ba8b20 ffff808a`c5ba8b50 ffff808a`c481e800 00000000`00000000 : dokan1+0x28d6d
ffffcb01`69d21af0 fffff801`566ee0c1 : ffff808a`c5ba8a60 fffff801`3c2d8e90 ffff808a`c5ba8a60 fffff801`5676163a : dokan1+0x291ce
ffffcb01`69d21c10 fffff801`567dc206 : ffffcb01`68855180 ffff808a`c481e800 fffff801`566ee080 ffff808a`c7027740 : nt!PspSystemThreadStartup+0x41
ffffcb01`69d21c60 00000000`00000000 : ffffcb01`69d22000 ffffcb01`69d1c000 00000000`00000000 00000000`00000000 : nt!KiStartSystemThread+0x16

Function names are missing. It looks like C:\install\Dokan Library-1.4.1 was not loaded. Is that where the pdb files from your build are? Otherwise, can you describe how I can reproduce this on my side?

DimitarKapashikov commented 3 years ago

Could you please check this one: report.log

Liryna commented 3 years ago

It uses the pdb from c:\program files\dokan\dokan library-1.4.1 and not the one from your build.

The line does not seem to match what is at head:

dokan1!NotificationLoop+0x32d [C:\GIT\dokany\sys\notification.c @ 328]
https://github.com/dokan-dev/dokany/blob/master/sys/notification.c#L328

DimitarKapashikov commented 3 years ago

Hi @Liryna, we built the driver at this commit: https://github.com/dokan-dev/dokany/blob/31f8381e1efc207e7c02bdea0a47c819e9810c3f/sys/notification.c . It is from April 12.

Liryna commented 3 years ago

@DimitarKapashikov I am having difficulty seeing the reason for this failure. Can you describe how I can reproduce this on my side?

DimitarKapashikov commented 3 years ago

@Liryna I reproduce the issue using https://sharegate.com/products/sharegate-desktop . I copy content from the network drive to SharePoint using the desktop tool. I will try reverting the driver to the official version 1.4.1 to see whether the same issue occurs.

Liryna commented 3 years ago

@DimitarKapashikov I am spending time on this issue for you. So if you could take a couple of minutes to provide accurate steps and information, it would be great. If you feel you do not have the time for this, please close the issue.

DimitarKapashikov commented 3 years ago

@Liryna I have tried to reproduce the crash with Mirror.exe and the driver which I downloaded from https://ci.appveyor.com/project/Maxhy/dokany/branch/master/job/ybmai4rwtixj47ou/artifacts . What I did was mount C:\Temp (mirror.exe /n /r C:\Temp /l m). Before mounting, I copied the Dokan Library-1.4.1 folder from C:\Program Files\Dokan\Dokan Library-1.4.1 to C:\Temp. After mounting, I tried to copy the Dokan Library-1.4.1 folder to the Desktop, and after a few seconds Windows crashed. It seems that a large number of Read requests sent to the driver causes the crash.

Liryna commented 3 years ago

Do we agree that those steps do not involve Sharegate? I have followed the steps and installed this exact version. Manual copy (Explorer) and robocopy haven't triggered any crash on my side.

For your information, it is really not recommended to use the /n option without a UNC name (like /u \myfs\dokanvol). If you have not set one, please try with one and see how it behaves.

Liryna commented 3 years ago

I made some changes at head. It would be great if you could retry and see if you still get a BSOD.

DimitarKapashikov commented 3 years ago

Hi @Liryna, yes, I tried removing Sharegate from the setup. I will test with the latest build and let you know the result.

DimitarKapashikov commented 3 years ago

Hi @Liryna , it crashes again. Do you need the dump analysis report?

Liryna commented 3 years ago

@DimitarKapashikov Yes, if you can, please provide the memory dump file and the exact commit you used. Can you confirm that you are just copying files and it crashes? That is pretty surprising. Is it a clean environment? Can you try on another one?

DimitarKapashikov commented 3 years ago

Hi @Liryna, yes, I am just copying the folder. It is not a clean environment; previous versions were installed on this Windows machine. I will try on a clean one.

DimitarKapashikov commented 3 years ago

@Liryna About the dump: I can upload it in chunks because of the 25 MB limit. Is that OK?

Liryna commented 3 years ago

Normally, if you zip it, it is much smaller. Otherwise, use Google Drive or Dropbox to share it.

DimitarKapashikov commented 3 years ago

@Liryna I shared it on Google Drive: https://drive.google.com/drive/folders/1v5Dt4Q35haURT3XVz9ugmYeuLm8mg4s4?usp=sharing

Liryna commented 3 years ago

@DimitarKapashikov Thanks. Can you also share the binaries with the pdb that you used during this BSOD? Also the commit number.

DimitarKapashikov commented 3 years ago

@Liryna I used the latest build from https://ci.appveyor.com/project/Maxhy/dokany/branch/master/job/olreclh1nudl05mn/artifacts . I saw in WinDbg that it tries to find the pdb here:

DBGHELP: C:\projects\dokany\x64\Release\Driver\dokan1.pdb - file not found
DBGHELP: dokan1 - no symbols loaded

I suppose this is a path on the machine where the driver is built.

Liryna commented 3 years ago

Thanks, I have been able to analyze the dump. I would really need to be able to reproduce the issue to find out exactly what is happening and how the allocation header gets corrupted. Could you start from a fresh environment, see if you can reproduce it, and then install other apps one by one until you get the BSOD?

Liryna commented 3 years ago

@DimitarKapashikov I have probably found the reason for your crash. Could you try the new 1.5.0 and let me know?

DimitarKapashikov commented 3 years ago

@Liryna Yes, you found it, thank you very much. It does not crash with 1.5.0. Regarding the initial issue, I suppose that 1.5.0 contains the debug info that will help to investigate it further. We will test our implementation with 1.5.0, and as soon as we have the traces I will post them.

Liryna commented 3 years ago

Thanks! You might still need to build a debug version to have the logs... but at least now it is the new one :)

DimitarKapashikov commented 3 years ago

@Liryna Regarding the new release, I managed to run the Mirror only by putting Windows in test mode ( http://www.queryadmin.com/147/enable-disable-windows-7-test-mode/ ). Otherwise I got the following message:

C:\Program Files\Dokan\Dokan Library-1.5.0\sample\mirror>mirror.exe /m /o /r C:\Temp /l m
[Mirror] Failed to add security privilege to process
=> GetFileSecurity/SetFileSecurity may not work properly
=> Please restart mirror sample with administrator rights to fix it
Can't install driver

Is there a way to fix this?

Liryna commented 3 years ago

Ah, I wonder if this is a binary signing issue... I will check this next week. Thanks for the feedback!