dokan-dev / dokany

User mode file system library for windows with FUSE Wrapper
http://dokan-dev.github.io
5.25k stars 666 forks source link

Occasionally creating a directory will fail #818

Open pollylm opened 5 years ago

pollylm commented 5 years ago

Environment

Check List

Description

When mounted as a network drive, if you repeatedly create and delete a directory then it will eventually fail. It can take anywhere from 5 to 800 times to fail. This never occurs when mounting as a normal drive (have tested up to 10,000 iterations).

As far as I can see, problems begin after the account name starts to print as 'SYSTEM'. For some reason when mounted as a network drive it there appears to be two different access tokens in use, though I have no idea why. When mounted as a regular drive the account name is never 'SYSTEM'.

I mounted the network drive with the options:

mirror.exe /s /d /n /t 1 /r

Logs

Single threaded log dokan only minimal test single thread.log

Multi-threaded log dokan only minimal test.log

I have attached the logs for multi-threaded run as well as it had a strange behaviour where two processes took the same handle (0133). I'm not sure if that would have caused any problems. I don't think it would affect my current issue though as it still failed as a single-threaded application

pollylm commented 5 years ago

Hi all.

My assertion was wrong in the original post, this is an issue that affects all versions (since at least 1.0.5) regardless of how the volume is mounted. It is just a lot more regular in 1.3.0.1000 when mounted as a network drive.

I ran several versions of dokany 3 times to check how many times of recreating a directory it would take before the mkdir call failed;

1.0.5.1000 network drive: 7538, 191430, 180839 1.0.5.1000 regular drive: 504957, 83225, 102414

1.1.0 network drive: 8876, 133077, 132777 1.1.0 regular drive: 155701, 128160, 4265

1.2.2.1000 network drive: 208233, 340988, 772136 1.2.2.1000 regular drive: 18055, 410294, 432345

1.3.0.1000 network drive: 661, 633, 685 1.3.0.1000 regular drive: 128144, 804424, 118907

In the past it would take a very high number of runs to fail. It seems like something has changed in the newer version to make it a more regular occurrence.

I'm currently working on getting logs for the older versions of dokany but they are taking much longer to reproduce. It could be a few days before I have them ready for you.

pollylm commented 5 years ago

Sorry for the slow reply. I have not been able to get the appropriate debug logs because the task is refusing the fail in debug mode on the old builds. It seems that only mounting as a network drive in 1.3.0.1000 is regular enough to allow me to collect debug logs.

Let me know if there is anything I can do to help with this issue, but at the moment I have no idea of where to start looking in the code

Liryna commented 5 years ago

Hi @pollylm ,

Thank you for your report. I haven't seen the behavior you are describing. Yes logs are needed here. It could be different applications. Could you point at which moment in the single thread log you had the error ? For the multi-thread, if a process opens the folder when the delete is/will call, it may fail the delete depending on the sharing requested during the createfile.

pollylm commented 5 years ago

The point of failure in the single thread log appears to be

Create 0046

CreateFile : C:\Users\matrixstore\Desktop\dokan\folder AccountName: matrixstore, DomainName: matrixstore-PC ShareMode = 0x7 FILE_SHARE_READ FILE_SHARE_WRITE FILE_SHARE_DELETE DesiredAccess = 0x80 FILE_READ_ATTRIBUTES FlagsAndAttributes = 0x200000 FILE_FLAG_OPEN_REPARSE_POINT OPEN_EXISTING error code = 2

All following handles report the same error code. However, this error has been reported and overcome in previous parts of the logs so I'm not sure what makes this section different

I'm still wondering if it is to do with the fact that we start reporting 'AccountName: SYSTEM, DomainName: NT AUTHORITY' at handle 39. I'm not confident about that but I can't see what else has changed.

I'm trying again to get debug logs for 1.3.0.1000 regular mount but so far no failure whilst debugging. Hopefully that will change soon.

I am running these tests by mounting the mirror class, then run a java class that repeatedly makes and deletes a folder on the mount. I can't upload the java file, but the code I'm running is

import java.io.File;
public class MkdirTest {

    public static void main(String[] args) {
        MkdirTest test = new MkdirTest();
        test.mkdirTest();
    }

    private void mkdirTest() {
        try {
            File f = new File("M:\\folder");
            for(int i=0; i<1000000; i++) {
                f.mkdir();
                if (!f.exists())
                    throw new RuntimeException("Failed to create folder. Iteration " + i);
                f.delete();
                if (f.exists())
                    throw new RuntimeException("Failed to delete folder. Iteration " + i);
            }
        } catch (Exception e) {
            System.out.println(e.getMessage());
        }
    }
}

Mount mirror: mirror.exe /s /d /n /t 1 /r C:\Users\matrixstore\Desktop\dokan /l m Build java file: javac MkdirTest.java Run java file: java MkdirTest

Liryna commented 5 years ago

The status of 0046 is file not found as expected since ###Cleanup 0045 was the operation to remove it. You can use procmon to look at logs with a release version.

Liryna commented 5 years ago

@pollylm have you been able to get more info on this ?

pollylm commented 5 years ago

I have found a consistent pattern; On the times that we receive the expected 'NAME_NOT_FOUND' status, a call has just been made by Explorer.EXE for operation 'NotifyChangeDirectory'. On the times that we receive the problematic 'ACCESS_DENIED' this call is not made.

This pattern has been consistent across multiple tests. I have no idea why the explorer process sometimes does not send this notification however. It also raises the question if this is actually a problem with Dokan or Windows. Is it necessary for Dokan to depend on that notification?

failed passed passed2