dokan-dev / dokany

User mode file system library for windows with FUSE Wrapper
http://dokan-dev.github.io
5.25k stars 666 forks source link

Rewrite user-mode threading to work with overlapped IO/IOCP #210

Open Corillian opened 8 years ago

Corillian commented 8 years ago

The current threading model used by Dokan is a major performance bottleneck in any user-mode driver that can make asynchronous IO calls - for example a driver that provides access to an FTP server. The OS can potentially make thousands of IO calls a second but with a limit of 15 threads and no async callback mechanism those threads will spend most of their time blocking on IO operations (in the FTP example they will block on socket read/writes).

To solve this I propose the following:

  1. DeviceIoControl() in dokan.c needs to use an IO completion port for dispatching IO events to the user-mode driver. IO events will automatically be handled on the system thread pool and neither Dokan nor the user-mode driver are required to manage their own threads nor worry about the number of threads that are allocated.
  2. A flag should be added to Dokan context information provided to the event handler that allows it to specify whether or not the event will be handled asynchronously. This is because the end of an asynchronous call requires special handling - it must explicitly tell Dokan to send a response to the kernel.
  3. The Dokan user-mode Dispatch*() functions will need to be split in two: one half (BeginDispatch*()) for event setup and the call into the driver event handler and one half (EndDispatch*()) for validation, cleanup, and sending the result to the kernel. Finalization for an asynchronous event (EndDispatch*()) can be initiated by a DokanFinishAsyncEvent() function called from the driver that takes a pointer to the context information as an argument.
  4. Currently information is passed to the user-mode driver and back to the kernel via function arguments. This would need to be changed to a pointer to a structure that's internally held by the Dokan event context info so that DokanFinishAsyncEvent() knows where to find the information it needs to pass back to the kernel.
  5. Dokan context objects should be pooled and push/popped from a global stack to prevent the driver from constantly hitting the memory allocator. A robust implementation should also include time stamps so that a number of context objects in the global pool beyond some threshold can be cleaned up after some amount of time passes to account for large temporary spikes in activity.

Unfortunately this both a non-trivial undertaking and would require breaking API changes however my driver is already running into problems with throughput due to restrictions imposed by the current Dokan threading model.

Liryna commented 8 years ago

Great propsition @Corillian ! I totally agree with you :heart:

Removing the threads limitation would enhance a lot the dokan speed also !

If I understood well, most of the changes are going to be in the Dokan Library and we are keeping the DeviceIoControl for communication Library/Kernel, are you we gonna break API compatibility in this communication ?

About point 5, is it not related to https://github.com/dokan-dev/dokany/issues/148 ?

It seems that you have a lot investigate around how to make it, do that mean it is something that you are already planning to do ?

Corillian commented 8 years ago

About point 5, is it not related to #148 ?

Yes point 5 is identical to #148 except I don't believe the lookaside list API is available in user mode.

If I understood well, most of the changes are going to be in the Dokan Library and we are keeping the DeviceIoControl for communication Library/Kernel, are you we gonna break API compatibility in this communication ?

All of the changes would be in the user mode dokan.dll. It will keep DeviceIoControl() and it will involve breaking API changes as all of the event handler functions will need to have their arguments packed into (pooled) dynamically allocated structures.

I have no problem doing this change myself however I don't have a lot of time at the moment. There's a good chance I wouldn't be able to get to it until sometime in June at the earliest but that could change (for better or worse).

Liryna commented 8 years ago

Oups yes lookaside does not exist for user mode ofc :smile: sorry

Ok I see the breaking changes you are talking about. It would be nice if we can just add a proxy fonction for each event that would convert the dynamically allocated structures into current function parameters and the other way back at the user function return. If we really cannot avoid it, we will deal with it. It is just that I really would not like to force people to update all their implementation every version :)

Don't worry about time, we also currently have couple of things that we need to do to push for pushing this 1.0.0 out :disappointed: (some stability issue)

JohnOberschelp commented 8 years ago

I have also been happily and successfully using Dokany from the beginning and have also run into the bottleneck of waiting on IO.

I think a much easier solution that does not break our API, would be for the driver to handle the return value STATUS_PENDING from (at least) user mode function dokanOperations->ReadFile.

All existing Dokany projects would continue to operate, and developers that want concurrent IO could do their own that worked by returning STATUS_PENDING in conjunction with IO handler thread(s).

marinkobabic commented 8 years ago

Driver can only communicate to the user mode if the user mode tells the driver "here you have a worker". How would that happen using the approach suggested by @Corillian? Need to understand the full flow.

The approach suggested by @JohnOberschelp seems to me much simpler. But in this case one thread will still be blocked. If driver gets STATUS_PENDING it will make the same call again using maybe even the same thread, until it gets another STATUS. Here we are limited in the amount of threads resp. workers provided to the driver. Possible solution here is to create a new additional thread resp. worker in user mode, if user mode returns STATUS_PENDING. STATUS_PENDING means this thread is still working resp. thread is blocked. We need to keep the old thread running with setting a special async flag in the context like suggested by @Corillian. If once this special threads have been handled properly those can be terminated. The thread amount specified in dokan options is just the minimum amount of threads which will be handled in parallel.

In generally we can increase the amount of threads in the dokan options to 64. This can be done already yet.

JohnOberschelp commented 8 years ago

You are right, @marinkobabic, one thread will be blocked, but not the rest of the threads.

Just by implementing handling of STATUS_PENDING on reads, and implementing lazy writes in userland, all Dokany threads would be able to go as far as they can till they are I/O bound. For read in particular, all the driver may need to do (as @marinkobabic said) when it receives a STATUS_PENDING, is yield to other threads, and try again; the read request is still completely valid. @Corillian, wouldn't this be all you need to get your FTP requests running in parallel?

Corillian commented 8 years ago

DokanMain() will create an IO completion port, bind it to a thread pool (https://msdn.microsoft.com/en-us/library/windows/desktop/ms686760(v=vs.85).aspx), open the device with overlapped IO enabled, bind the device to the IO completion port, queue an async DeviceIoControl(), and then return (the current implementation blocks) while passing a state object back to the user. When a completion packet is available it will be dequeued on the thread pool bound to the completion port and a callback provided by dokan.dll will be called. This callback will then call the appropriate BeginDispatch*() for the packet which will remove a dispatch context object from the object pool (or allocate a new one if the pool is empty), fill in its arguments according to packet type, and then call into the user mode driver via the callback specified during Dokan setup (i.e. dokanOperations->ReadFile).

Inside the user mode driver callback a pointer to a structure with the relevant arguments for that packet will be provided. It should be done this way because we've then gone through the trouble of creating a proper structure to marshal packet arguments so that driver writers don't have to constantly reinvent the wheel for their own async operations. They just pass around a single pointer and reference it as needed. If the user mode driver callback will be asynchronous it will return STATUS_PENDING which tells Dokan to immediately exit the completion packet callback so that the thread can begin servicing other completion packets. If it's a synchronous callback it'll return its normal error code and Dokan will then automatically call EndDispatch*() to release the dispatch context object back to the object pool as well as return the result to the kernel.

Outstanding asynchronous operations will also need to be tracked by Dokan so that during shutdown all pending IO operations will be verified complete before tearing down the IO completion port and the thread pool. Once a user-mode asynchronous operation completes it will need to call DokanFinishAsyncEvent() which will cause Dokan to call the appropriate EndDispatch*(). We should also make our thread pool and IO completion port available to user mode drivers since a thread can only be bound to a single IO completion port and there's no reason to force driver writers to create another thread pool.

marinkobabic commented 8 years ago

@JohnOberschelp If you write so many files as you have threads, then all of them will be blocked.

@Corillian This will be an amazing improvement. I totally agree. :smiley:

suy commented 8 years ago

I agree that the current API is weird, and doesn't make much sense for some cases (my filesystem queries the network asynchronously too, so I have to workaround it by using promises and futures, and blocking the callback in DokanOperations). But note that is pretty easy to implement a filesystem in Dokan+FUSE this way.

Also, I'm not very familiar with the Windows API, and even if I were, I will have to use both Windows' and Unix's if I had to write a cross-platform filesystem. My current approach for that is using a cross platform library that isolates me from that as much as possible.

An alternative would be to document the protocol to talk to the kernel device, am I right? FUSE is considering it and there is already a protocol sketch.

That would be helpful for people who want to use Dokan with any language.

Maxhy commented 8 years ago

Interesting discussion you started @Corillian. Current threading model was always a bother to me so I'm glad to see suggestions. My opinion is that it currently works and only result to performance issue so even if I totally agree, I will personally do on it after few others issues / bug fixes. Or maybe you already planned to work on it on your side as you already contributed several times? Better to know it to avoid work duplicate as happened with WiX installer :smile:.

@suy For now, most current language Dokan libraries (C#, Java, ...) are wrapping the C resulting dynamic library dokan1.dll. I currently prefer this way to avoid user-mode handling work duplicate (libraries would have to through native win32 call anyway to talk to the driver). But documenting the protocol to talk to kernel device should be done for the sake of documentation indeed!

Liryna commented 8 years ago

Hi @Corillian ,

Since it is june, I just wanted to know if you got any time on this ? (No need to hurry, 1.0.0 is even not released yet :smiley: )

Corillian commented 8 years ago

No unfortunately the stuff I needed to get done before I could address this ended up taking a bit longer than I had anticipated. I noticed that 1.0.0 isn't released yet so I figured the delay on my end wouldn't be a huge issue ;). I'm still planning on implementing this, hopefully soon as the lack of throughput is a near constant complaint from users of my driver.

Corillian commented 8 years ago

Alright I've started working on this. I won't be working on it every day so I don't know exactly how long it will take but it's currently under way =).

Liryna commented 8 years ago

@Corillian :+1: don't hesitate if you need help!

Corillian commented 8 years ago

@Liryna Why is ZwCreateFile() called twice for SL_OPEN_TARGET_DIRECTORY?

Liryna commented 8 years ago

@Corillian I don't have my environment to test with me but I would say that this come from the last pull requests of @bailey27.

If I remember, there should be a first createfile to know if the file exist and a second for opening the directory.

Or do you mean you have two time the same createfile?

bailey27 commented 8 years ago

@Corillian ,

I changed the behavior of SL_OPEN_TARGET_DIRECTORY to fix https://github.com/dokan-dev/dokany/issues/249 (MS Word 2007 & 2010 can not save file normally in mirror.exe).

If SL_OPEN_TARGET_DIRECTORY is specified, then the caller expects to open the parent dir of the file. But it also expects to be told in create Information if the child did not exist.

Corillian commented 8 years ago

Going through directory.c it seems to me that since Dokan caches the previous directory listing and provides its own pattern search maintaining FindFilesWithPattern() as a Dokan operation is a bad idea. @Liryna any opposition to removing it?

Also is it correct to assume that the kernel synchronizes all file operations on a file handle? If that's not true then there are quite a few synchronization bugs in the user-mode driver.

Lastly shouldn't directory list caching be done in the kernel and not the user-mode driver? I don't understand why we have to endure 2 user/kernel transitions when SL_RESTART_SCAN isn't specified just because the list gets cached in user space instead of the kernel.

marinkobabic commented 8 years ago

Independent where the caching is done, how is the following scenario covered here:

  1. FindFiles returns File1, File2
  2. The mirrored File1 is removed.
  3. Only the application knows this (maybe), which uses the dokan usermode library
  4. UserMode and Kernel have no idea, that the file has been removed

Actually it seems that the cache is not updated.

To make the whole system stable and faster:

This way the file and directory information can retrieve the information from cache. If the findfiles should be cached, then every external change on the mirrored filesystem needs to be reported to the kernel. FindFilesWithPattern is in generally not a bad idea, it simplifies the implementation. The developer don’t needs to check the mask for *

Thanks

Marinko

Corillian commented 8 years ago

Ok after looking through FastFat it looks like it's up to the driver to do its own synchronization - which is what I had assumed would be the case. FastFat obviously doesn't need to do any caching because it has direct access to the drive but it does maintain the state of the previous search. It looks like FindFilesWithPattern() is worth keeping around however I do think the cache should be maintained in the kernel and not in the user-mode driver.

Corillian commented 8 years ago

Here's the FastFat code for searching:

//
        //  Determine where to start the scan.  Highest priority is given
        //  to the file index.  Lower priority is the restart flag.  If
        //  neither of these is specified, then the Vbo offset field in the
        //  Ccb is used.
        //

        if (IndexSpecified) {

            CurrentVbo = FileIndex + sizeof( DIRENT );

        } else if (RestartScan) {

            CurrentVbo = 0;
            Ccb->OffsetToStartSearchFrom = 0;

        } else {

            CurrentVbo = Ccb->OffsetToStartSearchFrom;
        }

As @marinkobabic mentioned there is currently a problem with refreshing the cache. As far as I'm aware Dokan does not currently provide a driver with a way to notify the file system that the file system has changed?

Liryna commented 8 years ago

I agree that the cache should probably review/rewrite and move to the kernel. Notification of file changes are made during CreateFile and SetInformation when I file is created, delete, .... I would say that this part works because for exemple we see explorer being directly aware of a new file and show it https://github.com/dokan-dev/dokany/blob/84c38f8eb48ee703fd3eaa18abed09ab1072d029/sys/create.c#L1300 https://github.com/dokan-dev/dokany/blob/b845c086c96e4b6125d0c729086ca122bb8b393c/sys/fileinfo.c#L598

Corillian commented 8 years ago

Yes but say your driver is a remote FTP server - when a new file gets created on that server there's no way to notify Dokan and Windows Explorer won't be aware of the new file until the next time it happens to issue a request for a directory listing.

Liryna commented 8 years ago

Yes, in such case we will have an issue :cry: but the cache would also not help ?

Corillian commented 8 years ago

The cache helps but it solves a separate problem - the OS can submit multiple partial requests for the directory listing of an identical search result. The cache won't be purged until different search criteria are provided, thereby invalidating the cache, or the SL_RESTART_SCAN flag gets set.

If a the file system changes out from underneath Dokan, such is possible when monitoring a remote FTP, the user-mode driver needs a way to tell Explorer than the file system has changed so that Explorer knows to issue a new FindFiles() with SL_RESTART_SCAN.

marinkobabic commented 8 years ago

I would suggest to create a separate issue related to cache. This should not be related to your actual implementation?

Corillian commented 8 years ago

Yeah I'm not moving any of the cache stuff to the kernel for this issue, I'll create a separate issue.

Corillian commented 8 years ago

In setfile.c:

case FilePositionInformation:
    // this case is dealed with by driver
    status = STATUS_NOT_IMPLEMENTED;
    break;

Which driver is that referring to?

Liryna commented 8 years ago

This code should never be called because FilePositionInformation is directly handled in the dokan driver and never forwarded to user land https://github.com/dokan-dev/dokany/blob/b845c086c96e4b6125d0c729086ca122bb8b393c/sys/fileinfo.c#L134

Corillian commented 8 years ago

Ok cool, ty

Corillian commented 8 years ago

I'm getting close to being able to submit a PR for the initial review of the new async IO code however there seems to be an issue with the driver. The kernel mode driver is sending requests for ReadFile after calls to Close() and Cleanup() have been sent. Mirror was handling this by checking if the context was valid and if not using the path to the file but this is obviously wrong. @Liryna any idea why this is happening?

Liryna commented 8 years ago

@Corillian This is unfortunately a normal behavior as I know :'( the kernel does request read or write after cleanup and close in some cases. You can see with filespy the irp made by the kernel and you should see the read irp made after cleanup and close irps.

I have no access to my computer for now but probably there is some informations about this behavior on Microsoft irp read documentation?

For information, we plan to release 1.0.0 as it is with community feedback on the current RC so next features will be added in a develop branch.

Corillian commented 8 years ago

Alright sure thing, thanks!

Corillian commented 8 years ago

@Liryna looking at FastFat this behavior does not seem to be accounted for which makes me think that Dokan is doing something wrong and that's why the OS is sending read/write's after close/cleanup.

Corillian commented 8 years ago

Ok I think I found the problem. From the documentation for IRP_MJ_CLEANUP:

Receipt of the IRP_MJ_CLEANUP request indicates that the handle reference count on a file object has reached zero. (In other words, all handles to the file object have been closed.) Often it is sent when a user-mode application has called the Microsoft Win32 CloseHandle function (or when a kernel-mode driver has called ZwClose) on the last outstanding handle to a file object.

It is important to note that when all handles to a file object have been closed, this does not necessarily mean that the file object is no longer being used. System components, such as the Cache Manager and the Memory Manager, might hold outstanding references to the file object. These components can still read to or write from a file, even after an IRP_MJ_CLEANUP request is received.

MirrorCleanup() is closing the file handle when it should in fact be cleaned up in MirrorCloseFile(). I'm seeing this in the IRP's:

  1. IRP_MJ_CLEANUP
  2. IRP_MJ_READ
  3. IRP_MJ_CLOSE

Which is exactly what the documentation for IRP_MJ_CLEANUP says can happen.

Liryna commented 8 years ago

@Corillian Thank you for pointing the part of documentation showing it! Still has not been able to reach my computer to make the research :(

So does everything is now always for you on the dokan behavior?

Corillian commented 8 years ago

So does everything is now always for you on the dokan behavior?

I have no idea what you just asked lol. I've submitted the initial PR for review: https://github.com/dokan-dev/dokany/pull/307

It's going to need a lot of testing and review.

Liryna commented 8 years ago

@Corillian Thank you corillian! I will take a look as soon as I can. Probably this weekend :)

Hahaha yes that was a strange sentence. I just wanted to know if you still think there is an issue on dokan behavior with read and write after cleanup.

Corillian commented 8 years ago

no I believe I fixed it, so far I haven't encountered any additional problems.

andreas-eriksson commented 7 years ago

Is this still awaiting a review? Can I do anything to help test it?

Liryna commented 7 years ago

Hi @andreas-eriksson ,

There is a security file issue https://github.com/dokan-dev/dokany/pull/307#issuecomment-257590762 that need to be fixed. @Corillian is very busy often so this can take time or someone could probably contribute to his PR the fix.

After this step, we need people to test the mirror for being sure that everything is still working properly. And finally I will help @Corillian for updating the document before merge 👍

Rondom commented 7 years ago

I think a way forward would be for someone to resolve the merge-conflicts with the master-branch. Then we can fix the file-security issue and then do extensive testing.

I would be willing to modify the API to be able to use both the old API and the new API. This would simplify testing and would allow people to test the new architecture without having to modify their filesystem. We can mark the new API this PR introduced as an experimental API until it is stable.

I tried to merge with master but gave up because I was not really sure how to resolve some of the conflicts :-( I did not try for too long, so maybe it was just me so I don't want to discourage anyone from giving it a try.

Liryna commented 2 years ago

This is now partially available in 2.0.0 thanks to @Corillian PR The only part I see that could be useful to merge is the async possibility for the user FS by returning status pending and call the dokan API later to complete when possible.