winfsp / winfsp

Windows File System Proxy - FUSE for Windows
https://winfsp.dev

Request For Comments: .NET API #82

Closed billziss-gh closed 7 years ago

billziss-gh commented 7 years ago

I am looking for feedback on the proposed .NET API that can be found in src/dotnet.

This API is structured as follows:

The containing assembly is named winfsp-msil.dll and is stored in the bin directory under the WinFsp installation directory. It is strongly named with an AssemblyVersion of 1.0.0.0, but is not placed in the GAC. The AssemblyVersion will remain the same for future WinFsp 1.x.y versions and would only be changed to 2.0.0.0 for WinFsp 2.x.y versions.

[WinFsp versioning follows a simplified variant of semver. I researched alternative arrangements where the assembly would be strong named with an AssemblyVersion that equals the WinFsp version and then placed in the GAC, but ended up discarding this scheme because of the number of publisher policies that would have to also be placed in the GAC for this scheme to be semver compatible.]

An example .NET file system using this API can be found in tst/passthrough-dotnet.

Your feedback is welcome. I note that my .NET knowledge is limited at this time.

brunoklein99 commented 7 years ago

I don't think implementors should inherit from a class, because this makes it really hard to test. I would expect them to provide an instance of an interface.

A reasonable way to do this would be to create a FileSystemHost that inherits from FileSystem and receives an IFileSystemOperations as a constructor parameter; the FileSystemHost could then just forward the calls to the interface.


On another note, I was not able to test the .NET port, even after installing with Core and Developer options I am getting an exception. I noticed the current installer in the releases page doesn't install winfsp-msil.dll, I had to compile it from source.

Unhandled Exception: System.TypeInitializationException: The type initializer for 'Fsp.Interop.Api' threw an exception. ---> System.EntryPointNotFoundException: cannot get entry point FspFileSystemMountPointF
   at Fsp.Interop.Api.GetEntryPoint[T](IntPtr Module) in G:\Source\winfsp\winfsp\src\dotnet\Interop.cs:line 824
   at Fsp.Interop.Api.LoadProto(IntPtr Module) in G:\Source\winfsp\winfsp\src\dotnet\Interop.cs:line 837
   at Fsp.Interop.Api..cctor() in G:\Source\winfsp\winfsp\src\dotnet\Interop.cs:line 882
   --- End of inner exception stack trace ---
   at Fsp.Service..ctor(String ServiceName) in G:\Source\winfsp\winfsp\src\dotnet\Service.cs:line 36
   at passthrough.PtfsService..ctor() in G:\Source\winfsp\winfsp\tst\passthrough-dotnet\Program.cs:line 700
   at passthrough.Program.Main(String[] args) in G:\Source\winfsp\winfsp\tst\passthrough-dotnet\Program.cs:line 846
billziss-gh commented 7 years ago

I don't think implementors should inherit from a class, because this makes it really hard to test. I would expect them to provide an instance of an interface.

A reasonable way to do this would be to create a FileSystemHost that inherits from FileSystem and receives an IFileSystemOperations as a constructor parameter; the FileSystemHost could then just forward the calls to the interface.

I agree that when one has a well defined set of operations that describe a universal (or near-universal) behavior/concept one should use an interface. This is even more important in .NET because it only supports single inheritance.

However in this case I expect that WinFsp will support new file system operations in the future. For example, it is likely that extended attribute support will be added, so we will need operations like GetEA, SetEA, etc. (getxattr, setxattr, etc. on FUSE). Likewise at a different point in time hard link support will be added, so we will need operations such as CreateLink (link on FUSE).

There is the problem then of versioning an interface. I understand that in .NET at least this is not allowed. Once an interface has been published no further methods can be added to it. So we would have to resort to IFileSystemOperations2, IFileSystemOperations3 and so on. IFileSystemOperations2 could inherit from IFileSystemOperations, IFileSystemOperations3 could inherit from IFileSystemOperations2 and so on.
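To make the versioning problem concrete, here is a minimal sketch with hypothetical, heavily simplified signatures (the real operations take more parameters). `OldFs`, `NewFs` and `HostSketch` are illustrative names, not part of WinFsp. It shows how a host would have to probe which interface generation an implementation supports:

```csharp
using System;

// Hypothetical first published interface; once shipped it cannot grow.
public interface IFileSystemOperations
{
    int Open(string fileName);
}

// A later release adds extended attributes through a second interface.
public interface IFileSystemOperations2 : IFileSystemOperations
{
    int GetEA(string fileName);
}

// Written against the first release.
public sealed class OldFs : IFileSystemOperations
{
    public int Open(string fileName) => 0;
}

// Written against the later release.
public sealed class NewFs : IFileSystemOperations2
{
    public int Open(string fileName) => 0;
    public int GetEA(string fileName) => 0;
}

public static class HostSketch
{
    // STATUS_INVALID_DEVICE_REQUEST (0xC0000010) as a signed NTSTATUS value.
    public const int StatusInvalidDeviceRequest = unchecked((int)0xC0000010);

    // The host has to test for each interface generation before dispatching.
    public static int DispatchGetEA(IFileSystemOperations ops, string fileName)
    {
        if (ops is IFileSystemOperations2 ops2)
            return ops2.GetEA(fileName);
        return StatusInvalidDeviceRequest;
    }
}
```

Every new generation adds another type check of this kind to the host.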

Alternatively we could break the interface down into parts that are not expected to change. This is the approach @yogendersolanki91 has taken in his port, where he defines MinimalOperation, ReparseOperation and StreamOperation interfaces: see here. But now we have the problem that the FileSystemHost class has to take 3 interfaces as constructor parameters and perhaps more in the future.

Another problem is that interfaces require full implementation. OTOH you can have useful file systems which implement a very small number of operations. For example, a read-only file system would have to also implement MinimalOperation.Write simply to return STATUS_INVALID_DEVICE_REQUEST. This would drive us to define ever smaller interfaces, almost down to a single file system operation.

For these reasons I have chosen the simplest approach: require that implementors inherit from a class and only override the methods that they need to implement. This makes versioning relatively easy: we can simply add new virtual methods (GetEA, SetEA, etc.) as they become available and existing clients will not break.
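A minimal sketch of this approach, assuming simplified signatures that are not the actual API in src/dotnet; `FileSystemBaseSketch` and `ReadOnlyFs` are hypothetical names used only for illustration:

```csharp
using System;

// Hypothetical, simplified base class: every operation has a default
// implementation that reports "not supported".
public class FileSystemBaseSketch
{
    // STATUS_SUCCESS and STATUS_INVALID_DEVICE_REQUEST as signed NTSTATUS values.
    public const int StatusSuccess = 0;
    public const int StatusInvalidDeviceRequest = unchecked((int)0xC0000010);

    public virtual int Read(string fileName, byte[] buffer, out int bytesRead)
    {
        bytesRead = 0;
        return StatusInvalidDeviceRequest;
    }

    public virtual int Write(string fileName, byte[] buffer, out int bytesWritten)
    {
        bytesWritten = 0;
        return StatusInvalidDeviceRequest;
    }

    // Added in a later release; existing subclasses keep compiling and running.
    public virtual int GetEA(string fileName) => StatusInvalidDeviceRequest;
}

// A read-only file system overrides Read and nothing else.
public sealed class ReadOnlyFs : FileSystemBaseSketch
{
    private static readonly byte[] Content = { 1, 2, 3 };

    public override int Read(string fileName, byte[] buffer, out int bytesRead)
    {
        bytesRead = Math.Min(buffer.Length, Content.Length);
        Array.Copy(Content, buffer, bytesRead);
        return StatusSuccess;
    }
}
```

`ReadOnlyFs` never mentions Write or GetEA, yet callers still get a well-defined status code for both.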

The only drawback that I see is that requiring inheritance from a particular base class means that an arbitrary class cannot be made to exhibit "filesystem" behavior by implementing an IFileSystemOperations interface. This is what interfaces are good at: allowing arbitrary software components to exhibit interesting behaviors. IMO this makes more sense for universal behaviors like serialization that potentially all (or at least many) classes may exhibit. It makes less sense for a specialized concept like file system implementation.

I don't think implementors should inherit from a class, because this makes it really hard to test.

Can you please elaborate on this point. I do not quite understand it.

On another note, I was not able to test the .NET port, even after installing with Core and Developer options I am getting an exception.

The current installer (v1.0 aka "WinFsp 2017") does not include any .NET functionality and does not contain the winfsp-msil.dll. As you have found, you will have to compile from source.

You will also have to recompile the WinFsp native DLL's (the FSD has not changed and you can reuse the v1.0 signed driver). The reason for the DLL change is that I had to add a few inline functions from winfsp.h as out-of-line functions so that .NET can use them easily.

For example, the FspFileSystemMountPoint inline function, now also exists as an out-of-line function with name FspFileSystemMountPointF.

billziss-gh commented 7 years ago

A reasonable way to do this would be to create a FileSystemHost that inherits from FileSystem and receives an IFileSystemOperations as a constructor parameter; the FileSystemHost could then just forward the calls to the interface.

Thinking a bit more about this: an approach that might work best is to follow your suggestion of a FileSystemHost that takes a FileSystemOperations (or FileSystemBase) class instance (not an interface instance). Then we can decouple the file system hosting aspect from the file system implementation, but not suffer any of the versioning problems that interfaces bring.

I don't think implementors should inherit from a class, because this makes it really hard to test.

Can you please elaborate on this point. I do not quite understand it.

I think I am now getting this. You would like to be able to write your own TestFileSystemHost that can be used to test individual file system operations. Is this correct?

If yes, this pattern could be supported by both an IFileSystemOperations interface as well as a FileSystemBase class.

brunoklein99 commented 7 years ago

I don't think implementors should inherit from a class, because this makes it really hard to test.

Can you please elaborate on this point. I do not quite understand it.

I said this because it is inconvenient for tests to have dependencies or behavior unrelated to what they test. For instance, if I were to start creating my file system now, my tests would fail because the DLL is missing an exported function, which would probably be totally unrelated to my test.

I think I am now getting this. You would like to be able to write your own TestFileSystemHost that can be used to test individual file system operations. Is this correct?

Yes, that is correct, but we do it a bit differently in .NET land: we usually rely heavily on Inversion of Control/Dependency Injection and on mocking frameworks like Moq. An abstract class, or a class with virtual methods, would be just as appropriate as an interface.

An implemented file system could be tested like this (simplified example):

using Moq;

public class MyFs : IFileOperations
{
    private readonly IRealFsAbstraction _realfs;

    public MyFs(IRealFsAbstraction realfs)
    {
        _realfs = realfs;
    }

    // Implementation of IFileOperations: forward to the injected abstraction.
    public void CreateFile(string filename)
    {
        _realfs.CreateFile(filename);
    }

    [...]
}

public void TestCreateFile()
{
    // Mock the dependency so the test needs no real file system or native DLL.
    var mock = new Mock<IRealFsAbstraction>();

    var fo = new MyFs(mock.Object);

    fo.CreateFile("test");

    // Passes only if MyFs forwarded the call with the same argument.
    mock.Verify(x => x.CreateFile("test"));
}

This way I can unit test just my code and its logic/behavior, completely decoupled from anything else.

But as I said, if you go the abstract class route, instead of an interface, it would suffice and have the same effect.


Visual Studio insists I don't have the Kernel Mode Tools installed, would you be able to make the dll's available somewhere (maybe in the repo's readme), already with FspFileSystemMountPointF, so I can use them with the .NET passthrough?

billziss-gh commented 7 years ago

But as I said, if you go the abstract class route, instead of an interface, it would suffice and have the same effect.

I think you have me convinced that there is merit in decoupling the file system hosting functionality from the operational functionality. I will split the FileSystem class to FileSystemHost and FileSystemBase later tonight. For FileSystemBase I will use a concrete base class that minimally implements all file system operations (by returning STATUS_INVALID_DEVICE_REQUEST) to allow for partial implementation and to avoid the versioning issues discussed earlier.
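As a rough sketch of what this split enables for testing, assuming simplified signatures: `OperationsBase`, `TestFileSystemHost` and `TinyFs` below are hypothetical stand-ins, not the classes from the commit. The test host forwards calls straight to the operations object, so no native DLL is loaded:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the operations base class (simplified signatures).
public class OperationsBase
{
    public const int StatusSuccess = 0;
    public const int StatusInvalidDeviceRequest = unchecked((int)0xC0000010);

    public virtual int Create(string fileName) => StatusInvalidDeviceRequest;
    public virtual int Open(string fileName) => StatusInvalidDeviceRequest;
}

// Hypothetical test double for the host: it forwards calls to the operations
// object directly, so no native DLL or driver is involved.
public sealed class TestFileSystemHost
{
    private readonly OperationsBase _fs;

    public TestFileSystemHost(OperationsBase fs) { _fs = fs; }

    public int Create(string fileName) => _fs.Create(fileName);
    public int Open(string fileName) => _fs.Open(fileName);
}

// File system under test: it only knows files that were created earlier.
public sealed class TinyFs : OperationsBase
{
    // STATUS_OBJECT_NAME_NOT_FOUND (0xC0000034) as a signed NTSTATUS value.
    public const int StatusObjectNameNotFound = unchecked((int)0xC0000034);

    private readonly HashSet<string> _names =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase);

    public override int Create(string fileName)
    {
        _names.Add(fileName);
        return StatusSuccess;
    }

    public override int Open(string fileName) =>
        _names.Contains(fileName) ? StatusSuccess : StatusObjectNameNotFound;
}
```

A unit test can then construct `new TestFileSystemHost(new TinyFs())` and assert on the returned status codes.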

would you be able to make the dll's available somewhere

I will try to make the DLL's available later tonight or tomorrow (I am on PDT timezone).

billziss-gh commented 7 years ago

Commit 1ee563cd13290c1a8fa11bdea89ec5eb60f70164 splits the FileSystem class into FileSystemHost and FileSystemBase.

I also attach winsp.dll.zip which contains the native DLL's winfsp-x64.dll and winfsp-x86.dll. These DLL's contain the missing symbols required by winfsp-msil.dll.

billziss-gh commented 7 years ago

There is now a prerelease containing the new .NET layer and other changes here:

https://github.com/billziss-gh/winfsp/releases/tag/v1.1B1

billziss-gh commented 7 years ago

There is now another prerelease in: https://github.com/billziss-gh/winfsp/releases/tag/v1.1B2

This pre-release contains a new sample file system (memfs-dotnet) that utilizes/exercises the new .NET layer. This file system is fully tested with winfsp-tests and Microsoft's own ifstest. Because of this testing I feel fairly confident that the .NET layer is stable and works well.

I therefore consider the .NET layer close to being baked and will likely be making a final v1.1 release in the next few weeks. Let me know if you have any contradictory feedback.

FrKaram commented 7 years ago

Hi Bill, first of all, congrats on WinFSP, I can only imagine how difficult it must have been to achieve making such a program. I'm currently only trying to understand how all this is working before writing my own filesystem in C#. I've made previous tests with dokany (first reference I got with Google when searching "User mode file system" ) and I'm now investigating WinFSP. I've been able to run your compiled version of "memfs-dotnet-msil.exe" and I must say it is working great as far as my tests went. I got an issue now when trying to use a version I compiled myself. When I run the program, I get the "The service MemfsService has been started." message but I cannot see the drive that has been mounted. Am I missing something here ?

billziss-gh commented 7 years ago

@FrKaram thanks for your kind words re: WinFsp.

The most likely reason that you are seeing this is that you are launching the file system as an administrator and trying to access it as a user or vice-versa. This is not possible on Windows and it is an explicit design decision where the drive "namespace" of a user who is elevated to administrator is not the same as the "namespace" of the same user when not an administrator. See DefineDosDeviceW and this link.

Try launching memfs-dotnet with a command similar to the one below from a non-Administrator command prompt. You will then be able to access drive Y: from Explorer.

> memfs-dotnet-msil.exe -t -1 -i -F NTFS -m Y:

This instructs memfs-dotnet to launch a file system with:

FrKaram commented 7 years ago

Well, that was it ! I can follow on my tests / implementation Thanks !

billziss-gh commented 7 years ago

@FrKaram I am glad it worked. Let me know if you have any other questions.

FrKaram commented 7 years ago

Hi Bill, I'm still writing my FileSystem implementation based on WinFSP. Fortunately, you wrote MemFS, which is a great help in understanding how things should work.

You state in WinFSP description that there is "FUSE compatibility layer for native Windows and Cygwin". Is the opposite true also ? Will I be able to port the C# FS I'm writing to FUSE ? I see that there is mono binding for FUSE : https://github.com/jonpryor/mono-fuse

billziss-gh commented 7 years ago

@FrKaram wrote:

You state in WinFSP description that there is "FUSE compatibility layer for native Windows and Cygwin". Is the opposite true also ? Will I be able to port the C# FS I'm writing to FUSE ? I see that there is mono binding for FUSE : https://github.com/jonpryor/mono-fuse

There is indeed a FUSE compatibility layer that has been used to successfully port FUSE file systems. Partial list.

However there is no FUSE compatibility layer for C# at this time. It usually is relatively simple to port an existing FUSE binding to WinFsp-FUSE though.

FrKaram commented 7 years ago

I've implemented enough methods so that I can create a directory at the root of the drive and list the contained files and folders.

Now, I'm stuck with something I do not understand. When I click on the "New Folder" sub-folder that is in the root folder, I see in the logs that the driver tries to Open the folder

memfsfsp[TID=38c8]: 4B3FE010: >>Create [UT----] "\New folder", FILE_OPEN, CreateOptions=0, FileAttributes=0, Security=NULL, AllocationSize=0:0, AccessToken=0000091C, DesiredAccess=100081, GrantedAccess=0, ShareAccess=7

but no call is made to my "Open" method. Explorer then tells me that "New Folder" is inaccessible.

In fact, the only calls to "Open" are made for the root folder and no other folder.

Do you have any advice ? Thanks !

billziss-gh commented 7 years ago

@FrKaram the most likely reason is that you do not return the proper security descriptor to allow file/directory creation in the root directory.

When creating a file or directory, Windows rules require for the security descriptor of the parent directory to be checked. WinFsp uses the GetSecurityByName call to determine the file attributes and security descriptor for a particular file/directory. You are likely receiving this call for the root (\) directory, but you return a security descriptor that disallows file/directory creation.

This is why in memfs-dotnet I use this default security descriptor for the root directory:

https://github.com/billziss-gh/winfsp/blob/master/tst/memfs-dotnet/Program.cs#L221-L233

O:BAG:BAD:P(A;;FA;;;SY)(A;;FA;;;BA)(A;;FA;;;WD)

O:BA                Owner is Administrators
G:BA                Group is Administrators
D:P                 DACL is protected
(A;;FA;;;SY)        Allow full file access to LocalSystem
(A;;FA;;;BA)        Allow full file access to Administrators
(A;;FA;;;WD)        Allow full file access to Everyone
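To see how the pieces of that string line up with the table above, here is a small sketch that pulls the ACEs out with plain string handling. `SddlSketch` is a hypothetical helper for illustration only; it does not validate the descriptor and is not how WinFsp or memfs-dotnet process it:

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class SddlSketch
{
    // Returns the contents of each parenthesized ACE, e.g. "A;;FA;;;SY".
    public static List<string> SplitAces(string sddl)
    {
        var aces = new List<string>();
        foreach (Match m in Regex.Matches(sddl, @"\(([^)]*)\)"))
            aces.Add(m.Groups[1].Value);
        return aces;
    }

    // An ACE has six ';'-separated fields:
    // type;flags;rights;object_guid;inherit_object_guid;account_sid
    public static string Describe(string ace)
    {
        string[] f = ace.Split(';');
        if (f.Length != 6)
            throw new FormatException("expected 6 ACE fields: " + ace);
        return $"type={f[0]} rights={f[2]} trustee={f[5]}";
    }
}
```

For the root descriptor above, `SplitAces` yields three entries, one each for SY, BA and WD.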

The specific rights tested during file creation are listed in this comment:

https://github.com/billziss-gh/winfsp/blob/master/src/dll/fsop.c#L164-L182

FrKaram commented 7 years ago

Maybe I was unclear (or maybe I do not understand your answer yet :-)): creating folders does work. I've been able to create a folder called "New Folder" (as I haven't implemented "rename", I cannot change its name yet). It's when I try to enter that folder that I do not see any call to any of the overridden methods in my code.

FrKaram commented 7 years ago

Hum, I see a GetSecurityByName on my "/New Folder" prior to entering the folder. I think I need more digging into all this

billziss-gh commented 7 years ago

Sorry I misunderstood.

In this case you will get a GetSecurityByName call on /New folder. Make sure you report that it is a directory (FILE_ATTRIBUTE_DIRECTORY) and return a valid security descriptor. You can just use the security descriptor I sent you above as a generic security descriptor that will work for most purposes (although it provides no real security).

FrKaram commented 7 years ago

Thanks, I found my mistake. I am returning STATUS_OBJECT_NAME_NOT_FOUND even though the folder exists. There must be something wrong when requesting the folder info from the remote system.

billziss-gh commented 7 years ago

@FrKaram I am glad that you have this figured out. Let me know if you face other issues.

FrKaram commented 7 years ago

Thanks Bill! That was a path conversion issue... Solved it! Directory browsing is working. Going forward till the next issue :-)

FrKaram commented 7 years ago

@billziss-gh Hi. I'm making progress on my file system. Even though it's going quite slowly due to other priorities. I'm currently working on the writing part based on what you did for mem-fs.

I would like to avoid writing bytes to the repository each time the Write method is called but instead store the modifications and apply them at once when all has been written.

So I'm in search of an event that I could watch and that would act as a trigger for "finished writing". I noticed that "Flush" is not called each time, so I cannot rely on that. Do you think I can rely on "Cleanup"? Will it be called at the end of the writing?

billziss-gh commented 7 years ago

@FrKaram Cleanup can be used for this purpose, but you must also be ready to do this on Close.

[The actual Flush operation is only called whenever an application calls FlushFileBuffers. Flush is equivalent to fsync in POSIX/FUSE.]

Cleanup is called whenever an application calls CloseHandle. This means that you will receive Cleanup for every Create / Open (unless you enable the PostCleanupWhenModifiedOnly optimization) and you can safely "flush" your file system buffers there. As an optimization you can do the flushing only when you get one of the SetAllocationSize, SetArchiveBit, SetLastWriteTime bits, because this means that the last HANDLE to your files is being closed (and the file was modified).

However there is a gotcha: memory-mapped files. The following is legal on Windows:

f = CreateFile(...);
m = CreateFileMapping(f, ...);
CloseHandle(f);         // (1)
v = MapViewOfFile(m);

// memory-mapped accesses on v

UnmapViewOfFile(v);
CloseHandle(m);         // (2)

At (1) you will receive a Cleanup, but as you can see the file is accessed indirectly through the memory mapping (i.e. you will get Read / Write calls after Cleanup). At (2) you will actually receive the Close for the file (assuming that the OS did not keep a reference to the file for its own purposes, which it may, especially in later versions of Windows).

For this reason I propose one of the following strategies:

  1. If your file system does not suffer from cache coherency issues then you should consider enabling the NTOS Cache Manager by setting FileInfoTimeout==-1. This can make your file system exhibit I/O performance similar to NTFS (once caches have been primed). This may also allow your file system to get away with no caching or minimal caching at the file system level (i.e. treat every Write as a flushing Write).

  2. If your file system cannot enable the Cache Manager, then you must flush at Cleanup and Close. Alternatively you can have a checkpoint thread that kicks in every few seconds and flushes the file system (this is what I do in secfs).
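For the second strategy, a minimal sketch of flushing at both Cleanup and Close, under stated assumptions: `CachedFile` is a hypothetical per-open-file object, the upload delegate stands in for whatever posts the complete file to the repository, and locking is left out for brevity:

```csharp
using System;

// Hypothetical per-open-file state kept by the file system.
public sealed class CachedFile
{
    private readonly Action<byte[]> _upload;
    private byte[] _data = new byte[0];
    private bool _dirty;

    public CachedFile(Action<byte[]> upload) { _upload = upload; }

    // Called from Write: change the in-memory copy only and mark it dirty.
    public void Write(long offset, byte[] buffer)
    {
        int start = checked((int)offset);
        int end = checked(start + buffer.Length);
        if (end > _data.Length)
            Array.Resize(ref _data, end);
        Buffer.BlockCopy(buffer, 0, _data, start, buffer.Length);
        _dirty = true;
    }

    // Called from both Cleanup and Close. It uploads only when something
    // changed since the last flush, so calling it twice is harmless, and a
    // write that arrives after Cleanup (memory-mapped I/O) is still picked
    // up at Close.
    public void FlushIfDirty()
    {
        if (!_dirty)
            return;
        _upload((byte[])_data.Clone());
        _dirty = false;
    }
}
```

A checkpoint thread could call the same `FlushIfDirty` method periodically, provided access to the object is synchronized.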

FrKaram commented 7 years ago

Thanks @billziss-gh, that's very useful 👍

I have another question :

For now, I cannot figure out how to make partial content update in my repository. So I need to post the complete file each time an update is to be made (that's why I would like to avoid doing it at each "Write" but instead "caching" the "Writes").

I guess that I cannot make the assumption that the file will entirely be "Read" before being "Written", can I? If so, I do have, when I want to flush the file, to "download" the file entirely, apply the changes and repost it to the repository. This does not seem optimised; do you have any advice?

Moreover, are you aware of a WinFSP implementation for FTP?

billziss-gh commented 7 years ago

I guess that I cannot make the assumption that the file will entirely be "Read" before being "Written", can I?

No, there is no such guarantee.

If so, I do have, when I want to flush the file, to "download" the file entirely, apply the changes and repost it to the repository.

I recommend reading this great writeup about the Andrew File System: http://pages.cs.wisc.edu/%7Eremzi/OSTEP/dist-afs.pdf

You may be able to adapt some of its ideas to your own file system.
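Since writes can arrive before any read, one option is to record them as (offset, data) pairs and merge them into the downloaded file only at flush time. The sketch below assumes a hypothetical `PendingWrites` helper and ignores truncation (file size changes), which a real file system would also have to track:

```csharp
using System;
using System.Collections.Generic;

// Records writes in arrival order without needing the file contents.
public sealed class PendingWrites
{
    private readonly List<KeyValuePair<int, byte[]>> _writes =
        new List<KeyValuePair<int, byte[]>>();

    // Copies the data so later changes to the caller's buffer have no effect.
    public void Add(int offset, byte[] data) =>
        _writes.Add(new KeyValuePair<int, byte[]>(offset, (byte[])data.Clone()));

    // Applies the recorded writes, oldest first, to the downloaded contents
    // and returns the new contents; later writes win where ranges overlap.
    public byte[] ApplyTo(byte[] original)
    {
        int length = original.Length;
        foreach (var w in _writes)
            length = Math.Max(length, w.Key + w.Value.Length);

        var result = new byte[length];
        Buffer.BlockCopy(original, 0, result, 0, original.Length);
        foreach (var w in _writes)
            Buffer.BlockCopy(w.Value, 0, result, w.Key, w.Value.Length);
        return result;
    }
}
```

The download then happens once per flush rather than once per Write, although the whole file still travels in both directions.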

Moreover, are you aware of a WinFSP implementation for FTP?

I have not seen any straightforward implementations of FTP over WinFsp. I believe however that rclone supports it and it was recently ported to Windows and WinFsp.

FrKaram commented 7 years ago

I recommend reading this great writeup about the Andrew File System: http://pages.cs.wisc.edu/%7Eremzi/OSTEP/dist-afs.pdf

You may be able to adapt some of its ideas to your own file system.

This is interesting and it's quite close to what I did a few years ago on a multi-master sync project. Still, they download the full file, work locally and flush to the server on Close. I think I need to complete the implementation and consider adding APIs on the repository to allow partial read/update.

FrKaram commented 7 years ago

One last question for now : are the API calls sequential or concurrent (ie do I need to implement thread-safety) ?

fsgeek commented 7 years ago

On Jun 2, 2017, at 1:14 AM, Francois Karam notifications@github.com wrote:

I recommend reading this great writeup about the Andrew File System: http://pages.cs.wisc.edu/%7Eremzi/OSTEP/dist-afs.pdf

You may be able to adapt some of its ideas to your own file system.

This is interesting and it's quite close to what I did a few years ago on a multi-master sync project. Still, they download the full file, work locally and flush to the server on Close. I think I need to complete the implementation and consider adding APIs on the repository to allow partial read/update.

AFS2 downloaded the full file. AFS3 downloads chunks of the file. The OpenAFS project is still out there (http://openafs.org) and demonstrates how they do their data management, though it isn't a FUSE file system on Windows (not sure on Linux but the code is there).

I worked on AFS3 many years ago, including on adding variable chunk size management, which is why I'm aware of how they did it.

DCE/DFS did the same thing but with fine grained strong consistency guarantees (local POSIX semantics on a network file system). That code belongs to the Open Group though.


FrKaram commented 7 years ago

AFS2 downloaded the full file. AFS3 downloads chunks of the file. The OpenAFS project is still out there (http://openafs.org) and demonstrates how they do their data management, though it isn't a FUSE file system on Windows (not sure on Linux but the code is there). I worked on AFS3 many years ago, including on adding variable chunk size management, which is why I'm aware of how they did it. DCE/DFS did the same thing but with fine grained strong consistency guarantees (local POSIX semantics on a network file system). That code belongs to the Open Group though.

Thanks for the info. I'll have a look at the updated spec.

billziss-gh commented 7 years ago

@FrKaram

One last question for now : are the API calls sequential or concurrent (ie do I need to implement thread-safety) ?

Since we are talking about the .NET API, this is controlled by the Mount parameter Synchronized:

https://github.com/billziss-gh/winfsp/blob/master/src/dotnet/FileSystemHost.cs#L246

        public Int32 Mount(String MountPoint,
            Byte[] SecurityDescriptor = null,
            Boolean Synchronized = false,
            UInt32 DebugLog = 0)

If you want your file system operations to be protected by an exclusive lock pass true for Synchronized.
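If you leave Synchronized at false, you would protect shared state yourself, assuming operations can then arrive on several threads at once. A minimal sketch with a hypothetical `FileTable` class (not part of the WinFsp API):

```csharp
using System;
using System.Collections.Generic;

// Hypothetical file table shared by all operations of a file system.
public sealed class FileTable
{
    private readonly object _gate = new object();
    private readonly Dictionary<string, int> _openCounts =
        new Dictionary<string, int>(StringComparer.OrdinalIgnoreCase);

    // Every access to the dictionary happens under the same lock.
    public void Open(string fileName)
    {
        lock (_gate)
        {
            _openCounts.TryGetValue(fileName, out int count);
            _openCounts[fileName] = count + 1;
        }
    }

    public int OpenCount(string fileName)
    {
        lock (_gate)
        {
            return _openCounts.TryGetValue(fileName, out int count) ? count : 0;
        }
    }
}
```

Finer-grained locks like this allow more concurrency than one exclusive lock around every operation, at the cost of more care in the implementation.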

billziss-gh commented 7 years ago

@fsgeek

AFS3 downloads chunks of the file.

Interesting. I was not aware of this. Are they closer to NFS semantics in AFS3 then or is that an optimization for specific cases?

fsgeek commented 7 years ago

Chunks permitted easier management of space. The original default was something like 64KB but we made it variable size (the original motivation was to allow memory based caching rather than disk based caching and back in 1989 64KB was a big chunk of memory!)

AFS3 semantics are different than NFS, in that writes were not through, but rather to the cache with asynchronous writeback. This was possible because AFS maintained the per-file callbacks (not so different than SMB oplocks in some ways).

billziss-gh commented 7 years ago

@fsgeek thank you.

billziss-gh commented 7 years ago

I will be releasing version 1.1 ("2017.1") some time this week. This version will include the new .NET layer as it currently stands. I am therefore closing this issue.

If you have any contradictory feedback please reopen this issue or open a new issue.