dokan-dev / dokany

User mode file system library for Windows with FUSE Wrapper
http://dokan-dev.github.io

"dotnet build" fails to run reliably when run from within a Dokany filesystem #1125

Closed hach-que closed 7 months ago

hach-que commented 1 year ago

Description

I have been trying to write a filesystem for the past week or so, where the use case involves running "dotnet build" from within the filesystem; that is, dotnet.exe, the .NET SDK and the project to build are all within the Dokany filesystem. Unfortunately, over the course of doing this I started encountering subtle bugs where the loaded .NET assemblies would be corrupt or otherwise invalid. I then tried porting my efforts to WinFSP with no luck.

To isolate the issue, I started putting together a test suite that would test virtual filesystems using their official samples. Both Dokany and WinFSP fail in different ways; Dokany's failure case seems to manifest with incorrect data reads, but typically only the first time the data is read.
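As a rough illustration of the "wrong data on the first read" symptom described above, the check could be sketched as follows. This is not code from the linked test suite; the function names `sha256_of` and `check_repeated_reads` are hypothetical. It hashes a file on the mounted filesystem several times and compares each result against a known-good copy stored on a regular disk.

```python
import hashlib
from pathlib import Path
from typing import List


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file by streaming it in chunks through ordinary read calls."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_repeated_reads(mounted: Path, reference: Path, attempts: int = 3) -> List[bool]:
    """Return one flag per attempt: True if that read matched the reference copy.

    `mounted` is assumed to live on the virtual filesystem under test and
    `reference` on a regular disk (hypothetical setup, not from the thread).
    """
    expected = sha256_of(reference)
    return [sha256_of(mounted) == expected for _ in range(attempts)]
```

A result where only the first flag is False would match the behaviour reported here. Later reads may be served from the operating system's cache rather than the filesystem driver, so a passing second attempt does not by itself show that the driver returned correct data.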

The test suite I have written is available at https://src.redpoint.games/redpointgames/msbuild-fstest and can be used to reproduce the issues. I've been able to reproduce the issues in a Windows 11 VM (so they're not environment specific), using both the C++ and .NET mirror samples. If you need instructions on how to set up a VM to reproduce the issues, see here (though you will want to install the Dokany driver instead of the WinFSP driver).

The latest results of the test suite are available online here: https://msbuild-fstest.redpointgames.dev/

Liryna commented 1 year ago

Hi @hach-que,

"Dokany's failure case seems to manifest with incorrect data reads, but typically only the first time the data is read."

How have you been able to determine it was a read issue? Running procmon when it fails could help narrow down the faulty workflow (along with the sample logs).

Is it also flaky with the memfs sample? (So far I got 10/10 with the mirror tests.) If your VM has a small number of threads, does allocating more improve the situation?

hach-que commented 1 year ago

@Liryna I can replicate the issue on my own Windows machine, 4 build servers and a fresh virtual machine (the latter to eliminate any environment issues). The reason I suspect it's a read issue is the type of errors that dotnet build is emitting to the console - things like the loaded assembly is invalid, or that the executing program performed an invalid instruction (if the executable code loaded into memory from reading the file via Dokan wasn't intact, this is how it could manifest).

I wasn't able to get the memfs sample working at all - I simply get [2022-12-12 04:01:48.058] [error] dokan_memfs failure: Driver something wrong.

(Of note, #1110 looks rather suspicious; I'm not sure of the specifics with .NET loading assemblies into memory, but if it does use memory-mapped I/O, then the subtle corruption in #1110 would likely exhibit the behaviours I'm seeing)

hach-que commented 1 year ago

This comment (https://github.com/dotnet/runtime/issues/62391#issuecomment-986834876) suggests that the .NET runtime does use memory-mapped I/O for assemblies. Fixing #1110 first might be the best course of action.
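To check whether memory-mapped reads and ordinary reads disagree for a given file, a comparison along these lines might help. It is only a sketch: `mapped_bytes` and `first_mismatch` are hypothetical names, and Python's `mmap` module creates a plain read-only data mapping, which may not be the same kind of mapping the .NET runtime uses when loading assemblies.

```python
import mmap
from pathlib import Path
from typing import Optional


def mapped_bytes(path: Path) -> bytes:
    """Read the whole file through a read-only memory mapping."""
    if path.stat().st_size == 0:
        return b""  # mmap cannot map an empty file
    with open(path, "rb") as handle:
        with mmap.mmap(handle.fileno(), 0, access=mmap.ACCESS_READ) as view:
            return view[:]


def first_mismatch(path: Path) -> Optional[int]:
    """Return the first offset where mapped and plain reads differ, or None if equal."""
    mapped = mapped_bytes(path)
    plain = path.read_bytes()
    if mapped == plain:
        return None
    for offset, (a, b) in enumerate(zip(mapped, plain)):
        if a != b:
            return offset
    # Contents agree up to the shorter length, so the lengths must differ.
    return min(len(mapped), len(plain))
```

Running `first_mismatch` over the SDK assemblies on the mounted drive, ideally right after mounting so that nothing is cached yet, would show whether the two read paths return different bytes and at which offset they first diverge.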

boAtLimbic commented 1 year ago

@hach-que I'm also experiencing a similar issue when compiling over NFS. I see you are also experiencing issues with WinFSP, which I have yet to probe.

I'm trying to compile over NFS on centralised storage with the Unreal Engine. I would like to collaborate, if you're interested.

Vinny636 commented 7 months ago

It's interesting that the .NET filesystems work but the native C++ ones do not. I wonder if the additional latency involved with marshalling between native and .NET is allowing your tests to pass?

Sorry that I'm commenting on an issue that's more than a year old, but I'm embarking on a filesystem project as well and I'm curious about failures when dealing with memory-mapped files.

hach-que commented 7 months ago

I ended up moving over to WinFSP, which has been far more reliable for writing Windows filesystems in user space. A similar issue that affected WinFSP was fixed within a few months with a pretty good response from the developer.

I'm going to close this issue as it doesn't look like it's going to be actioned in Dokany and I don't plan on switching away from WinFSP.

Liryna commented 7 months ago

Glad you were able to find a solution that works for you! Dokan is reliable, as shown by the projects and enterprises that have been using it for years at large scale. This seems to be a niche issue that had low priority on my TODO list, sorry. If anyone would like to take a look, I would be happy to give a hand.