Closed lostmsu closed 1 year ago
Hi @lostmsu ,
are you able to reproduce with the native C mirror ? Usually when the mount disappears it means a userland request was taking too long to process a request (did not response) and the kernel force unmount the drive to avoid possibly hanging the system too. But in that case you should also see the mirror unmounting and not thinking it is still alive which is very weird.
Hi @Liryna , just tried with the native mirror, and it does not reproduce.
Interesting, are you able to attach a debugger and see the running threads just after the device unmount ? It would be great if it shows the hanging thread / request.
@Liryna it does not look like timeout. In the game log, the permission denied message appears slightly over 1 second after a message about successful initialization.
Unfortunately I don't have game source code, so I can not pinpoint the moment when the game hits the error.
Even funnier thing. Turns out when I close the game after the mount already disappeared from Explorer, the game is still able to write log file to Original
through Mirror
I am guessing via previously open handle.
I just thought to look into Event Viewer/System and found 2 entries around the time of disappearance from dokan2 saying:
Warning: no matching IRPs found for reply.
And then nothing until I initiated mirror shutdown with Ctrl+C, and there does not seem to be any errors or warnings.
FYI I originally experienced similar behavior with another IDokanOperations
implementation, but switched to reproducing in Mirror as that's what other issues requested.
Is there something I could do short of debugging the driver? Perhaps any guesses, like an unhandled exception?
By using Procmon you could see the status returned to the system for all the requests and find the one returned with a timeout. Then the next step would be to trace it on user land and see what it is happening
Still investigating, but heads up.
This is from native mirror (that works):
This is from dotnet one (that does not):
The insufficient resources is most likely the query that timeout on dotnet side. Should find out if it is received and why not processed/ returned with a status in time.
I am debugging it, and I see that the driver returns that error very quickly.
Here's what I did:
I opened the DokanNet.sln, and wrapped the contents of ZwCreateFileProxy
in Stopwatch
. E.g.
log("enter ZwCreateFileProxy");
var time = Stopwatch.StartNew();
try
{
... // the original try-catch block
}
finally
{
time.Stop();
logger.Debug("CreateFileProxy : {0} Time : {1}ms",
rawFileName, time.ElapsedMilliseconds);
}
Then run the game as described above. The path (directory) of interest is "currently-playing-background" as seen in the screenshot.
All the calls for that path get a response (e.g. finally block is called every time), the result is Success
, and the maximum time value I see for that directory is 17ms, and yet according to Procmon the call to create the directory fails with INSUFFICIENT RESOURCES like mentioned above.
Also, the directory in question as well as couple of others gets a call to SetDispositionInformationEx(FILE_DISPOSITION_POSIX_SEMANTICS | FILE_DISPOSITION_DELETE)
.
OK, I figured it out. The issue has nothing to do with the driver, hangs, or instability. Factorio actually deletes the mount directory for some reason when it encounters a certain error and that causes Windows to detach the virtual FS.
Now the error has to do with magic "." and ".." entries in FindFiles
that native mirror returns, and .NET mirror does not.
I feel like either .NET version needs to make a breaking change and add them automatically for non-root directories, or DokanOperationProxy
should check the list of returned files for presence of "." and ".." when path is not \
and if they don't appear, make some loud warning so that developers would see that their implementation is incorrect.
Looks like https://github.com/dokan-dev/dokan-dotnet/commit/7fa8ca4dd709d93155ac0915aefbd7977fbd413e got lost at some point so https://github.com/dokan-dev/dokany/issues/131 regressed.
Nice catch! It was moved to the native library to avoid having all the wrappers introducing the samelogic https://github.com/dokan-dev/dokany/blob/c4014d846c4092735afff21b1fd8eefc04d44ef8/dokan/directory.c#L465 Maybe the issue is in C# mirror that adds them for the root ? https://github.com/dokan-dev/dokany/blob/c4014d846c4092735afff21b1fd8eefc04d44ef8/samples/dokan_mirror/mirror.c#L824-L831
Does EnumerateFileSystemInfos
includes them ? If yes, we should remove them like native mirror does https://github.com/dokan-dev/dokan-dotnet/blob/137dbdd6162286b31f89eaa949fdcc2c5d82be26/sample/DokanNetMirror/Mirror.cs#L727
No, EnumerateFileSystemInfos
does not return the magics.
Is it possible that the native library bundled with .NET lib is old and does not include these items?
Another thought that I had after a quick skim is maybe that code path in the native library is not actually getting invoked in certain scenarios: it seems suspicious to me, that Procmon shows Result NO MORE FILES
and the only call to AddMissingCurrentAndParentFolder
is guarded by if (Status == STATUS_SUCCESS)
. Could also be a patterns thing, list order, or something weird along the lines.
AFAIK native mirror returns these magics because it uses FindFiles.
If the native code you mentioned would be working, then my explicit addition of .
and ..
would be a no-op, however it does resolve the issue with Factorio.
If the native code you mentioned would be working, then my explicit addition of . and .. would be a no-op, however it does resolve the issue with Factorio.
And thanks for finding the source of the issue and the solution!
It is true that the Win32 can return NO_MORE_FILES
but in our case as we don't have a buffer limit, we can return the whole listing without having to split them by batch. Therefore an implementation returning NO_MORE_FILES
is invalid for me and this is not the case as C# Mirror only return success.
https://github.com/dokan-dev/dokan-dotnet/blob/master/sample/DokanNetMirror/Mirror.cs#L408 https://github.com/dokan-dev/dokan-dotnet/blob/master/sample/DokanNetMirror/Mirror.cs#L747
I believe what is wrong is that the .
and ..
are append at the end of the listing which could confuse the app as they are normally always at front. (This can be reproduced with cmd.exe dir
, the .
..
are currently last).
https://github.com/dokan-dev/dokany/commit/b41d0226c8137dc4051c31e9f54323c59ff5bc82
Are you cappable to just rebuild the native library and use it in C# to confirm it fixed the problem ?
Otherwise I believe you can reproduce the issue with the native sample by changing
https://github.com/dokan-dev/dokany/blob/c4014d846c4092735afff21b1fd8eefc04d44ef8/samples/dokan_mirror/mirror.c#L825 to BOOLEAN rootFolder = TRUE;
BOOLEAN rootFolder = TRUE;
breaks it in native mirror, and https://github.com/dokan-dev/dokany/commit/b41d0226c8137dc4051c31e9f54323c59ff5bc82 then fixes it
Awesome, Thanks for the confirmation! 🏆 I believe this can be closed now and waiting for a next release. Let me know otherwise
TL;DR;
I launch DotNetMirror to mirror
Original
intoMirror
folder. It shows in Explorer as a mount, I can open it.Launch a 3rd party game with its data path set to a subfolder of
Mirror
folder.By the time the game finishes loading, the mount disappears together with the NTFS directory where it was mounted to. Later game as part of its functioning recreates the directory as a regular NTFS directory (I suspect it just sees mount point gone).
A potential clue from the game is
Repro steps
Original
anywhere on existing NTFS volume.Original
: config.txtconfig.ini
write-data=
to point to the full path toMirror\UserData
(might need to precreateOriginal\UserData
)DokanNetMirror
to mirrorOriginal
intoMirror
(I did NOT useunsafe
)Mirror
hasconfig.ini
withwrite-data=
pointing toC:\Path\To\Mirror\UserData
factorio.exe --config C:\Path\To\Mirror\config.ini
Mirror
directoryWhat happens
As mentioned above, at some point the mount just disappears, and the only clue is the error from the game console output.
The game will shortly recreate it as a regular directory, but the
Original
will no longer be updated.DokanNetMirror continues to work at this point, and no errors are displayed. Actually, there are no more logs from it from this point on. The log just stops until I Ctrl+C it at which point it prints
Success
and quits as if everything is OK.This one is from DokanNetMirror when I do above steps (instead of
UserData
I useFactorio
but that should not matter): mirror.log