Open jetwhiz opened 7 years ago
This (or something very similar) also happens when running CrystalDiskMark on a CryFS filesystem. CryFS uses dokany-fuse. Steps to repro:
Mount a CryFS filesystem:
cryfs.exe basedir J: -f -o debug
(note that you need to use CryFS from this commit or any release after 0.10.2 (if released by the time you're trying to reproduce), otherwise you'll run into another issue that was CryFS's fault).
Run CrystalDiskMark on it. I used CrystalDiskMark 5.5.0 x64 with the settings: 1 iteration, 1GiB file size, on drive J:, then click the "4K" button in the last row to start the benchmark.
Observe the file system crash after the benchmark completed.
Further info:
-- -s
(which forwards the -s
to DokanY), i.e. running it in single threaded mode, then the error disappears. So it might be a race condition. When, however, running it without -s
but adding a global mutex to the CryFS code that ensures that no two CryFS file system operations can run at the same time, then the error still happens. So it seems to be a race condition in the DokanY code, not in the CryFS code.Log contents:
This is what happens at the time of crash. The lines containing [cryfs]
are logs from CryFS that are triggered at the start an end of each CryFS operation, i.e. right after DokanY calls into CryFS and right before it returns to DokanY.
###GetFileInfo 0018
GetFileInfo: : \CrystalDiskMark09B2EDEB\CrystalDiskMark09B2EE18.tmp
[2019-06-01 19:47:25.270] [cryfs] [debug] getattr("/CrystalDiskMark09B2EDEB/CrystalDiskMark09B2EE18.tmp", _, _)
[2019-06-01 19:47:25.271] [cryfs] [debug] getattr("/CrystalDiskMark09B2EDEB/CrystalDiskMark09B2EE18.tmp", _, _): success
result = 0
FileInternalInformation
DispatchQueryInformation result = 0
###Cleanup 0018
Cleanup: \CrystalDiskMark09B2EDEB\CrystalDiskMark09B2EE18.tmp
[2019-06-01 19:47:25.275] [cryfs] [warning] flush("/CrystalDiskMark09B2EDEB/CrystalDiskMark09B2EE18.tmp", _)
[2019-06-01 19:47:25.275] [cryfs] [warning] flush("/CrystalDiskMark09B2EDEB/CrystalDiskMark09B2EE18.tmp", _): success
[2019-06-01 19:47:25.276] [cryfs] [debug] release("/CrystalDiskMark09B2EDEB/CrystalDiskMark09B2EE18.tmp", _)
[2019-06-01 19:47:25.276] [cryfs] [debug] release("/CrystalDiskMark09B2EDEB/CrystalDiskMark09B2EE18.tmp", _): success
[2019-06-01 19:47:25.277] [cryfs] [debug] unlink("/CrystalDiskMark09B2EDEB/CrystalDiskMark09B2EE18.tmp")
###Close 0018
###Create 0019
CreateFile: \
DesiredAccess: SYNCHRONIZE|FILE_READ_DATA
Close: \CrystalDiskMark09B2EDEB\CrystalDiskMark09B2EE18.tmp
Stacktrace: This is the stacktrace: https://snag.gy/d5pinJ.jpg
Note that, a few frames up in the stack, the file info is invalid: https://snag.gy/cNUwuv.jpg
Environment
Check List
Description
I'm experiencing an intermittent issue when running the CrystalDiskMark benchmark software on encfs4win, where sometimes the write portion of the test will fail (and sometimes crash encfs4win). I think it might be a problem for the order that cleanup/close and createfile are happening.
It looks like the benchmark software does a Read -> Close -> Re-open -> Read, etc.
Failure:
Success:
Since the problem happens intermittently, that makes me think it is a race condition. You can see that when it fails, the order is "CleanUp -> CreateFile -> Close" and when it's a success the order is "CleanUp -> Close -> CreateFile".
I think the "Close" after the "CreateFile" is closing the new CreateFile (and not the old) since the second CreateFile is being called before "CleanUp -> Close" has completed. Do you think this is a possibility?
As you mentioned in the previous discussion (https://github.com/dokan-dev/dokany/issues/264#issuecomment-261724878), this could be related to the FUSE API race condition here: https://github.com/dokan-dev/dokany/blob/4f1d82fe80b574d81dabb59debcfbf1972e4e4b1/dokan_fuse/src/fusemain.cpp#L392