SpartanJ / efsw

efsw is a C++ cross-platform file system watcher and notifier.
Other
662 stars 101 forks source link

Incorrect move events reported for simultaneous file moves across monitored subdirectories on Windows #178

Closed solarispika closed 2 months ago

solarispika commented 3 months ago

Summary

efsw incorrectly reports move events when multiple file move operations occur simultaneously across different subdirectories within the monitored folder.

Expected behavior

efsw should correctly report the source and destination of each move event, even when multiple moves occur simultaneously.

Actual behavior

efsw sometimes reports incorrect source directories for move events, mixing up the origins of moved files.

Environment

Steps to reproduce

  1. Create a folder structure:

    path\to\be\monitored
    ├── 1
    │   └── init (empty file)
    └── 2
    └── init (empty file)
  2. Open three PowerShell windows:

    • Window 1: Run the monitoring script (efsw-test.exe)
    • Window 2: Navigate to path\to\be\monitored\1 and run the renaming script
    • Window 3: Navigate to path\to\be\monitored\2 and run the renaming script
  3. Observe the output in Window 1 for incorrect move events

Monitoring script (Window 1)

efsw-test.exe path\to\be\monitored | ForEach-Object {
 if ($_ -match 'from file (\d+)\\.*?to (\d+)\\') {
     $num1 = $matches[1]
     $num2 = $matches[2]

     if ($num1 -ne $num2) {
         Write-Output $_
     }
 }
}

Renaming script (Windows 2 and 3)

# Save the initial file name
$initialFileName = "init"  # Replace with your actual file name

# Function to generate a random file name
function Get-RandomFileName {
    $randomName = -join ((65..90) + (97..122) | Get-Random -Count 10 | ForEach-Object {[char]$_})
    return $randomName + [IO.Path]::GetExtension($initialFileName)
}

# Main loop
while ($true) {
    # Check if the file exists
    if (Test-Path $initialFileName) {
        # Generate a new random file name
        $newFileName = Get-RandomFileName

        # Rename the file
        Rename-Item -Path $initialFileName -NewName $newFileName
        Write-Host "Renamed $initialFileName to $newFileName"

        # Update the initial file name for the next iteration
        $initialFileName = $newFileName

        # Wait for a short time before the next rename
        # Start-Sleep -Seconds 1
    }
    else {
        Write-Host "File not found: $initialFileName"
        break
    }
}

Example of incorrect output

Watch ID 1 DIR (path\to\be\monitored\) FILE (from file 1\sJzyruHKOS.txt to 2\rqESBgvQFf.txt) has event Moved
Watch ID 1 DIR (path\to\be\monitored\) FILE (from file 2\AOqJoDMPir.txt to 1\IdSnzClqXH.txt) has event Moved

Possible cause

Each rename operation is reported as two events, FILE_ACTION_RENAMED_OLD_NAME and FILE_ACTION_RENAMED_NEW_NAME. efsw saves filename to watch->OldFileName on handling FILE_ACTION_RENAMED_OLD_NAME and later uses it on handling FILE_ACTION_RENAMED_NEW_NAME to combine both events as a Actions::Moved event. However, multiple FILE_ACTION_RENAMED_OLD_NAME might occur successively, before their corresponding FILE_ACTION_RENAMED_NEW_NAME, thus one of remove operation's old name saved in watch->OldFileName is overwritten.

https://github.com/SpartanJ/efsw/blob/0c70ed2cd069a18d534989e8ae1c25ee39cba1f1/src/efsw/FileWatcherWin32.cpp#L162-L164 https://github.com/SpartanJ/efsw/blob/0c70ed2cd069a18d534989e8ae1c25ee39cba1f1/src/efsw/FileWatcherWin32.cpp#L168-L212

Additional notes

SpartanJ commented 3 months ago

Thank you very much for reporting the issue. This is interesting, I did not expect it to happen and the Windows documentation does not clarify about it, this was implemented assuming that the event FILE_ACTION_RENAMED_NEW_NAME comes always right after a FILE_ACTION_RENAMED_OLD_NAME event (we could have other events types in between though). This is only fixable if and only if Windows reports the events in FIFO, if this is not the case this is a kernel bug and is un-fixable, so I'll assume that this is the case an implement it as a FIFO queue.

solarispika commented 3 months ago

I find this article which might explain why such behavior can happen. Basically, the function is not designed for file monitoring but for file listing from the beginning.

The intended purpose of the Read­Directory­ChangesW function is to assist programs like Windows Explorer which display the contents of a directory. If something happens that results in a change to the directory listing, then it is reported by Read­Directory­ChangesW. In other words, Read­Directory­ChangesW tells you when the result of a Find­First­File/Find­Next­File loop changes. The intended usage pattern is doing a Find­First­File/Find­Next­File to collect all the directory entries, and then using the results from Read­Directory­ChangesW to update that collection incrementally.

solarispika commented 3 months ago

I managed to fix the issue by ReadDirectoryChangesExW since it provides file id for comparing. However, the function needs newer OS and it only supports NTFS, this makes it less useful for this project.

SpartanJ commented 3 months ago

Could you share the changes? It's possible to detect the OS version and the disk file system and use this particular more fine-grained version.

solarispika commented 3 months ago

Sure, here it is. poc-fix-rename-issue.patch The patch contains debug logs and some experimental code, so don't be serious on it :-)

To use the patch, some macro is needed to set sdk version. I use cmake for building, and the command line argument for configuration has -DCMAKE_CXX_FLAGS_INIT="/DWINVER=0x0A00 /D_WIN32_WINNT=0x0A00".

With this patch, I can't reproduce the problem.

SpartanJ commented 3 months ago

Ohh I see, very interesting, thanks for sharing it, I'll try to provide the feature at least optionally. But as you can see, all these file system monitoring APIs are very messy and you'll encounter things like the one you encountered quite easily. efsw already tries to do more than most libraries to get it as close to "right" as possible.

SpartanJ commented 2 months ago

I implemented something based on your fix. ReadDirectoryChangesExW will be detected dynamically and used if available.

solarispika commented 2 months ago

It looks like file systems which ReadDirectoryChangesExW doesn't support will fail to be watched on Windows versions that support the function. I think monitored filesystem should also be considered. Maybe we can try running ReadDirectoryChangesExW first and see if it fail or not.

SpartanJ commented 2 months ago

Good observation, now ReadDirectoryChangesW is used as fallback. Thanks