ianhays commented 8 years ago

There is a prototype for a Polling FileSystem Watcher in CoreFXLab that we should look into bringing to CoreFX as an alternative to our current implementation.

Some benefits of polling:

Works on all systems for network drives
Completely reliable - no event buffer overflow causing dropped/failed events
Avoids system resource limitations e.g. inotify max user instances
Consistency is generally achieved (or attempted) with the existing FSW by greatly increasing the size of the internal event buffer. The downside is that the buffer uses memory that won't be paged out, and its maximum size is 64 KB. Polling avoids this issue.

Some downsides:

Doesn't scale well to very large directories, e.g. an entire drive.
Generally considered to be less performant than the OS's watching APIs (though some factors affect this, such as buffer size, watched directory size, and polling frequency).

Bringing it to CoreFX

There would be a few ways to bring the Polling FSW into CoreFX that we can consider:

Bring it in its own library System.IO.FileSystem.Watcher.Polling that is completely separate from the existing FSW
Put it into System.IO.FileSystem.Watcher but have it be its own separate class
Merge the implementation into the existing FSW and provide a bool constructor overload (e.g. bool poll = false) that allows specification of which implementation to use
Merge the implementation into the existing FSW and try to programmatically determine which implementation to use. We could base our decision on the initial number of entries in the watched directory, for example.
Merge the implementation into the existing FSW and use both in unison. Use the OS's APIs but also poll every now and then to catch things that the OS's FSW may have missed.
Replace our existing implementation with a polling implementation
Examples of demand for a polling FSW:

@KrzysztofCwalina @stephentoub @sokket @joshfree @chcosta

stephentoub commented 8 years ago

Completely reliable - no event buffer overflow causing dropped/failed events

It may be more reliable in that sense, but it's less in others. For example, you're only going to be polling on some kind of interval, which means change notifications could be delayed for up to whatever that interval is, and changes that happen within that interval and are then reverted within that interval may go unnoticed.

Avoids system resource limitations e.g. inotify max user instances

It may avoid some system resource limitations, but it could stress others. For example, some changes to a file may not affect that file's last accessed/modified times, so for such things you need to store all of the relevant details in memory in order to compare on the next polling. Further, if you need to be able to detect changes that occur to the contents of the file itself, even if modification times aren't updated, then you may need to read through the entire file to know whether it's changed, leading to significantly more disk reads, network accesses, etc., plus more data stored in memory.

I'm in no way saying we shouldn't consider this, just pointing out that it's not as rosy as suggested.

jonmill commented 8 years ago

There are also tons of nuances in determining what kind of change occurred. For instance, we will have to be on the lookout for circular link references and A-B file renames (e.g. two files A.txt and B.txt, where the user renames A.txt to B.txt and B.txt to A.txt), and snapshotting the whole watch directory has its own set of problems.

We had to tackle these in OneDrive when I was over there, and there are lots of edge cases and pitfalls. We need to design this carefully if we decide to go this route.

KrzysztofCwalina commented 8 years ago

Yeah, it would be good to write down the guarantees and caveats of these two FSW alternatives.

As to the design discussion points, I would start with simply productizing the polling watcher without trying to unify them. I am not a big fan of the boolean/switch selecting the implementation. I think it would not really work well as the APIs are slightly different (for good reason), and also it would require changes to system.dll as opposed to being a standalone nuget package.

ianhays commented 8 years ago

The nice thing about the polling FSW is that we don't have to rely on the OS's FSW APIs so we can determine for ourselves which events are valid and which should be watched for and which should be filtered. The downside is that there are a ton of scenarios where we're not going to be able to effectively capture an event without a huge amount of overhead. By catching changes by comparing the current state to the previous state, we miss a ton of potential events. Some examples of things we'll miss if they occur between timer ticks:

Creating and deleting a file
Deleting and recreating a file
Moving a file to a different directory and creating a new file with that file's old name.
Appending a file then removing that appended data resulting in the size not being changed.
Writing to a file then setting the lastwritetime to the previous write time. Same goes for pretty much any attribute.

The list is pretty much infinite because the polling FSW only records the net change of the watch directory unlike the regular FSW. That factor alone is an enormous vote for making the polling FSW distinct from the existing FSW.
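The first item in the list can be reproduced with a few lines. This is a minimal sketch, assuming a poller that only compares directory listings taken at two ticks; `NetChangeDemo` and its members are hypothetical names, not part of the corefxlab prototype.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class NetChangeDemo
{
    // Snapshot: the set of file names currently present in the directory.
    public static HashSet<string> ListNames(string dir) =>
        new HashSet<string>(Directory.EnumerateFiles(dir).Select(p => Path.GetFileName(p)));

    // Names present in exactly one of the two snapshots.
    public static List<string> Diff(HashSet<string> before, HashSet<string> after) =>
        before.Except(after).Concat(after.Except(before)).ToList();

    // Creates and deletes a file between two "ticks" and returns what the poller sees.
    public static List<string> CreateThenDeleteBetweenPolls(string dir)
    {
        HashSet<string> before = ListNames(dir);          // tick N
        string temp = Path.Combine(dir, "short-lived.tmp");
        File.WriteAllText(temp, "data");                  // an OS watcher would raise Created
        File.Delete(temp);                                // an OS watcher would raise Deleted
        HashSet<string> after = ListNames(dir);           // tick N+1
        return Diff(before, after);                       // no net change is visible
    }
}
```

The returned list is empty because the two snapshots are identical, even though two file system events occurred between them.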

That said, we should still attempt to unify the usage API between the two as much as possible to:

Minimize cost of entry to people currently using the regular FSW that want to use the polling one
Enable test reuse between the two implementations
Reduce the amount of new API that we have to add
Make it easier to use both FSWs in unison (I've seen/heard lots of asks for this)

API

IMO we should have a shared event handling system as well as a shared filtering and notification system. This means that either FSW can be used to watch for WatcherChangeTypes.Changed with fsw.NotifyFilter = NotifyFilters.FileName.

This is what I'm thinking:

    public partial class *Watcher
    {
        // Shared
        public *Watcher() { }
        public *Watcher(string path) { }
        public *Watcher(string path, string filter) { }
        public bool EnableRaisingEvents { get { return default(bool); } set { } }
        public string Filter { get { return default(string); } set { } }
        public bool IncludeSubdirectories { get { return default(bool); } set { } }
        public System.IO.NotifyFilters NotifyFilter { get { return default(System.IO.NotifyFilters); } set { } }
        public string Path { get { return default(string); } set { } }
        public event System.IO.FileSystemEventHandler Changed { add { } remove { } }
        public event System.IO.FileSystemEventHandler Created { add { } remove { } }
        public event System.IO.FileSystemEventHandler Deleted { add { } remove { } }

        // Only in FileSystemWatcher
        public event System.IO.ErrorEventHandler Error { add { } remove { } }
        public event System.IO.RenamedEventHandler Renamed { add { } remove { } }
        protected void OnChanged(System.IO.FileSystemEventArgs e) { }
        protected void OnCreated(System.IO.FileSystemEventArgs e) { }
        protected void OnDeleted(System.IO.FileSystemEventArgs e) { }
        protected void OnError(System.IO.ErrorEventArgs e) { }
        protected void OnRenamed(System.IO.RenamedEventArgs e) { }
        public int InternalBufferSize { get { return default(int); } set { } }
        public System.IO.WaitForChangedResult WaitForChanged(System.IO.WatcherChangeTypes changeType) { return default(System.IO.WaitForChangedResult); }
        public System.IO.WaitForChangedResult WaitForChanged(System.IO.WatcherChangeTypes changeType, int timeout) { return default(System.IO.WaitForChangedResult); }

        // Only in PollingWatcher
        public PollingWatcher(int pollingInterval) { }
        public PollingWatcher(string path, int pollingInterval) { }
        public PollingWatcher(string path, string filter, int pollingInterval) { }
    }

Note that I left out the Renamed event. That's because distinguishing between a rename and a create&delete is rather difficult when we only check state at an interval and don't know for sure what happened during that interval. It's not impossible, but it might require that we store more information than we want to or that we do some guessing. Something to keep in mind, at least.

Also, allowing a Security NotifyFilter likely will be a large chunk of extra work. Probably not worth it at first.

Implementation

The base implementation is copied from corefxlab. It uses a timer with a callback function that walks the entire watch tree and calculates changes by comparing current state to previously stored state (via a custom hashtable). The hashtable maps directory name and file name to a FileState object. Objects in the tree without table entries are treated as new files. Entries in the table that aren't found in the tree are treated as deletions.
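The compare step described above can be sketched roughly as follows. This is not the corefxlab code: it assumes C# 10 or later, uses an ordinary `Dictionary` keyed by full path in place of the custom hashtable, tracks files only, and the names `PolledFileState` and `SnapshotDiff` are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical stand-in for the FileState object mentioned above.
public readonly record struct PolledFileState(long Length, DateTime LastWriteUtc);

public static class SnapshotDiff
{
    // Walks the whole tree once and records the state of every file.
    public static Dictionary<string, PolledFileState> Capture(string root)
    {
        var snapshot = new Dictionary<string, PolledFileState>(StringComparer.Ordinal);
        foreach (string path in Directory.EnumerateFiles(root, "*", SearchOption.AllDirectories))
        {
            var info = new FileInfo(path);
            snapshot[path] = new PolledFileState(info.Length, info.LastWriteTimeUtc);
        }
        return snapshot;
    }

    // Compares the stored snapshot with the current one and reports the net change.
    public static List<(WatcherChangeTypes Kind, string Path)> Compare(
        IReadOnlyDictionary<string, PolledFileState> previous,
        IReadOnlyDictionary<string, PolledFileState> current)
    {
        var changes = new List<(WatcherChangeTypes Kind, string Path)>();
        foreach (var entry in current)
        {
            if (!previous.TryGetValue(entry.Key, out PolledFileState old))
                changes.Add((WatcherChangeTypes.Created, entry.Key));   // in tree, not in table
            else if (old != entry.Value)
                changes.Add((WatcherChangeTypes.Changed, entry.Key));   // state differs
        }
        foreach (var entry in previous)
        {
            if (!current.ContainsKey(entry.Key))
                changes.Add((WatcherChangeTypes.Deleted, entry.Key));   // in table, not in tree
        }
        return changes;
    }
}
```

In a watcher built this way, a timer callback would call `Capture` on each tick, pass the result to `Compare` together with the snapshot from the previous tick, and then keep the new snapshot for the next tick.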

The Windows-specific part uses the win32 functions FindFirstFile, FindNextFile, and FindClose to iterate directory entries. The Unix-specific implementation is similar to file enumeration in System.IO.FileSystem. It uses OpenDir, ReadDir, and stat.

What do we keep track of?

Ideally we would only keep track of the data that the NotifyFilters care about. The FileState object would therefore be variably sized based on the chosen NotifyFilters so we wouldn't have to store any unnecessary data.

Example 1:

watcher.NotifyFilter = NotifyFilters.FileName | NotifyFilters.Size;

would make a FileState look like this: [ string FileName; string Directory; bool isDir; long FileSize]

Example 2:

watcher.NotifyFilter = NotifyFilters.FileName | NotifyFilters.LastWrite | NotifyFilters.LastAccess;

would make a FileState look like this: [ string FileName; string Directory; bool isDir; long LastWrite; long LastAccess]

Thoughts:

We could probably use System.Buffers here.
Changing the NotifyFilters would require a complete reread of the tree.
A FileChange object would have to be dynamically allocated.
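One way to picture a variably sized FileState is a plain array that holds only the numeric fields selected by the filter. This is a hedged sketch rather than a proposed design: `CompactFileState` is a hypothetical name, it handles files only, and it leaves out the name, directory, and isDir fields that every entry would carry regardless of the filter.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class CompactFileState
{
    // Returns only the fields the filter asks for, always in the same order,
    // so two captures taken with the same filter can be compared element by element.
    public static long[] Capture(FileInfo info, NotifyFilters filter)
    {
        var fields = new List<long>();
        if (filter.HasFlag(NotifyFilters.Size)) fields.Add(info.Length);
        if (filter.HasFlag(NotifyFilters.LastWrite)) fields.Add(info.LastWriteTimeUtc.Ticks);
        if (filter.HasFlag(NotifyFilters.LastAccess)) fields.Add(info.LastAccessTimeUtc.Ticks);
        if (filter.HasFlag(NotifyFilters.CreationTime)) fields.Add(info.CreationTimeUtc.Ticks);
        return fields.ToArray();
    }

    // Only meaningful when both arrays were captured with the same filter,
    // which is why changing the filter forces a complete reread of the tree.
    public static bool HasChanged(long[] previous, long[] current) =>
        !previous.SequenceEqual(current);
}
```

With the filter from Example 1 the array holds one value (the size), and with the filter from Example 2 it holds two (last write and last access).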

Platform differences

Functionality will be closely similar between platforms. NotifyFilters are the most difficult thing to get similar behavior for:

    NotifyFilter  | Unix stat item to watch          | win32 FIND_DATA item to watch
    ------------- | -------------------------------- | ------------------------------------------
    Attributes    | Interop.Sys.FileStatus.CTime     | WIN32_FIND_DATA.dwFileAttributes
    CreationTime  | Interop.Sys.FileStatus.BirthTime | WIN32_FIND_DATA.ftCreationTime
    DirectoryName | readdir                          | FindNextFile
    FileName      | readdir                          | FindNextFile
    LastAccess    | Interop.Sys.FileStatus.ATime     | WIN32_FIND_DATA.ftLastAccessTime
    LastWrite     | Interop.Sys.FileStatus.MTime     | WIN32_FIND_DATA.ftLastWriteTime
    Security      | n/a                              | n/a
    Size          | Interop.Sys.FileStatus.Size      | WIN32_FIND_DATA.nFileSizeHigh/nFileSizeLow

The loosest mapping above is Unix CTime, which will change in a bunch of scenarios that aren't attribute changes.

Main contention points

A file by any other name would be a creation

What do we do for renames? Do we treat a rename as a creation&deletion or do we try to be clever and check for similarities between files and guess if they're the same file moved to a different place? How would that work if the file was also changed?

My vote: treat them as a creation&deletion
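For comparison, the "be clever" option could look something like the sketch below. It assumes C# 10 or later and that the only evidence available is the size and last-write time stored for each entry; `PolledEntry` and `RenameGuesser` are hypothetical names. The match is a guess: two unrelated files with the same size and timestamp would be paired, and a file that was renamed and also modified would not.

```csharp
using System;
using System.Collections.Generic;

public readonly record struct PolledEntry(string Path, long Length, DateTime LastWriteUtc);

public static class RenameGuesser
{
    // Pairs each deleted entry with a created entry that has the same size and
    // last-write time. Paired entries are removed from both input lists, so whatever
    // remains in them is still reported as a plain deletion or creation.
    public static List<(string OldPath, string NewPath)> Guess(
        List<PolledEntry> deleted, List<PolledEntry> created)
    {
        var renames = new List<(string OldPath, string NewPath)>();
        foreach (PolledEntry gone in deleted.ToArray())   // iterate a copy; the list is mutated below
        {
            int match = created.FindIndex(c =>
                c.Length == gone.Length && c.LastWriteUtc == gone.LastWriteUtc);
            if (match < 0) continue;                      // no candidate: stays a deletion
            renames.Add((gone.Path, created[match].Path));
            created.RemoveAt(match);
            deleted.Remove(gone);
        }
        return renames;
    }
}
```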

File change is a hoax

To what extent do we attempt to determine if a change has occurred? Are the underlying functions (stat and FindNextFile) adequate and trustworthy enough, or do we need to do something special (like store an entire file's contents) to detect a change?

My vote: rely only upon the underlying APIs' info.

To be or not to be like FileSystemWatcher

How much should we really be mirroring the API of FileSystemWatcher? Do we really want to constrain ourselves to its NotifyFilters and FileSystemEventHandlers? How much trouble is that really saving devs wanting to switch from FSW->POLLFSW?

My vote: Be as much like FileSystemWatcher as possible. Share NotifyFilters and most API.

I've got a rough first draft of the CoreFX port here. I am by no means married to anything in that draft, but merely wanted to ensure its feasibility before opening up further discussion.

TL;DR: Add a distinct library for System.IO.FileSystem.Watcher.Polling but keep its API as close to System.IO.FileSystem.Watcher as possible.

KrzysztofCwalina commented 8 years ago

Thanks for the great write-up!

The FSW APIs are less efficient in many scenarios: a) they raise events per change, as opposed to one event for many changes, b) they allocate event args per event, and possibly other such inefficiencies.

And so I would really like to keep the existing corefxlab APIs as the low-level API optimized for efficiency. Then on top of the efficient API we could build an FSW emulation layer.

Also, I would like to minimize the dependencies of this new polling watcher, especially on the old FSW APIs. I looked at your project.json file and the set of dependencies is much larger than in the corefxlab prototype.

I agree about the rename being exposed as deletion/creation. Unless we can get the mapping to rename to be 100% reliable, it just causes more problems than it solves.

jonmill commented 8 years ago

The Windows-specific part uses the win32 functions FindFirstFile, FindNextFile, and FindClose to iterate directory entries. The Unix-specific implementation is similar to file enumeration in System.IO.FileSystem. It uses OpenDir, ReadDir, and stat.

The performance implications of this are huge; walking large directory structures is a non-trivial task that can take many, many minutes.

watcher.NotifyFilter = NotifyFilters.FileName | NotifyFilters.LastWrite | NotifyFilters.LastAccess;

Some applications do not change the Last Write or Last Access times, meaning we will miss the file change. Just something to be aware of.

My vote: treat them as a creation&deletion

This has huge implications for customers; consumers of File Watching usually use rename events to simply update bookkeeping information. However, creation and deletion events can be expensive since this can cause tons of work to happen (think hashing a several GB file since it has been determined that the file is 'new'). If we make a FSW it MUST have rename support (IMO).

Are the underlying functions (stat and FindNextFile) adequate and trustworthy enough, or do we need to do something special (like store an entire file's contents) to detect a change?

This is getting into the difficult area of file watching; every application handles things differently. Last Write and Last Access time (and even size, in some cases) can all be controlled by the application. The only way to really tell whether a file's contents have changed is to crack the file and hash the contents when we THINK the file has changed.
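As a rough illustration of that last step, the sketch below hashes a file's contents with SHA-256 so that two polls can be compared even when size and timestamps are identical. `ContentHash` is a hypothetical helper, the choice of SHA-256 is an assumption, and `Convert.ToHexString` requires .NET 5 or later.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;

public static class ContentHash
{
    // Reads the entire file, so the cost grows with file size.
    public static string Compute(string path)
    {
        using SHA256 sha = SHA256.Create();
        using FileStream stream = File.OpenRead(path);
        return Convert.ToHexString(sha.ComputeHash(stream));
    }

    // True when the contents differ from the hash remembered at the previous poll.
    public static bool ContentsChanged(string path, string previousHash) =>
        !string.Equals(Compute(path), previousHash, StringComparison.Ordinal);
}
```

Each call reads every byte of the file, which is the disk, network, and CPU overhead described above, so it is best reserved for files that some cheaper check has already flagged.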

We had to deal with a TON of weird File System oddities over in OneDrive; watching for file and directory changes is a very tricky business and very difficult to get right. There is significant overhead (wall clock time, CPU time, memory) with doing this correctly; there is significant overhead for consumers of this API if we do not do it correctly (extra work, incorrect notifications leading to data loss, etc). I'd suggest sitting down to have a discussion about this at some point. Just my 2 cents :)

karelz commented 8 years ago

Next steps: We need an implementation proposal which validates the API surface proposed above.

darxis commented 7 years ago

Another benefit of polling is that it is not dependent on disk caching policies in the OS. I came across this problem when using the default FSW to watch a log file on Windows 10 platform. I wanted to watch a log file for new log entries to read and process them "real-time".

FSW on Windows depends on the WinAPI ReadDirectoryChangesW function. From the MSDN docs:

The operating system detects a change to the last write-time only when the file is written to the disk. The operating system detects a change in file size only when the file is written to the disk.

https://social.msdn.microsoft.com/Forums/vstudio/en-US/04cbc049-b029-401f-ad3f-3ff5a2e29dce/filesystemwatcherchanged-behavior-from-2003-r2-to-2008-r2-clients-also?forum=netfxbcl

Apparently, since Windows Server 2008 and Windows 7, disk caching policies were changed and are more aggressive, writing to disk more rarely.

So the Changed event does not fire at all until the file is written to the disk. You can work around this by manually reading the file's content or attributes, because the current Windows implementation flushes the file to disk in that case (e.g. refreshing the directory content in explorer.exe causes it to read the file length, which flushes the buffer cache to disk; alternatively, run var len = new FileInfo(filePath).Length in a loop).
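The loop mentioned above could be fleshed out along these lines for the log-tailing case. This is a sketch under assumptions: `LogTail` is a hypothetical helper, the log is append-only UTF-8 text, and a caller invokes `ReadNew` on its own timer. It does not handle a multi-byte character that is split across two reads.

```csharp
using System;
using System.IO;

public static class LogTail
{
    // Reads whatever was appended since `offset`, passes it to `onText`,
    // and returns the offset to use for the next poll.
    public static long ReadNew(string path, long offset, Action<string> onText)
    {
        long length = new FileInfo(path).Length;   // the length check from the workaround above
        if (length < offset) offset = 0;           // file was truncated or replaced: start over
        if (length == offset) return offset;       // nothing new since the last poll

        // Share read/write/delete access so the process writing the log is not blocked.
        using var stream = new FileStream(path, FileMode.Open, FileAccess.Read,
                                          FileShare.ReadWrite | FileShare.Delete);
        stream.Seek(offset, SeekOrigin.Begin);
        using var reader = new StreamReader(stream);
        onText(reader.ReadToEnd());
        return stream.Position;
    }
}
```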

UnxUtils (https://sourceforge.net/projects/unxutils/?source=navbar) is also a good place to have a look at how polling could be implemented. It has the source of the Linux tail tool ported to Win32 written in C.

AnthonyMastrean commented 6 years ago

Our team is running ASP.NET Core services in Linux Docker containers, but we're developing on Windows (via Docker for Windows). We just started exploring reloadable IConfiguration sources, for binding a "global" config file from the host into the containers, and were very stumped by the reload behavior (it wasn't reloading until we restarted the services).

Then we stumbled on this issue! It kinda makes sense; SMB and filesystem events have been a long-running issue, even outside of .NET Core.

However, one of the outcomes of this current state is that we'll never be able to run ASP.NET Core services with reloadable configuration sources in Linux containers on a Windows server, right?!

HamedFathi commented 5 years ago

Any progress on this ?!?!

karelz commented 5 years ago

@HamedFathi as you can see from the history, there is no progress ...

milos12345 commented 5 years ago

How about using NTFS USN Journal for this? It won't work for network monitoring, but it seems like a more reliable way for local disks and much better performance than walking the directory tree

lloydjatkinson commented 2 years ago

Now that .NET 6 is out, will there be any more time allocated to this? It's quite badly needed.

dotnet / runtime

FileSystem Watcher: Consider polling API #17111