xoofx / zio

A cross-platform abstract/virtual filesystem framework with many built-ins filesystems for .NET
BSD 2-Clause "Simplified" License
817 stars 61 forks source link

IFileSystemWatcher design #6

Closed Rohansi closed 6 years ago

Rohansi commented 6 years ago

I need this functionality and would like to try implementing it. Did you have any additional plans for this @xoofx?

xoofx commented 6 years ago

That's welcome!

I remember that I only made a sketch up only an interface (which I simply copied from FileSystemWatcher)

public interface IFileSystemWatcher : IDisposable
{
    event EventHandler<FileSystemEventArgs> Changed;
    event EventHandler<FileSystemEventArgs> Created;
    event EventHandler<FileSystemEventArgs> Deleted;
    event ErrorEventHandler Error;
    event RenamedEventHandler Renamed;
    IFileSystem FileSystem { get; }
    UPath Path { get; set; }
    NotifyFilters NotifyFilter { get; set; }
    bool EnableRaisingEvents { get; set; }
    string Filter { get; set; }
    bool IncludeSubdirectories { get; set; }
    int InternalBufferSize { get; set; }
    void BeginInit();
    void EndInit();
    WaitForChangedResult WaitForChanged(WatcherChangeTypes changeType, int timeout = -1);
}

and the interface should be returned by the IFileSystem with some methods like

IFileSystemWatcher Watch(UPath path)

Assuming that you would not be able to change the watch path once a Watcher is created.... but I was not settled there, as the original implementation allows you to change the path. So Maybe a method Create() that returns a IFileSystemWatcher might be better. After there is a bit of thinking to make sure that it is not too complicate when you want to support all the types of IFileSystem (aggregates...etc.), so maybe the immutable path is better... (specially because you have to track carefully subscriber with aggregates...etc.)

So I will be happy to assist you there, since I have no time to work on this right now. You can open PR once you have experienced a bit a design and we can follow this from there.

Rohansi commented 6 years ago

Should we replace the types that FileSystemWatcher uses for events too? The names should at least use UPath but what about the enums?

We shouldn't need BeginInit and EndInit because those are related to WinForms. I'm not so sure about InternalBufferSize and WaitForChanged either.

Rohansi commented 6 years ago

What should happen when an aggregate filesystem (or similar) is modified after starting to watch? For example if we change the filesystems it's aggregating after Watch those changes would need to apply to any existing watchers too. What about when it gets disposed? Should that disable any existing watchers?

xoofx commented 6 years ago

Should we replace the types that FileSystemWatcher uses for events too? The names should at least use UPath but what about the enums?

If we only have an enum dependency, better use our own enum, remap the underlying one and avoid a dependency on the public surface API

What should happen when an aggregate filesystem (or similar) is modified after starting to watch? For example if we change the filesystems it's aggregating after Watch those changes would need to apply to any existing watchers too. What about when it gets disposed? Should that disable any existing watchers?

Good questions. If we could replicate changes of a filesystem removed from the source of an aggregate, that would be better. You can iterate a bit on this and see how much it complicates the design. I don't want to introduce too much hurdle with this. So if it is introducing a complex sync machinery between the filesystems, we will have to really decide if all filesystem watchers are source change resilient...etc.

Otherwise, I would not protect against dispose. It is assumed that it is a usage error and we should not define an automatic recovery behavior for this.

Rohansi commented 6 years ago

When generating the Changed event for MemoryFileSystem files on writes what would be the proper way to get the path for that file? I need to go through the directory tree to root to construct the path (which we could cache) but I'm not sure about the proper locking to do this.

I also need to consider what users can do in the event handlers because their actions can take more locks and potentially cause a deadlock. Or I could dispatch it to the threadpool which would avoid that but bring in a dependency.

xoofx commented 6 years ago

When generating the Changed event for MemoryFileSystem files on writes what would be the proper way to get the path for that file? I need to go through the directory tree to root to construct the path (which we could cache) but I'm not sure about the proper locking to do this.

Go through FileSystemNode.Parent... the parent resolution should happen quite deeply (when changing the content...etc.), so the file should be already locked, so the parent can't change actually.

I also need to consider what users can do in the event handlers because their actions can take more locks and potentially cause a deadlock. Or I could dispatch it to the threadpool which would avoid that but bring in a dependency.

Yep, I thought about that when prototyping the idea. I was quite inclined to create a thread per watcher. Indeed the potential locks are very important so we need to avoid them. A change dispatch from an internal of MemoryFileSystem for example should go through a queue sending this to another thread and returning immediately.

If you can come up with something using a threadpool, why not, but I was just worried that it would require a contextual setup that could be annoying to expose (shared pool between multiple fs, you need to pass it...etc.)

Rohansi commented 6 years ago

I meant System.Threading.ThreadPool, sorry about that.

xoofx commented 6 years ago

Yep, you can try with System.Threading.ThreadPool. I'm not always inclined to use a global thread pool when you don't control much the lifespan of the threads, but let's assume that watcher listeners are short (Not sure how this is handled in .NET actually for the real FileSystemWatcher, I know that they are using their own threads but never looked into the details...)

moolicc commented 6 years ago

Why not implement an IFileSystem ('parent') that plays host to another IFileSystem ('child')? Whenever an operation happens on the parent system, it invokes necessary events and calls the same operation on its child.

xoofx commented 6 years ago

Why not implement an IFileSystem ('parent') that plays host to another IFileSystem ('child')? Whenever an operation happens on the parent system, it invokes necessary events and calls the same operation on its child.

Not sure to follow exactly, but this scenario is already implemented in Zio (through all composite filesystems: aggregate, readonly, mount...etc.), in that case "events" (method calls) are pushed/proxied from parent to child. For the watcher, it requires the reverse, events are coming from child to parent (e.g PhysicalFileSystem watcher where events are directly coming from the disk)

moolicc commented 6 years ago

Well, what I propose is something similar to these systems that you already have implemented. Except after the call is pushed to the child, you would raise an event. Like so:

class FileSystemWatcher : IFileSystem
{
    public event Action<UPath> OnDirectoryCreated;
    public IFileSystem Target { get; private set; }

    public void CreateDirectory(UPath path)
    {
        Target.CreateDirectory(path);
        OnDirectoryCreated?.Invoke(path);
    }
}

Probably just an oversimplified idea I had. It would likely be awkward to use.

xoofx commented 6 years ago

Sure, you can develop a pseudo FileSystemWatcher like this, but you would have actually to bring this into the IFileSystem API to make it relevant.

Aggregate file systems have to replicate changes from nested filesystems, so the Watcher is not something you can just put as a proxy, but is filesystem dependent, some filesystems may have events coming from external source....etc.

Typically, a PhysicalFileSystemWatcher cannot subscribe generically to a OnDirectoryCreated because the underlying FS doesn't give you this (across volumes...etc.), so you would have to subscribe to all volumes, listen to all directory changes...etc. That doesn't match what is available through the existing System.IO API. That's why the proposal is to provide a method on IFileSystem similar to this:

IFileSystemWatcher Watch(UPath path)

This makes also the design closer to how System.IO.FileSystemWatcher is actually designed. You subscribe to a path (with also some filtering options) and the watcher is optimized for this scenario.

Rohansi commented 6 years ago

I haven't considered the case where we watch the special directory on PhysicalFileSystem. Is that something we would want to support? So we could watch all volumes by watching / or /mnt/?

I tried out System.Threading.ThreadPool but forgot we should maintain the order of events so I am going to use a dedicated thread (per filesystem, as required) instead.

Back to getting the path for files when changing content. Accessing Parent.Children fails an assert because the directory node isn't locked. This makes sense because it's being triggered on a write to the file. How should I be locking in order to access the parent chain?

xoofx commented 6 years ago

I haven't considered the case where we watch the special directory on PhysicalFileSystem. Is that something we would want to support? So we could watch all volumes by watching / or /mnt/?

Let's not support it for now.

I tried out System.Threading.ThreadPool but forgot we should maintain the order of events so I am going to use a dedicated thread (per filesystem, as required) instead.

indeed

Back to getting the path for files when changing content. Accessing Parent.Children fails an assert because the directory node isn't locked. This makes sense because it's being triggered on a write to the file. How should I be locking in order to access the parent chain?

so, something that was not done is that the FileSystemNode doesn't bring a Name and that should be added so that you can easily rebuild a path by going through only the Parent property which should be safe to use. You will have a couple of place in MemoryFileSystem to update, but that should hopefully cover the use case for the watcher.

xoofx commented 6 years ago

Fixed thanks to PR #9