e-dant / watcher

Filesystem watcher. Works anywhere. Simple, efficient and friendly.
MIT License
633 stars 32 forks source link

Windows - Loading a file gets counted as a "modify" #43

Closed iwubcode closed 5 days ago

iwubcode commented 2 weeks ago

First of all, thank you for this library!

I am seeing something odd that I didn't expect. When watching a directory with a bunch of files, when a load event is occurring on any of those files, a "modify" seems to be sent down in Windows 11. This is surprising, as the file itself isn't modified (though I do correctly see modifies when one occurs!).

Upon further reading online, it seems like the value FILE_NOTIFY_CHANGE_LAST_ACCESS is changing which is being treated as a modify. Is there anything that can be done? I will see if I can reproduce in a small test to confirm.

iwubcode commented 2 weeks ago

It's particularly bad in my case because I'm trying to use the modify event to know when to reload a file. If the last access time were to continue to get changed, I could be in an infinite loop. I'm not seeing that and I'm not sure why yet. I admit that watcher is really correct here but not sure if more information could be provided so I could ignore these events?

e-dant commented 2 weeks ago

That's a tough one.

You're right that our implementation is responding to FILE_ACTION_MODIFIED when "listening" for a bunch of different kinds of events.

One option is to provide a way for the user to configure which "event sources" we care about.

That way, people who don't want access changes to be reported (or, confusingly reported, or reported in a mysterious context) can be better served by the watcher.

What would the API look like? We could pass flags down during the watcher's construction, that's fine and natural, but the flags don't necessarily commute meaning across operating systems.

Maybe we could ifdef out various flags such that, say, watch(patch, cb, watcher::ignore_events_from_file_access or something) doesn't compile on anything non-windows. I try to avoid different APIs for different platforms, but is this a good exception?

What should the defaults be? On one hand, this is definitely not an intuitive event. On the other hand, it's just what the operating system is telling us.

e-dant commented 2 weeks ago

For the short term, I encourage you to hack on this repo and make changes as needed. Either for your particular application or to push changes back.

iwubcode commented 2 weeks ago

Thanks for your support @e-dant !

I agree, tricky problem to handle generically through the interface. Outside of platform specific flags, you could maybe add a general thing to ignore "attributes" but that's also somewhat strange. I also wondered about a platform specific pragma to disable this at compile time?

I guess the reason I raised this issue was to question if this behavior made sense for a "modify". For what it's worth, other than atom/watcher from your comparison projects, the other 3 cpp projects do not watch FILE_NOTIFY_CHANGE_LAST_ACCESS. Not sure how many of them are intentional. There was another issue on this raised here: https://github.com/Axosoft/nsfw/issues/121 . And as called out in that issue, C# also doesn't listen for it by default. Is there anything analogous for the other platforms where it could be its own event?

As for my approach. I see I can simply remove FILE_NOTIFY_CHANGE_LAST_ACCESS, or possibly work around this by caching the timestamp of the load and ignoring modify events near it. I'll see what I can do.

I'll leave this open for now, let me know if there's any way you'd like to move forward. Thanks again for your help and this library!

e-dant commented 2 weeks ago

or possibly work around this by caching the timestamp of the load and ignoring modify events near it

FWIW, when I use the watcher to respond to modification events, I usually debounce the events over a second or two.

Various applications will make a bunch of write calls and do some probably unnecessary filesystem ops.

While it's a real issue that the modification event here is either not correct or misleading, debouncing to deal with too-frequent events is also helpful.

I'll look into it. Thanks for using the library.

e-dant commented 2 weeks ago

I'm strongly considering just not reporting "access events" at all.

It would be perfect if we could disambiguate changes from accessing a file from changes to modifying the contents of a file (or directory).

That might be as simple as just spinning up two watchers, one for all events but access, another for the access events. But how do we tell the user about it? Probably a new event type to avoid confusion.

There are ways to get these kinds of access-as-modify events on macOS/ios/linux/inotify/fanotify (and the experimental ebpf watcher). For inotify, I believe the flag IN_ACCESS will do this, and for fanotify, FAN_ATTRIB might give us these events. MacOS watches for them currently.

However, it seems simplest to ignore access-as-modify events on all platforms and wire up a much more complete solution if issues crop up. The utility is questionable.

iwubcode commented 1 week ago

However, it seems simplest to ignore access-as-modify events on all platforms and wire up a much more complete solution if issues crop up. The utility is questionable.

I took this comment and decided to run with it :) . I pushed up a PR. Hoping to get my own software released with these changes (and custom forks are a pain in our workflow). If this needs more consideration, feel free to close. Thank you again!

e-dant commented 1 week ago

Thank you for your help