hyperledger / firefly-common

Apache License 2.0
3 stars 12 forks source link

fswatcher Should Dedup Writes or Detect "CloseWrites" #156

Open onelapahead opened 1 week ago

onelapahead commented 1 week ago

Following up from #149 - it's really a bit of rabbit hole if you look at https://github.com/fsnotify/fsnotify/issues/635 and the numerous other related issues.

In short, the comment for the fsnotify.Write event describes the issue:

    // The pathname was written to; this does *not* mean the write has finished,
    // and a write can be followed by more writes.

So what I've observed when using os.WriteFile on Linux systems + K8s PVCs is that initially fswatcher processes a fsnotify.Write event where the file was rewritten / has size of 0 bytes. It then receives a subsequent fsnotify.Write and the file is finished being written with a non-zero size. The problem is both the 0 bytes and the non-zero bytes events change the tracked hash, causing two unnecessary onChange events in the case where the file did not actually change, and one extra onChange event when the file did change. I've had to add a defense in my use of this utility to ignore onChange if the size of the file is 0 for now (which still isn't great because of the second onChange). For example from my logs:

Config file change detected. Event=WRITE Name=/data/peers.json Size=0 Hash=e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
Config file change detected. Event=WRITE Name=/data/peers.json Size=3687 Hash=2926b16fe9c31543f05e269eff2bd7d6542050923c348b1e13c6110b3c667e71

There's examples of how we could go to the trouble deduping the writes: https://github.com/fsnotify/fsnotify/blob/main/cmd/fsnotify/dedup.go But that feels a bit convoluted if we can avoid it (sorry Windows).

Instead it looks like the maintainer is actively working on support non-portable events for Linux systems like CLOSE_WRITE: https://github.com/fsnotify/fsnotify/blob/main/fsnotify.go#L211-L219

These features are still in main of the lib / unreleased, but once they are we can detect if such events are supported and change what events we listen for as a result - make Linux optimized for detecting file changes which is the main OS targetted for production use cases of this library since it is used primarily in containerized applications.

This issue to track the fsnotify upstream for the more efficient event types and fix this once they are released. If too much time passes, we can consider maintaining an fsnotify fork.

onelapahead commented 1 week ago

Looks like the maintainer will release it very soon 🙏🏻

https://github.com/fsnotify/fsnotify/pull/651

EnriqueL8 commented 1 week ago

Awesome, thanks for the detail here @onelapahead !