rjeczalik / notify

File system event notification library on steroids.
MIT License
906 stars 128 forks source link

watcher_fsevents: Fails to remove watch when directory does no longer exist #124

Closed imsodin closed 6 years ago

imsodin commented 7 years ago

Scenario:

Watch is set up at a dir, then this dir is removed (by user, so no control on it). Then Stop is called on this path, the dir is recreated and a new watch at dir is set up (with a new channel).

What happens:

Creating the watch fails with error "path is already watched". Debug logs (where btw a timestamp would be very useful) show the following error after calling Stop:

[D] Stop(0xc42006e780) error: lstat /private/tmp/syncthing-bcf527b3/src/github.com/syncthing/syncthing/lib/fswatcher/temporary_test_fswatcher: no such file or directory

The full log of the tests where this error occurred can be found here: https://build2.syncthing.net/viewLog.html?buildId=9782&buildTypeId=Syncthing_BuildMac&tab=buildLog&state=386&_focus=438#_state=386

What I would expect to happen:

As Stop does not report its status (no return value) it must leave a "consistent" state, e.g. remove the watches regardless of whether the path exists. Or Watch shouldn't complain about the path already being watched and just send events on the channel provided with the existing watch (if it is still functional after the dir was removed at an earlier point).

The following code should reproduce this problem, but I couldn't test as I don't have a system using fsevents:

dir := "watchbase"
eiChan := make(chan notify.EventInfo)
os.Mkdir(dir, 0777)
if err := notify.Watch(dir, eiChan, notify.All); err != nil {
    panic(fmt.Sprintln("failed setting up the first watch:", err))
}
os.Remove(dir)
notify.Stop(eiChan)
close(eiChan)
eiChan = make(chan notify.EventInfo)
os.Mkdir(dir, 0777)
if err := notify.Watch(dir, eiChan, notify.All); err != nil {
    panic(fmt.Sprintln("failed setting up the second watch:", err))
}
rjeczalik commented 7 years ago

The following code should reproduce this problem, but I couldn't test as I don't have a system using fsevents

You could send a PR with the test itself, it can be executed on macos by travis. See https://github.com/rjeczalik/notify/blob/master/watcher_fsevents_test.go.

imsodin commented 7 years ago

@rjeczalik Both my issues are fixed now: #125 and #127. I hope you don't have a quota on your CI - it took quite a few trial and errors as I couldn't build/test locally. In the process I found four different issues, so it probably makes sense to also make four separate and clean PRs. I will wait with that until you give your feedback on it.

Also @ppknap name is in many comments on watcher_readdcw.go, so maybe he can have a look at those points as well.

The issues I found/fixed are:

  1. readdcw: Missing lock resulting in a race: https://github.com/rjeczalik/notify/pull/127/files#diff-ac809f36d763355597370ab566b6d71aL352

  2. readdcw: Event information received from GetQueuedCompletionStatus in loop can contain both information about a filesystem event as well as state information (un-/rewatch) if the two events happen almost at the same time. The state information will then not be processed and prevent any further (un)watching of that path. Solution is to process state each time GetQueuedCompletionStatus returns. https://github.com/rjeczalik/notify/pull/127/files#diff-ac809f36d763355597370ab566b6d71aL339

  3. fsevents: Unable to remove watch when the dir does no longer exist, due to redundantly running canonical in watcher code: Remove those calls as already suggested by a comment in the code. https://github.com/rjeczalik/notify/pull/125/files#diff-2c1e10c5e98e1dbe5fed8958fe65ba44L15

  4. tests: Root path of test watcher is not passed through cleanpath, thus can be relative, which results in failing tests without the canocical call mentioned in 2. https://github.com/rjeczalik/notify/pull/125/files#diff-e024a33ec83df2db30f5d1b97fa216a3

Unrelated: The debug logging is not very convenient. #128 adds option to enable debug logging via environment variable and adds timestamps to debug output.

imsodin commented 6 years ago

I forgot to close this in #129.