rjeczalik / notify

File system event notification library on steroids.
MIT License
902 stars 128 forks source link

notify: make it possible to stop a watch given by a path #138

Open veqryn opened 6 years ago

veqryn commented 6 years ago

So, I am now using notify to watch a series of changing directories.
The directories I want to watch change gradually over time, and sadly I can't just watch all of them because there are probably more than a few million and I would run out of watch descriptors. From what I understand on how notify works, this means I end up having to create one channel for each set of directories (plus recursive sub-directories) that I want to watch. This is because if I am using the same channel for all directories, when I want to stop watching one of the directories, notify will stop watching all of them.

But, I am using the same single consumer for all of these channels, and therefore would like to consume from a single channel rather than start tons of goroutines that thrash around trying to merge all of the channels into one.

So, is there a way to use a single channel for all watches, but be able to stop specific watches without stopping all watches?

How I am currently using notify to start a watch on a single directory and subdirectories:

watchChan := make(chan notify.EventInfo, 32000)
err := notify.Watch(watchPath, watchChan, notify.InCloseWrite)

Then when I want to stop watching that directory:

notify.Stop(watchChan)
rjeczalik commented 6 years ago

Sure, create a chan per a directory to watch and fan in all the events to the single channel.

veqryn commented 6 years ago

Yes, that is what I'm currently doing. But it doesn't seem very efficient... Out of all the directories being watched, at any given moment one of them is seeing tons of files being written in it. Therefore, I need to set a very large capacity on every single channel, rather than just having a single channel with a large capacity. Then there is all the goroutines required to fan all of the channels into a single channel...

rjeczalik commented 6 years ago

Then there is all the goroutines required to fan all of the channels into a single channel...

The implementation on the library side would be roughly the same, since each watch needs to be able to be stopped separately. So notify would have a channel per directory. The real question is whether the library should provide that feature out of the box.

As you noticed this use-case requires a costly implementation, so my take is to leave this up to the users to implement, instead of making everyone pay the cost.

Regarding buffered events we had this discussion before and we agreed that on some level it would be good addition to notify - to make it possible to use unbuffered channels, so there's no need to guess the buffer size. Especially on large fs operations. For now we took the os/signal approach with async sending, so we don't need to debug things when everything stales on a blocked reader. If we are there to discuss API extensions, the unbuffered channels will surely be reconsidered.

Partial workaround we used some time ago was QueuedWatch.

veqryn commented 6 years ago

What about having the library allow you to stop watching based on the directory string, instead of based on a channel? Or in case people want to have multiple watches on the same directory, but with different channels, how about directory-string + channel?

Or, provide some other token besides the channel, as a means to stop watching (such as a UUID)?

rjeczalik commented 6 years ago

What about having the library allow you to stop watching based on the directory string

That perfectly makes sense. Adding to v2 api todo-list.

veqryn commented 6 years ago

Personally, I think it might be the most flexible to have the watch command return a UUID or other unique string/id, which is then used to stop the watch:

id, err := notify.Watch(watchPath, watchChan, notify.InCloseWrite)
//...
notify.Stop(id)
rjeczalik commented 6 years ago

@veqryn I'm a bit skeptic about using uuid, it would mean maintaining yet another watcher mapping, where two mappings already exist - channel to watcher and path to watcher. I'd need to think about it a bit more.

veqryn commented 6 years ago

If the v2 api is a breaking change, then you could switch to only one mapping: the uuid

rjeczalik commented 6 years ago

Ideally we'd want to make it backward-compatible. Besides path to watcher mapping is going to stay, since the implementation uses tree-based directory lookup.

veqryn commented 6 years ago

k. want me to close this out?

rjeczalik commented 6 years ago

It can stay opened, so we can keep track of this. Thanks! 😄