guard / listen

The Listen gem listens to file modifications and notifies you about the changes.
https://rubygems.org/gems/listen
MIT License

Celluloid::IO integration? #159

Closed tarcieri closed 9 years ago

tarcieri commented 10 years ago

I think it'd be interesting if listen could leverage Celluloid::IO. This should be fairly straightforward for any of the adapters that can expose an IO object, although Celluloid::IO doesn't presently have APIs to "asyncify" existing IO objects.

This would eliminate the need to kick off filesystem monitoring in a separate thread, and allow actors themselves to listen for events directly while still being able to process incoming methods/messages.

I'm opening this ticket in hopes we can discuss what Celluloid::IO would need to expose to make this happen.

Also, would there be interest in leveraging filesystem monitoring via nio4r? It could use libev's filesystem monitoring support and provide a cross-platform abstraction for utilizing NIO2's filesystem monitoring support on Java platforms.

mikeni commented 10 years ago

+1

thibaudgg commented 10 years ago

That's very interesting, I'll have a look at it next week and will certainly come back with a lot of questions. Thanks @tarcieri!

thibaudgg commented 10 years ago

Ok, I finally got some time to have a look at Celluloid::IO and nio4r; they look really great.

If I understand correctly, Celluloid::IO doesn't yet have an API for files and directories (only TCP/UDP sockets, right?), so nio4r's libev file events aren't accessible through Celluloid::IO yet.

Libev seems like a very good fit for Listen (though I'm not sure how well Windows is supported). Replacing all Listen adapters (even the polling one) with nio4r is a very attractive idea to me.

I don't know where to start and I'll definitely need your help with the integration, but I'm all ready for it.

What would be the first things to do? @guard/core-team what do you think?

tarcieri commented 10 years ago

@thibaudgg there's two ways you could hook into nio4r:

thibaudgg commented 10 years ago

@tarcieri ok, what do you think of starting natively by just using an abstract nio4r API? If we find later that libev isn't good enough on some filesystems (e.g. Windows), we could still add a specific library to be monitored by nio4r.

tarcieri commented 10 years ago

@thibaudgg I can definitely try that! Filesystem monitoring is something I've planned on adding to nio4r for a while. Now that there's a valid use case maybe I'll give it a try ;)

thibaudgg commented 10 years ago

Awesome! If you want, you can take some inspiration from the Listen acceptance spec suite; a lot of cases are handled there: https://github.com/guard/listen/blob/master/spec/acceptance/listen_spec.

Feel free to ask if you need any input.

tarcieri commented 10 years ago

@thibaudgg I'll probably try to do what I did for the rest of nio4r: use Java NIO.2 as the basis for the API, then implement a C version based on libev which is API compatible. Call it "meeting Java halfway".

Here's the Java filesystem monitoring API if you have opinions on this approach:

http://docs.oracle.com/javase/tutorial/essential/io/notification.html

thibaudgg commented 10 years ago

@tarcieri that sounds like a good approach.

The Java filesystem monitoring API seems to support only directory watching, so there are some points to check:

But these points should certainly be addressed directly in Listen.

tarcieri commented 10 years ago

@thibaudgg cool, if you can pick up the slack I can hopefully provide a least common denominator API for directory changes

thibaudgg commented 10 years ago

Hi @tarcieri, any progress on that front?

tarcieri commented 10 years ago

Nope, as you can probably tell from my belated reply I've been pretty busy lately.

thibaudgg commented 10 years ago

Ok thanks for the feedback! Take all the time you need :)

e2 commented 10 years ago

I'm currently rewriting listen's adapter functionality for a new proposed Listen 3.x API.

The only thing really necessary (for the new API) from an adapter is that every change notification consists only of:

  1. the given watched directory (so adapters would have to allow watching multiple directories - each with different options, e.g. recursive or not)
  2. a file/dir path relative to the directory listened to (so if an adapter watches multiple directories, it would have to "know" which fs event is for which watched directory)

So if adapters kept their own buffers/queues of changes (ideal for async reading), they would have e.g:

[
['dir1', '.'],   # event created by watching 'dir1'
['dir1', 'file1'],   # event created by watching 'dir1'
['dir1/subdir1', 'subdir2'],    # event created by watching 'dir1/subdir1'
['dir1/subdir1', 'subdir2/file1']    # event created by watching 'dir1/subdir1'
]

This means the only events necessary are "modifications" (e.g. adding a file is "modification" on the directory containing it).

Under the hood, it's fine to monitor just directories, as long as the following are detected:

  1. file/dir renames, removals, additions, and file content changes within watched directory
  2. directory content changes within watched directory (but doesn't really matter what they are)

At the application level, it's about listening for changes to content within a file or in a directory listing - for technical reasons, this means watching directories for e.g. file timestamp changes, which "means" a file was modified.

Currently for rb-inotify (with above example), this would mean 2 adapter instances (one listening on 'dir1', the other on 'dir1/subdir1' recursively), but 3 watchers (on 'dir1', on 'dir1/subdir1' and on 'dir1/subdir1/subdir2' because of recursion handling within rb-inotify).

So e.g. watching for changes in a Gemfile means a single non-recursive watcher watching the app directory and collecting a ['.', 'Gemfile'] event.

(The only "missing" piece for rb-inotify is matching a relative pathname to the watched directory when recursion is on - but that's trivial. That makes rb-inotify a good "reference implementation" example: the closer other adapters behave to rb-inotify, the better - regardless of their actual implementation details.)
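
To illustrate the path matching described above, here is a minimal sketch using only Ruby's standard `Pathname`. The helper name `to_change` is hypothetical (it is not part of Listen or rb-inotify), and it assumes the backend reports a path that can be expanded to an absolute one:

```ruby
require 'pathname'

# Hypothetical helper, not part of Listen or rb-inotify: turn the path
# reported by a backend into a [watched_dir, relative_path] pair.
def to_change(watched_dir, reported_path)
  dir  = Pathname.new(watched_dir).expand_path
  path = Pathname.new(reported_path).expand_path
  relative = path.relative_path_from(dir)

  # A leading '..' means the path lies outside the watched directory.
  if relative.each_filename.first == '..'
    raise ArgumentError, "#{reported_path} is outside #{watched_dir}"
  end

  [watched_dir, relative.to_s]
end

to_change('dir1', 'dir1')                               # change to the directory itself
to_change('dir1/subdir1', 'dir1/subdir1/subdir2/file1') # change below a recursive watch
```

A change to the watched directory itself comes out with `'.'` as the relative path, matching the first entry of the example list.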

This makes things complex because of the need to manage multiple instances of adapters, making adapters possibly self-managed Celluloid pools. That's because of the need to support multiple configurations of adapters and their instances. (And so, putting adapters into their own gems starts really making sense).

One "feature" of listen is being able to collect "batches" of changes and passing them with a single callback (e.g. so guard-cucumber can be triggered with multiple files for one command). So polling for changes (e.g. read + timeout) is still necessary at the highest level. (e.g. accumulating new changes from different kinds of adapters while guard-cucumber is running with previous batch of changes).

[In short, Guard relies on tasks to fire immediately after file changes, and yet let changes accumulate in bulk during long running tasks.]
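
The batching behaviour described above could look roughly like the sketch below. It is an assumption-laden illustration, not Listen's implementation: `next_batch` and the `latency` parameter are hypothetical names, and a plain Ruby `Queue` stands in for whatever buffer the adapters would fill.

```ruby
# Hypothetical sketch, not Listen's code: block until the first change
# arrives, then keep collecting for `latency` seconds and return one batch.
def next_batch(queue, latency: 0.1)
  batch = [queue.pop]                      # blocks until a change is available
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + latency

  while Process.clock_gettime(Process::CLOCK_MONOTONIC) < deadline
    begin
      batch << queue.pop(true)             # non-blocking pop
    rescue ThreadError                     # raised when the queue is empty
      sleep 0.01                           # nothing yet, poll again shortly
    end
  end

  batch.uniq                               # report each change only once
end

changes = Queue.new
changes << ['dir1', 'file1'] << ['dir1', 'file1'] << ['dir1/subdir1', 'subdir2']
next_batch(changes, latency: 0.05)
```

Changes pushed while a long-running task is busy simply stay in the queue, so the next call to `next_batch` picks them up together.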

Summary: if a nio4r adapter/wrapper existed, which provided a read+timeout/select interface (for listen to accumulate changes) + was able to watch multiple directories independently (with config for each like optional recursion) + was able to match fs events (paths) relative to the given watched directory ...

... that would be so awesome (and on so many levels!).

A Ruby example for Linux (using inotify backend) is all that's necessary to decide on an API.

For now the best I can do is rework the rb-inotify adapter to match the new API and try to get the other adapters working like the rb-inotify one, then extract every (?) adapter into its own gem (because of the new API interface requirements and lack of overlap between adapter implementations - and FS/adapter specific unit tests).

thibaudgg commented 10 years ago

Sounds like a plan!

antitoxic commented 9 years ago

Given https://github.com/guard/listen/issues/246, shouldn't this be closed?

e2 commented 9 years ago

@antitoxic - it's actually unrelated, because #246 is related to just the TCP server failing on Windows (I've updated the title - thanks), while this is about avoiding extra threads in Listen.

tarcieri commented 9 years ago

FWIW, I would like to add some nio4r features to Google Summer of Code. Filesystem monitoring could definitely be one.

Update: added! https://github.com/rubygsoc/rubygsoc/wiki/Ideas-List#filesystem-monitoring-support

e2 commented 9 years ago

@tarcieri - as for Linux, the Zeus project compiles an inotify wrapper to pipe messages through stdout to the main process, so at least this adapter could use Celluloid::IO right now (without much work).

But doing so just on Linux isn't practical - since Listen is basically one huge workaround for a lack of decent file system monitoring on Windows and OSX.
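
For the Linux-only case mentioned above, reading change notifications from a pipe without a dedicated thread could be sketched as follows. This uses plain `IO.select` as a stand-in for an event loop; it is not Celluloid::IO's API. The one-path-per-line format and the `read_paths` helper are assumptions for illustration, not what Zeus actually emits.

```ruby
# Hypothetical sketch: read newline-separated paths from an IO object,
# giving up once nothing arrives within `timeout` seconds.
def read_paths(io, timeout: 0.1)
  paths = []
  loop do
    ready, = IO.select([io], nil, nil, timeout)
    break unless ready          # timed out: no more data for now
    line = io.gets
    break if line.nil?          # EOF: the writing side was closed
    paths << line.chomp
  end
  paths
end

# A plain pipe stands in for the wrapper process's stdout.
reader, writer = IO.pipe
writer.puts 'dir1/file1'
writer.puts 'dir1/subdir1/subdir2/file1'
writer.close
read_paths(reader)
```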

In fact, the biggest pain in the ass is OSX, because I have to use a separate thread for each watched directory - just to be able to map an event to a given watched directory (rb-fsevent workaround).

e2 commented 9 years ago

I'm closing this since Listen 3.x no longer depends on Celluloid (because Listen is practically not a good use case for Celluloid) nor Celluloid::IO (TCP support will be in a separate gem).