guard / listen

The Listen gem listens to file modifications and notifies you about the changes.
https://rubygems.org/gems/listen
MIT License
1.92k stars 246 forks source link

Directory scan sometimes crashes #354

Closed unjello closed 8 years ago

unjello commented 8 years ago

It looks like once in a while instead of filename as a key, and options as values sometimes values hold not a Hash but Float or Fixnum or String. I have temporarily added values.is_a? Hash if to the each block and it works, but didn't have time to root cause it why the values get passed like that.

Crash stack:

E, [2015-11-12T23:32:00.713023 #62036] ERROR -- : fsevent: running worker failed: undefined method key?' for 1447367514.0:Float: /Users/angelo/.rbenv/versions/2.2.1/lib/ruby/gems/2.2.0/gems/listen-3.0.4/lib/listen/record.rb:59:inblock in dir_entries' /Users/angelo/.rbenv/versions/2.2.1/lib/ruby/gems/2.2.0/gems/listen-3.0.4/lib/listen/record.rb:54:in each' /Users/angelo/.rbenv/versions/2.2.1/lib/ruby/gems/2.2.0/gems/listen-3.0.4/lib/listen/record.rb:54:indir_entries' /Users/angelo/.rbenv/versions/2.2.1/lib/ruby/gems/2.2.0/gems/listen-3.0.4/lib/listen/directory.rb:10:in scan' /Users/angelo/.rbenv/versions/2.2.1/lib/ruby/gems/2.2.0/gems/listen-3.0.4/lib/listen/change.rb:56:ininvalidate' /Users/angelo/.rbenv/versions/2.2.1/lib/ruby/gems/2.2.0/gems/listen-3.0.4/lib/listen/directory.rb:62:in _change' /Users/angelo/.rbenv/versions/2.2.1/lib/ruby/gems/2.2.0/gems/listen-3.0.4/lib/listen/directory.rb:27:inblock in scan' /Users/angelo/.rbenv/versions/2.2.1/lib/ruby/2.2.0/set.rb:283:in each_key' /Users/angelo/.rbenv/versions/2.2.1/lib/ruby/2.2.0/set.rb:283:ineach' /Users/angelo/.rbenv/versions/2.2.1/lib/ruby/gems/2.2.0/gems/listen-3.0.4/lib/listen/directory.rb:24:in scan' /Users/angelo/.rbenv/versions/2.2.1/lib/ruby/gems/2.2.0/gems/listen-3.0.4/lib/listen/change.rb:56:ininvalidate' /Users/angelo/.rbenv/versions/2.2.1/lib/ruby/gems/2.2.0/gems/listen-3.0.4/lib/listen/adapter/base.rb:107:in _queue_change' /Users/angelo/.rbenv/versions/2.2.1/lib/ruby/gems/2.2.0/gems/listen-3.0.4/lib/listen/adapter/darwin.rb:45:inblock in _process_event' /Users/angelo/.rbenv/versions/2.2.1/lib/ruby/gems/2.2.0/gems/listen-3.0.4/lib/listen/adapter/darwin.rb:40:in each' /Users/angelo/.rbenv/versions/2.2.1/lib/ruby/gems/2.2.0/gems/listen-3.0.4/lib/listen/adapter/darwin.rb:40:in_process_event' /Users/angelo/.rbenv/versions/2.2.1/lib/ruby/gems/2.2.0/gems/listen-3.0.4/lib/listen/adapter/base.rb:42:in block (2 levels) in configure' /Users/angelo/.rbenv/versions/2.2.1/lib/ruby/gems/2.2.0/gems/rb-fsevent-0.9.6/lib/rb-fsevent/fsevent.rb:45:incall' /Users/angelo/.rbenv/versions/2.2.1/lib/ruby/gems/2.2.0/gems/rb-fsevent-0.9.6/lib/rb-fsevent/fsevent.rb:45:in run' /Users/angelo/.rbenv/versions/2.2.1/lib/ruby/gems/2.2.0/gems/listen-3.0.4/lib/listen/adapter/darwin.rb:51:in_run_worker' /Users/angelo/.rbenv/versions/2.2.1/lib/ruby/gems/2.2.0/gems/listen-3.0.4/lib/listen/adapter/darwin.rb:35:in _run' /Users/angelo/.rbenv/versions/2.2.1/lib/ruby/gems/2.2.0/gems/listen-3.0.4/lib/listen/adapter/base.rb:78:inblock in start' /Users/angelo/.rbenv/versions/2.2.1/lib/ruby/gems/2.2.0/gems/listen-3.0.4/lib/listen/internals/thread_pool.rb:6:in call' /Users/angelo/.rbenv/versions/2.2.1/lib/ruby/gems/2.2.0/gems/listen-3.0.4/lib/listen/internals/thread_pool.rb:6:inblock in add'

e2 commented 8 years ago

Looks like a bad bug where the record fields are incorrectly used. It would be great if I could reproduce it.

I'll see how soon I get to this.

unjello commented 8 years ago

I get it all the time, about 0.2 chance, but I use it quite often so it's about 5min to reproduce :) Both OSX and Win32+Cygwin. I can help track it, although I must admit the code is a bit hard to parse at the first sight - a lot of fsevent magic I'm not familiar with, or I can just send you some more LISTEN_GEM_DEBUGGING=2 logs.

unjello commented 8 years ago

Hey, can't upload full dumps right now - the easiest way for me to reproduce it is on some proprietary stuff and it'd take a while to strip it. Sorry in advance. I can probably spend some time trying to isolate a testing scenario, but here it is what I've found so far. Maybe you'll come up with a rootcause form the top of your head :)

So the issue seems to reproduce when you create and delete files. It does not happen every time you do that, but from 3 occurences I've checked it's always that. You create a file X, event fires up, you delete the file X and create file Y - event fires up again. On the second event what happens is:

File X is present in previous.inspect with uncleaned :md5 property. No other file has it:

"X"=>{:mtime=>1447407967.0, :mode=>33188, :md5=>"\x9D\\-\xA5|\x05\x19\xEE\x9D\x9DH\f?r|x"}}

Then, in the processing loop this file is somehow misidentified as directory, and that's when it crashes:

D, [2015-11-13T17:28:38.073847 #71691] DEBUG -- : unknown: dir:/Users/angelo/test/X ({:recursive=>true})
W, [2015-11-13T17:28:38.074029 #71691]  WARN -- : scan DIED: undefined method `key?' for 1447407967.0:Float:/Users/angelo/.rbenv/versions/2.2.1/lib/ruby/gems/2.2.0/gems/listen-3.0.4/lib/listen/record.rb:55:in `block in dir_entries'
/Users/angelo/.rbenv/versions/2.2.1/lib/ruby/gems/2.2.0/gems/listen-3.0.4/lib/listen/record.rb:53:in `each'

it also seems it wants treat each property of file X as a file, thats why you get Float (:mtime), Fixnum (:mode) and String (:md5) in the dir_entries later on. So the whole issue seems to be that somewhere inside your gem wants to treat deleted file as a directory.

e2 commented 8 years ago

That's a hint, thanks.

As for deleting, since there's no "guaranteed" consistency with the filesystem (on OSX), I can't assume something is a file or directory. So there are going to be some heuristics at best, anyway.

But of course it shouldn't crash - and the deleting suggests that entries aren't properly removed from the internal Record structures.

A few things I'm interested:

  1. What's the use case for deleting files? (It's actually uncommon for files to be deleted in projects out there)
  2. Are you watching more than one directory? Are there any relationships between directories? (There shouldn't be any, ideally).
unjello commented 8 years ago

Ad 1. Well, I don't really use Guard but rather treat Listen as os-independent wrapper on file event APIs for prototyping. So it's not your typical dev/build project where indeed you'd most probably just add files.

Ad 2. Yeah, that's just one directory.

unjello commented 8 years ago

ok. So here it is. Not sure why some deleted files go through, but when it fails, it fails because File::lstat throws Errno::ENOENT with No such file or directory @ rb_file_s_lstat. Which kind of make sense - I'm surprised it only fails every now and then. And because you treat exception as a directory by default... (directory.rb):

    def self.detect_type(full_path)
      # TODO: should probably check record first
      stat = ::File.lstat(full_path.to_s)
      stat.directory? ? :dir : :file
    rescue Errno::ENOENT
      # TODO: ok, it should really check the record here
      # report as dir for scanning
      :dir
    end

Now, you have couple of TODOs. I'm wondering which record are you referring to? And why you chose to treat exception as :dir? I'd actually treat them as files. Directories should never fail, right? Especially if you were to scan them further?

e2 commented 8 years ago

Overall, at some point I wanted to rewrite all that processing code and support "invalidations". Meaning, if something changed on the filesystem, an "invalidation" would be generated, and based on the type of "invalidation", parts of the filesystem would be rescanned.

This would make things more "linear" and less complex. (Because turning events into "invalidations" would be one layer, while turning "invalidations" in to scanning/lstat would be another - which would allow easier coding around OS/adapter specific issues).

The problem I had was: I don't own/use a Mac (I'm Linux-only), so it's a bit risky for me to "fool around" with code I can't properly test.

The idea here for returning :dir instead of :file was likely to properly handle moving trees.

More importantly, the Record should store the :dir and :file types, but it doesn't. (There was another bug related to redundant recursion because of this). So instead of checking for a hash, checking for the mtime field should be used instead, like here:

https://github.com/guard/listen/blob/master/lib/listen/record.rb#L55

But there should be a test for this anyway (So I can work it out and correctly refactor later).

Also, the Record was optimized for both speed and lower memory usage (and to avoid mutexes). This is unfortunate, because it only made things complex. (But it's still better than before).

So if it helps, future work will be:

  1. mark dirs separately in Record
  2. introduce "invalidations" in adapters (instead of the type, dir, change parameters everywhere)
  3. refactor the Record and invalidations into a "journaled backup API" (so events can be gathered in on thread, and scanning can happen based on the queue of invalidations elsewhere)

This would almost guarantee consistency between what's on the filesystem and what Listen knows about it. And it wouldn't require such a mess as you are experiencing.

Let me know if I can help. If you can just add a PR where the :mtime field is check (like above), I can pull that in.

e2 commented 8 years ago

The crashing should be fixed by: https://github.com/guard/listen/pull/358

unjello commented 8 years ago

Looks like it works on OSX. Didn't look at the debugging info, but it's not crashing anymore. Thanks! Will check on Windows+Cygwin tomorrow.

e2 commented 8 years ago

Thanks for doing all the work and directing me down the right path!

Much appreciated.

unjello commented 8 years ago

Hey. Glad I could help, and of course thanks for fixing it so fast :) btw. I checked that on Win64 and Win64+Cygwin and it works too. Cool. I guess you can mark it as fixed.

e2 commented 8 years ago

Released as 3.0.5. Thanks again!