guard / listen

The Listen gem listens to file modifications and notifies you about the changes.
https://rubygems.org/gems/listen
MIT License
1.92k stars 246 forks source link

`only` flag still passes through to ignores if it doesn't match #337

Closed tdreyno closed 3 years ago

tdreyno commented 9 years ago
    def silenced?(relative_path, type)
      path = relative_path.to_s

      if only_patterns && type == :file
        return true unless only_patterns.any? { |pattern| path =~ pattern }
      end

      ignore_patterns.any? { |pattern| path =~ pattern }
    end

There are a couple issues here, I think. Why check only file? I have node_modules in my ignore and definitely want to stop the recursion at the :dir level.

Second, I feel like the inner part of the if statement should be:

        return !only_patterns.any? { |pattern| path =~ pattern }

That is, silenced if only doest match, but allowed otherwise. Otherwise, it should be called only_but_if_not_then_regular_ignore_testing :)

Trying to resolve some Listen speed issues with the Middleman v4 betas and recursing into node_modules is super slow.

Thanks,

e2 commented 9 years ago

Thanks for opening this - much appreciated.

The behavior maybe somewhat adapter specific (polling vs osx vs linux) - I'm not sure (I'd need to check properly). It shouldn't be, though.

The Silencer should've had an overhaul ages ago - API compatibility somewhat kept it back, so don't hesitate to report problems in this area.

The "return true" was is because ignore patterns were supposed to be more important than "only" patterns. (Feel free to change the code and rerun unit tests - they should be self explanatory if they fail (even if the results are not intuitive).

Basically, "only" means "watch these file types" and with "ignore", this means "only watch these file types unless they match the ignore pattern". (and that's why the type is checked - because "only" didn't make sense for directories - as in the docs/README).

I had a "stacked rules" implementation somewhere, where you could add blacklist/whitelist rules on top of each other - and the silencer would work out what to ignore or not (much easier to manage). The rules were simpler and easier to understand, though compatibility got complex. But that's just me going on a tangent...

If you can, set up a repository where the node_modules is scanned, and I'll take a look - if I can reproduce it on Linux, I'll patch it quickly. (I prefer not to try and reproduce it myself, because I rarely end up actually reproducing the same problem without being given a specific example to talk about).

If not ...

On OSX (I only use Linux), send me the output of logging turned on (LISTEN_GEM_DEBUGGING=2 env variable), and I'll try and work out how it's supposed to be.

Or, if you can get a PR which fixes this for you, that would be cool (even if it breaks tests).

But don't hesitate to help me reproduce this - since parts of Listen are a bit complex and dumbfounding.

node_modules (if ignored properly) definitely shouldn't be even traversed at all, ever. Same with any other directory.

tdreyno commented 9 years ago

I monkey patched a fix into Middleman. Let me put it in code and see if the tests pass.

This is primarily a problem with polling (fixing bugs in Vagrant env), where the hashing/recursing of node_modules is simply too slow. There are valid reasons for watching a node_modules folder and globally ignoring it would break a lot of Middleman installs.

As an aside, I'm seeing some really weird behavior when updating ignore lists. The silencer is pretty weird and the "controller" which is suppose to be sharing references, I think, feels like it's missing something.

Anyways, I'll get you more details soon.

e2 commented 9 years ago

OMG... VMs... Polling...

(This is the point where I'd rant about how the whole world is broken, lol).

You can try disabling hashing by using this environment variable: LISTEN_GEM_DISABLE_HASHING=1

If it helps, let me know (Listen checks the timestamps of files to decide whether to hash or not - long story).

I think a fast SSD drive helps - if that doesn't help, only then should other options be considered (seriously). If you're getting high CPU, that's probably related to hashing or unnecessary traversing (it's best to know which one you're dealing with specifically).

Using Linux definitely helps - OSX and/or VMs is asking for performance problems in large projects (if you're using shared folders in a VM using a host that's on OSX).

I'm just saying that file monitoring on anything other that a real live, bare-metal Linux system can be slow.

As for the ignore lists - the controller is just to make the rules work as documented - mostly for Guard, so that multiple ignore rules are added, etc. This was quite recent (months ago?) and if there are edge cases - let me know, and I'll patch and release ASAP.

I'm not really sure why someone would want to watch node_modules - seems like a bad workflow to me. Ideally, people should watch specific subdirectories within node_modules - and only select directories where files are changing.

There are valid reasons for watching a node_modules folder and globally ignoring it would break a lot of Middleman installs.

Could you give an example of this?

It's "convenient" to be watching the whole project tree all the time for any changes - but it's rare that during development changes are happening in every single folder all the time. It's no problem on Linux/inotify, but Polling and Darwin adapters heavily rely on fine-tuned ignore rules and a very limited set of watched directory paths to work around performance issues.

Listen was extracted from Guard - so the silencer API is restricted by how Guard is set to ignore files. Changing the API would break a ton of projects out there. I'd like to rework the Silencer at some point, though - I keep getting my brain fried every time I take a peek at it.

Also, having a constantly updated log file (or database files) updated in the project may kill performance in some cases.

Using anything but Linux for large web projects is likely connected to performance problems - I'm not surprised, but I can help analyze/work out such issues to find the cause and help suggest/fix something.

If you have any suggestions to make things easier to debug/trace, do let me know.

Oh, and disk encryption can slaughter the performance of a VM.

A few questions:

How many watched files do you have? At startup in debug mode, Listen shows the total time of "record building" - what value is it? How long does it take to scan the node_modules directory after a change? (Time between scanning start and before callback time).

tdreyno commented 9 years ago

So, Middleman creates a Listener in our source folder. File changes there are reflected in the dev environment. Many users put node_modules in there and link directly to that JS in their HTML. It's not the way I'd do it, but it is very common. Not excluding node_modules allows them to npm install during dev mode and be able to immediately see those files.

Node's weird recursive modules thing makes this produce 10s of thousands of files :(

I'm happy telling people to work around these issues if they MUST use a VM. But it is still nice in many circumstances.

e2 commented 9 years ago

Had similar issues with jekyll, where it was jekyll's fault for making the project's root directory the source directory by default.

Horrible setup to begin with, since it often lead to infinite recursion and hours of pointless miscommunication with users who couldn't share their code.

Listen by default ignores the vendor directory - and I'd add node_modules there as well (at some point).

No one ever complained about not watching the vendor directory by default.

I'd strongly suggest exiting with an error (or big warning at least) if you detect node_modules in the source directory - the frustrations are just not worth it.

Even symlinks are a better option (while watching original folders separately).

A properly organized project is worth gold - and warnings otherwise will save users the grief and you'll have more time for adding valuable features.

It's like: if someone uses a custom "app" directory in their rails project, they're going to suffer, no matter how hard the world adapts to them.

Again - symlinks will solve any "custom" needs, because they allow multiple project structures to coexist in one place. On Aug 18, 2015 8:16 PM, "Thomas Reynolds" notifications@github.com wrote:

So, Middleman creates a Listener in our source folder. File changes there are reflected in the dev environment. Many users put node_modules in there and link directly to that JS in their HTML. It's not the way I'd do it, but it is very common. Not excluding node_modules allows them to npm install during dev mode and be able to immediately see those files.

Node's weird recursive modules thing makes this produce 10s of thousands of files :(

I'm happy telling people to work around these issues if they MUST use a VM. But it is still nice in many circumstances.

— Reply to this email directly or view it on GitHub https://github.com/guard/listen/issues/337#issuecomment-132305480.

e2 commented 9 years ago

BTW, isn't it better to use a manifest file watched, so that it installs npm modules automatically? (so there's no need to even npm install anything).

Kind of like guard-bundler does for Gemfiles?

tdreyno commented 9 years ago

npm install jquery --save is for adding new libraries to the manifest during dev-time.

e2 commented 9 years ago

Right. Quite a lot to type though. On Aug 18, 2015 11:04 PM, "Thomas Reynolds" notifications@github.com wrote:

npm install jquery --save is for adding new libraries to the manifest during dev-time.

— Reply to this email directly or view it on GitHub https://github.com/guard/listen/issues/337#issuecomment-132350385.

ColinDKelley commented 3 years ago

I'm closing this since it is so old. Happy to reopen if there is fresh info.