gulpjs / glob-stream

Readable streamx interface over anymatch.
MIT License

Using `src` globs can be really slow with negation #24

Closed: necolas closed 8 years ago

necolas commented 10 years ago

Trying to use negation in globs can be extremely slow if there are many files in the negated paths (e.g., node_modules). For example:

[ '**/*.js', '!node_modules/**' ]

We're still in the initial phases of a project, and while that pattern works, it was causing a lint task to take more than 4s to finish. Changing the patterns to use only positives (which means manually adding half a dozen paths and files) brought the task time down to 200ms.

UltCombo commented 9 years ago

When faced with the same issue, I moved all source files to a src subdirectory (a sibling of node_modules); then you can just use a simple src/**/*.js glob.

phated commented 9 years ago

Don't glob things under node_modules. Remove the **

yocontra commented 9 years ago

Negation is done post-read right now, which is kind of crappy. The node-glob module, which we use to find results, does not accept multiple globs, so there is no way for us to negate prior to that. If we switch to a new glob module, or that module adds support for negation, it will speed things up a ton. Really, this is probably one of the biggest bottlenecks in gulp.
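
To illustrate the cost difference described above, here is a minimal sketch that uses only Node's built-in fs, os, and path modules. It is not glob-stream's or node-glob's actual implementation: the `walk` helper, the throwaway directory layout, and the file counts are all hypothetical, chosen only to show that filtering after the read still touches every entry under node_modules, while pruning during traversal never descends into it.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical helper (not part of glob-stream): recursively list files under
// `dir`, counting every directory entry touched in `stats.visited`.
// Directories whose name is in `skip` are pruned, i.e. never descended into.
function walk(dir, skip, stats) {
  const out = [];
  for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
    stats.visited += 1;
    const full = path.join(dir, entry.name);
    if (entry.isDirectory()) {
      if (skip.has(entry.name)) continue; // pruned before any recursion
      out.push(...walk(full, skip, stats));
    } else {
      out.push(full);
    }
  }
  return out;
}

// Build a small throwaway tree: 2 source files, 50 files under node_modules.
const root = fs.mkdtempSync(path.join(os.tmpdir(), 'negation-demo-'));
fs.mkdirSync(path.join(root, 'src'));
fs.mkdirSync(path.join(root, 'node_modules', 'dep'), { recursive: true });
fs.writeFileSync(path.join(root, 'src', 'a.js'), '');
fs.writeFileSync(path.join(root, 'src', 'b.js'), '');
for (let i = 0; i < 50; i += 1) {
  fs.writeFileSync(path.join(root, 'node_modules', 'dep', `f${i}.js`), '');
}

const isJs = (f) => f.endsWith('.js');
const inNodeModules = (f) =>
  path.relative(root, f).split(path.sep)[0] === 'node_modules';

// Post-read negation: read everything, then throw away the negated matches.
const postStats = { visited: 0 };
const postRead = walk(root, new Set(), postStats)
  .filter(isJs)
  .filter((f) => !inNodeModules(f))
  .sort();

// Pruned traversal: the ignored directory is skipped while walking.
const prunedStats = { visited: 0 };
const pruned = walk(root, new Set(['node_modules']), prunedStats)
  .filter(isJs)
  .sort();

console.log(postRead.length, postStats.visited); // prints "2 55"
console.log(pruned.length, prunedStats.visited); // prints "2 4"

// Clean up the temporary tree.
fs.rmSync(root, { recursive: true, force: true });
```

Both approaches return the same two files; only the number of filesystem entries touched differs, and that gap grows with the size of the negated directory.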

UltCombo commented 9 years ago

@contra should we still support regex (post-read) filtering after implementing node-glob's ignore option?

yocontra commented 9 years ago

@UltCombo Yeah I think so

UltCombo commented 9 years ago

Alright then, I'll have a PR in 30~45mins.

yocontra commented 9 years ago

@UltCombo Any update on this?

UltCombo commented 9 years ago

@contra https://github.com/wearefractal/glob-stream/pull/40#issuecomment-81765600

thirdcreed commented 8 years ago

I nearly have a working PR for this. There's one broken test, 'should return a file name stream with dotfiles negated', which is timing out at 2000ms.

Any thoughts on what that might be? For some reason the stream is not emitting anything. https://github.com/thirdcreed/glob-stream/blob/master/index.js

thirdcreed commented 8 years ago

Even when I set the timeout for that test to 15 seconds, it doesn't emit. It runs in under 1 sec when using glob.sync, so I know something's up.

node -p 'require("glob").sync("/home/thirdcreed/Projects/glob-stream/test/fixtures/*swag",{ignore:["/home/thirdcreed/Projects/glob-stream/test/fixtures/**"],dot:true,cwd: "/home/thirdcreed/Projects/glob-stream/test",cwdbase:false,nonull:false})'

That returns [ ], like it should. The ignore actually works: random text in the ignore would let .swag in, but the above command does not. Those are the exact parameters being sent into glob.Glob();

The ignore code is 8 days old, so there could be a bug in there. I'm starting to dive in, but any kind of expertise here would be much appreciated. Thanks!

luengnat commented 8 years ago

Any update on this issue? I'm encountering a similar problem, though without negation.

yocontra commented 8 years ago

I think this is the same as another ticket - the ** is being expanded by recursing the fs (this is how globs work). Using !node_modules instead of !node_modules/** should fix it.