jakejs / jake

JavaScript build tool, similar to Make or Rake. Built to work with Node.js.
http://jakejs.com
Apache License 2.0
1.97k stars 190 forks source link

Error: RegExp too big when running WatchTask... #304

Open busticated opened 8 years ago

busticated commented 8 years ago

hey there -

thanks for the awesome tool! i've been using it for a while now and it's been delightfully reliable and helpful. except today. hehe! anyway, i'm trying to get a simple WatchTask working and i keep hitting an odd error:

jake aborted.
SyntaxError: Invalid regular expression:
// huuuuuuge regex here :)
RegExp too big
    at RegExp.test (native)
    at FileList.shouldExclude (/path/to/my/project/node_modules/jake/node_modules/filelist/index.js:217:23)
    at /path/to/my/project/node_modules/jake/node_modules/filelist/index.js:188:24
    at Array.filter (native)
    at _resolveExclude (/path/to/my/project/node_modules/jake/node_modules/filelist/index.js:187:33)
    at FileList.resolve (/path/to/my/project/node_modules/jake/node_modules/filelist/index.js:267:23)
    at self.(anonymous function) [as slice] (/path/to/my/project/node_modules/jake/node_modules/filelist/index.js:103:14)
    at FileList.toArray (/path/to/my/project/node_modules/jake/node_modules/filelist/index.js:277:20)
    at jsHint (/path/to/my/project/Jakefile:33:33)
    at new WatchTask (/path/to/my/project/node_modules/jake/lib/watch_task.js:63:16)
    at api.watchTask (/path/to/my/project/node_modules/jake/lib/api.js:368:12)
    at Object.<anonymous> (/path/to/my/project/Jakefile:30:1)
    at Module._compile (module.js:456:26)
    at Object.Module._extensions..js (module.js:474:10)
    at Module.load (module.js:356:32)
    at Function.Module._load (module.js:312:12)

here's my task:

watchTask(['build:css'], function (){
    this.watchFiles.include(['public/css/main.less']);
    console.log(this);
    console.log(this.watchFiles.toArray());
    console.log('this is never output (>_<)  ');
});

the call to .toArray() causes the error - either when run from the task's definition function or when executing the base .watch() command here --> https://github.com/jakejs/jake/blob/master/lib/watch_task.js#L70

if i override the default include and exclude patterns before defining my watch task:

jake.WatchTask.DEFAULT_INCLUDE_FILES = [];
jake.WatchTask.DEFAULT_EXCLUDE_FILES = [];

every thing works great! :-D

jake --version > 8.0.12
node -v > v0.10.40
os-x 10.10.5 (yosemite)

if you want to suggest some approaches to fixing, i'm happy to dig in and / or provide additional debug info.

-matt

mde commented 8 years ago

In this case, it sounds like your huge regex is genuinely too big. Searching around the Web yields various results about issues in both FF and Node for this. The reason overriding the default pattern fixes it in your case might be because it shortens the resulting regex enough for it to work.

The only possible approach I can think of for fixing it would involve breaking things up into multiple regular expressions for each time things are added to the FileList, but that would require some substantial refactoring of the code. I'm happy to help if you're interested in giving this a try.

busticated commented 8 years ago

hmm... i see. fwiw, this is a bit cleaner / clearer and accomplishes the same thing as twiddling the DEFAULT_INCLUDE_FILES and DEFAULT_EXCLUDE_FILES fields:

watchTask('less', ['build:css'], function(){
    this.watchFiles.clearInclusions();
    this.watchFiles.clearExclusions();
    this.watchFiles.include(['public/css/**/*.less']);
});

so maybe there's some value in supporting something like .includeOnly() which would clear / reset before adding... but, yeah, not crazy about it.

conceptually, you could only apply the defaults if the user has NOT set up anything more specific - e.g. after the definition fn call here lib/watch_task.js#L63 - but i'm not seeing an easy way to detect that state beyond simply checking for the presence of the definition fn in the first place (iow, if no definition is provided, use default includes & excludes). and, yeah, that doesn't really solve things - just makes it a bit easier to fix if you bump up against it.

just looking at the giant regex, i see almost all of it is made up of .git and node_modules paths - stuff like:

(\.git\/objects\/71\/2cf7c556fb44b3277d1642fd11de828b043974)|(\.git\/objects\/71\/36641afa468cdb16f3f760adcd9a2a9640f6c6)

and

(node_modules\/bignum\/build\/gyp-mac-tool)|(node_modules\/bignum\/examples)|(node_modules\/bignum\/examples\/gen\.js)|(node_modules\/bignum\/examples\/perfect\.js)|(node_modules\/bignum\/examples\/simple\.js)|(node_modules\/bignum\/index\.js)|(node_modules\/bignum\/node_modules)|(node_modules\/bignum\/node_modules\/nan)

etc, etc.

Is there a way to skip the globbing step and just provide a simple regex for the .git and node_modules cases? Aim being to reduce the above to something like (^\.git\/)|(^node_modules\/). That could very well solve things for the vast majority of uses (since now the generated regex is much, much smaller).

-matt

mde commented 8 years ago

I think the simple regex for those two directories might be a good way to avoid the crazy lengthy final regex. Definitely worth a try. I've added you to the JakeJS org -- please feel free to give it a go. If the end result is good, and passes tests, we can add it to a next release.

busticated commented 8 years ago

sounds good. thanks :)