sighjs / sigh

multi-process expressive build system for the web and node.js, built using baconjs observables

EMFILE: too many files open #51

Closed jamesj2 closed 7 years ago

jamesj2 commented 8 years ago

I have a large legacy application which I'm trying to build into a dist package using Sigh.js and JSPM. I like the ideas behind Sigh.js, and my initial test shows it can handle much faster incremental builds than gulp. But it seems all the files are being opened at once, which causes an "Error: EMFILE: too many open files, open" error.

I'm running:

sigh.js

// select is used in the pipeline below, so it is declared alongside glob and write
var glob, select, write;

module.exports = function(pipelines) {
  pipelines['build-php'] = [
    glob('app/**/*.php'),
    select({ projectPath: /^(?!app\/vendor).*$/ }),
    select({ projectPath: /^(?!app\/public\/jspm_packages).*$/ }),
    select({ projectPath: /^(?!app\/public\/min).*$/ }),
    write('dist')
  ];
};

result

> sigh build-php
 ! error: pipeline build-php
 ! Error: EMFILE: too many open files, open '...\acp_database.php'
 ! Error: EMFILE: too many open files, open '...\acp_database.php'
    at Error (native)

After some web searching, it seems graceful-fs is a drop-in replacement for fs made to handle issues like this.
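
For context, EMFILE is raised when a process goes over its limit of open file descriptors, which usually happens when many asynchronous opens are started at once. The sketch below is not how graceful-fs works internally (it queues and retries opens that fail); it shows a simpler, stdlib-only alternative that caps how many reads are in flight. The helper name readAllLimited and its limit parameter are hypothetical and not part of sigh or graceful-fs.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical helper: read every file in `paths`, keeping at most `limit`
// reads (and therefore file descriptors) in flight at any moment.
async function readAllLimited(paths, limit) {
  const results = new Array(paths.length);
  let next = 0;

  // Each worker pulls the next unread index until the list is exhausted.
  async function worker() {
    while (next < paths.length) {
      const index = next++;
      results[index] = await fs.promises.readFile(paths[index], 'utf8');
    }
  }

  const workerCount = Math.max(1, Math.min(limit, paths.length));
  await Promise.all(Array.from({ length: workerCount }, worker));
  return results;
}

// Usage: create a few temporary files and read them two at a time.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'emfile-demo-'));
const files = ['a', 'b', 'c', 'd', 'e'].map((name) => {
  const file = path.join(dir, name + '.txt');
  fs.writeFileSync(file, 'contents of ' + name);
  return file;
});

const pending = readAllLimited(files, 2).then((contents) => {
  fs.rmSync(dir, { recursive: true, force: true });
  return contents;
});
```

Results come back in the same order as the input paths, regardless of which read finishes first.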

insidewhy commented 8 years ago

Yeah sigh is much faster for a number of reasons. Need to promote it better or something ;) This error comes from the chokidar package, maybe they switched to graceful-fs in a later version or it's been forked or something.


insidewhy commented 8 years ago

Ah I can fix it, not sure if I can fix the case when you use -w though. Not sure if fs.watch will be able to handle it or not.

jamesj2 commented 8 years ago

Wow, you're fast too! Are you using Chokidar for watching files as well? I did find a graceful-fs fork in graceful-chokidar.

insidewhy commented 8 years ago

I don't think that fork handles fs.watch. I'll push out a release with graceful-fs switched in for fs in the glob plugin and see what happens from there. It should fix the non--w case at least.

insidewhy commented 8 years ago

You should probably also get rid of all the selects and pass multiple wildcards to glob instead.

insidewhy commented 8 years ago

And having .*$ at the end of a regex is the same as just omitting it.
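
A quick way to check the point about the trailing .*$ is to run both forms of one of the patterns against a few paths. This is only a sketch: the sample paths are made up, and the equivalence holds for a yes/no match on single-line strings such as file paths (a string containing a newline could behave differently, since . does not match newlines).

```javascript
// One of the patterns from the pipeline, with and without the trailing `.*$`.
const withTail = /^(?!app\/vendor).*$/;
const withoutTail = /^(?!app\/vendor)/;

// Hypothetical sample paths.
const samples = [
  'app/vendor/lib/foo.php',
  'app/controllers/home.php',
  'app/public/index.php',
];

// For each path, record the answer given by each pattern.
const results = samples.map((p) => [p, withTail.test(p), withoutTail.test(p)]);
console.log(results);
```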

insidewhy commented 8 years ago

Okay the error comes from node-glob in this case. Damn.

insidewhy commented 8 years ago

Actually now I think it comes from sigh-core. Sorry for all the noise, I shouldn't really work on stuff until I actually have time for it ;) Will definitely be pushing out some kind of fix this evening.

insidewhy commented 8 years ago

I need to modify sigh-core's API a bit, it currently reads files synchronously. This should help speed things up. How are you invoking jspm? I've been working on a sigh-jspm plugin.

jamesj2 commented 8 years ago

Actually I haven't gotten that far yet. I saw there was a sigh-jspm module but didn't realize it was still a work in progress. Below is currently what I'm using in gulp, it's not gulp specific. And thanks for all your hard work!

import jspm from 'jspm';

// `paths` is assumed to be defined elsewhere (not shown in this snippet).
var builder = new jspm.Builder();
builder.bundle(`${paths.app.entryPoint}`, `${paths.publicDirName}/bundle.js`, {
  minify: true,
  sourceMaps: true,
  node: true
});

insidewhy commented 8 years ago

I was working on a patch for jspm/systemjs to allow files to be passed via the API rather than having to rely on stuff being read from the fs. Another possibility is for the plugin to just write the files to shared memory and have jspm read from there. Or to be super simple, call it after a write().

insidewhy commented 8 years ago

Oh that patch got applied but I'm not sure if it was enough to do what I wanted.

insidewhy commented 8 years ago

Hm, it's weird: the only thing this could come from is readFileSync, yet the fact that it's synchronous should in theory mean the FD is opened and closed by the time it's finished, and that it blocks all other FD opens during that time. So I guess node is hanging onto the FDs for a while after the readFileSync is finished... maybe?

insidewhy commented 8 years ago

I'll need to run some tests with readFileSync, if it really does hold onto FDs past returning then I'll need to change sigh-core's API in a way that will break a bunch of plugins.

insidewhy commented 8 years ago

I've run some tests with fs.readFileSync... looping over 100,000 files... and I don't get the EMFILE. So it seems the error does not come from the fs.readFileSync. So where the hell does it come from...
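
A check along these lines could be sketched as follows. This is not the actual test that was run: the file count is reduced from 100,000 to a hypothetical 2,000, and the temporary directory and file names are made up for illustration.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical file count; the comment above mentions 100,000.
const FILE_COUNT = 2000;

// Create the files in a fresh temporary directory.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'readfilesync-demo-'));
const files = [];
for (let i = 0; i < FILE_COUNT; i++) {
  const file = path.join(dir, 'file-' + i + '.txt');
  fs.writeFileSync(file, String(i));
  files.push(file);
}

// Read every file synchronously; each call opens, reads and closes its
// descriptor before returning, so descriptors should not accumulate.
let bytesRead = 0;
let error = null;
try {
  for (const file of files) {
    bytesRead += fs.readFileSync(file).length;
  }
} catch (err) {
  error = err; // an EMFILE here would point at readFileSync
} finally {
  fs.rmSync(dir, { recursive: true, force: true });
}

console.log(error ? error.code : 'read ' + files.length + ' files without error');
```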

Does the error you pasted above not come with a stack trace? I think in the latest sigh you should get stack traces with all errors, so I'm not sure if you omitted it or were on an earlier sigh.

insidewhy commented 8 years ago

One other thing... using reject would be more readable... you wouldn't have to use a negative-look-ahead assertion regex then. And did you try using multiple glob patterns instead of selecting after the fact?
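
To illustrate why reject reads more easily, the sketch below compares the three chained negative look-ahead filters with a single positive pattern. It does not use sigh: the select and reject functions here are hypothetical stand-ins that filter plain path strings, whereas the real plugins filter pipeline events by properties such as projectPath. The sample paths are made up.

```javascript
// Hypothetical stand-ins for the filtering done by select and reject:
// select keeps paths that match, reject drops paths that match.
const select = (paths, regex) => paths.filter((p) => regex.test(p));
const reject = (paths, regex) => paths.filter((p) => !regex.test(p));

const paths = [
  'app/index.php',
  'app/vendor/autoload.php',
  'app/public/jspm_packages/system.php',
  'app/public/min/index.php',
  'app/models/user.php',
];

// Three chained selects with negative look-aheads, as in the original pipeline.
let selected = select(paths, /^(?!app\/vendor)/);
selected = select(selected, /^(?!app\/public\/jspm_packages)/);
selected = select(selected, /^(?!app\/public\/min)/);

// One reject with a plain alternation covering the same three prefixes.
const rejected = reject(paths, /^app\/(vendor|public\/(jspm_packages|min))/);

console.log(selected);
console.log(rejected);
```

Both approaches keep the same paths; the reject form states the excluded prefixes directly instead of negating each one.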

insidewhy commented 8 years ago

Ah you were using the latest sigh, did the errors not come with a stack trace?

insidewhy commented 7 years ago

@jamesj2 If you can get back to me on my questions then I'll reopen this issue.