dojo / util

Dojo 1 - build utilities. Please submit bugs to https://bugs.dojotoolkit.org/
https://dojotoolkit.org/

fixes #17105, Prevent 60s closure compiler timeout #25

Closed. gitgrimbo closed this pull request 9 years ago.

gitgrimbo commented 9 years ago

https://bugs.dojotoolkit.org/ticket/17105

https://github.com/google/closure-compiler/blob/v20120917/src/com/google/javascript/jscomp/Compiler.java#L677

Compiler.disableThreads() means the compiler doesn't use the ExecutorService to run compilation tasks, which avoids the 60-second timeout after the final compilation task (otherwise the ExecutorService sits waiting for new tasks that never arrive).
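To illustrate the mechanism, here's a minimal Rhino sketch (hypothetical, not code from the build): a cached thread pool's idle workers are non-daemon threads with a 60-second keep-alive, so the JVM can't exit until they time out or the pool is shut down.

// Hypothetical demo of the hang, runnable in the Rhino shell.
var pool = java.util.concurrent.Executors.newCachedThreadPool();
pool.execute(new java.lang.Runnable({
    run: function () { print("compile task done"); }
}));
// The script now lingers ~60 seconds after the task completes;
// uncommenting the next line makes the JVM exit immediately.
// pool.shutdown();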

disableThreads() is the default behaviour. A build profile has to explicitly enable threads with:

  optimizeOptions: {
    useThreads: true
  }
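For context, this is roughly where that sits in a full build profile (a hypothetical sketch; the other fields are ordinary profile boilerplate, not part of this change):

// Hypothetical profile opting back in to closure's threading
// (and accepting the 60s shutdown delay) under this change.
var profile = {
    basePath: "..",
    releaseDir: "./release",
    optimize: "closure",
    layerOptimize: "closure",
    optimizeOptions: {
        useThreads: true
    }
};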

CLA signed as gitgrimbo.

dylans commented 9 years ago

Well, he is making it a configuration setting rather than always switching to single-threaded. I would guess that for simpler builds this would be much faster, and for more complex builds you'd probably still want to use multithreading, even with the 60-second delay, but that's just a guess.

wkeese commented 9 years ago

OK, but can you fix it so we can have multithreading without the 60s hang?

dylans commented 9 years ago

Good point... more recent versions of Closure Compiler allow setting the timeout: https://github.com/google/closure-compiler/blob/master/src/com/google/javascript/jscomp/Compiler.java#L674-L676. The version linked from the patch is from 3 years ago and didn't expose this option as a public API.

gitgrimbo commented 9 years ago

Yeah, there's probably a way to stop the 60-second timeout and preserve closure's threading, but it's not pretty.

At the end of optimizeRunner.js add this code:

if (jscomp) {
    // Reflectively grab closure's private, static ExecutorService and
    // shut it down, so its idle non-daemon worker threads don't keep
    // the JVM alive for the 60-second keep-alive period.
    var compilerClass = java.lang.Class.forName("com.google.javascript.jscomp.Compiler");
    var compilerExecutorField = compilerClass.getDeclaredField("compilerExecutor");
    compilerExecutorField.setAccessible(true);
    // compilerExecutor is static, so Field.get() ignores its argument.
    var compilerExecutor = compilerExecutorField.get(null);
    compilerExecutor.shutdown();
}

This attempts to shut down the ExecutorService of the closure compiler. Each spawned JVM (that runs optimizeRunner.js) shuts down its own ExecutorService. It's not pretty, as it's poking around in closure's private parts.

My first test using maxOptimizationProcesses: 1 has Node spawning 1 JVM (as expected), and looking at the JVM with jconsole I see two Threads named "jscompiler" (a good first indication that closure's Java multi-threading is preserved).

The build terminates the expected 60 seconds sooner.

If the brittleness of the solution can be tolerated (it could break if closure changes its internals), it is probably a better solution (for the general case) than the one I first offered.

wkeese commented 9 years ago

OK, clearly com.google.javascript.jscomp.Compiler is missing the proper API to shut itself down. Too bad.

I guess your new solution is a bit better than the original.

I'd also be OK with just disabling threads unconditionally, in the name of simplicity, and also so that starting a build doesn't lock up your whole machine. Are we getting a significant performance increase from the multi-threading?

gitgrimbo commented 9 years ago

As far as I can tell, there isn't a simple answer to the performance question. I'm trying to compile (no pun intended) a table of build times while varying several settings, and the numbers aren't straightforward. E.g. increasing maxOptimizationProcesses beyond 1 gives very poor results with Java 8 from what I can see, and 64-bit JVMs seem to give worse results than 32-bit JVMs.

If I get some more time I'll continue collating results. It would be nice to have some automated testing that records the performance of various builds (using closure, uglify1, uglify2, maybe shrinksafe) and shows the time taken and the relative minified file sizes.

The results would be useful as guidance, but my gut feeling is that you'd need to experiment to find the fastest build settings for your own machine.

Another interesting performance property of some builds I've tried is the copy transform (copy.js). I'm seeing poor performance on Windows, and it seems to be because each copy is implemented by spawning off to cmd:

args = has("is-windows") ?
    ["cmd", "/c", "copy", fileUtils.normalize(resource.src), fileUtils.normalize(resource.dest), errorMessage, bc, cb] :
    ["cp", resource.src, resource.dest, errorMessage, bc, cb];

I've hacked at this a bit and seem to get much better performance using the readFile/writeFile methods available in util/build/fs. E.g.

    function copyFileWithUtilFs(src, dest, cb) {
        var fs = utilFs;
        // Normalize path separators on Windows before hitting the filesystem.
        if (has("is-windows")) {
            src = fileUtils.normalize(src);
            dest = fileUtils.normalize(dest);
        }
        // Read then write in-process, instead of spawning cmd/cp per file.
        fs.readFile(src, "utf8", function(err, contents) {
            if (err) {
                cb(err);
            } else {
                fs.writeFile(dest, contents, "utf8", cb);
            }
        });
    }
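For illustration, a hypothetical call site standing in for the spawn (resource and cb as in the copy transform):

// Hypothetical replacement for the "cmd /c copy" / "cp" spawn.
copyFileWithUtilFs(resource.src, resource.dest, function (err) {
    cb(err); // report success/failure back to the build pipeline
});

One caveat worth noting: reading and writing with "utf8" assumes text resources; binary files (e.g. images) would need a copy that skips the encoding.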
gitgrimbo commented 9 years ago

Hi, I've added another branch with the 'shutdown' fix. Let me know if this would make for a better pull request.

https://github.com/gitgrimbo/util/tree/fix17105-closure-executor-service-shutdown

wkeese commented 9 years ago

Wait, are you saying that when you set maxOptimizationProcesses > 1 that the builder spawns multiple JVMs, and then each JVM spawns multiple threads? That sounds like overkill.

About the readFile/writeFile thing, that sounds like a good enhancement too. IIRC the copying is the main performance cost (besides this 60s hang you are talking about). I remember Rawld reporting performance numbers from tests that didn't use the copy transform, so the numbers seemed meaningless to me. So my guess is that he didn't think about optimizing the copy.

gitgrimbo commented 9 years ago

Yes, for a build run using Node there will be maxOptimizationProcesses JVMs, as sendJob spawns that many java processes. Using jconsole I have yet to see more than 2 jscompiler Threads per JVM, though. Given the changes to the internals of closure, it might be worth trying a single spawned JVM and letting that JVM manage its own Threads based on the amount of work sent to it.

And for the file I/O, I guess the quickest I/O is the I/O that's never done. Is there a way of reusing previously built output for iterative builds during development (maybe using timestamps to detect changes)? Perhaps keep an in-memory cache of written files, so that subsequent parts of the build pipeline can 'read' from memory?
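A minimal sketch of the timestamp idea (hypothetical helper, assuming the Node-based runner): skip the copy when the destination is at least as new as the source.

var fs = require("fs");

// Hypothetical helper: only run copyFn when dest is missing or older
// than src, comparing mtimes.
function copyIfStale(src, dest, copyFn, cb) {
    fs.stat(src, function (err, srcStat) {
        if (err) { return cb(err); }
        fs.stat(dest, function (destErr, destStat) {
            if (!destErr && destStat.mtime >= srcStat.mtime) {
                return cb(null); // destination up to date; skip the copy
            }
            copyFn(src, dest, cb);
        });
    });
}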

Of course you can always clean your output folder to do a full clean build, and a CI server would do this.

wkeese commented 9 years ago

From looking quickly at the builder, it doesn't seem like it's taking advantage of the multi-threading feature at all. It calls:

compiler.compile(externSourceFile, jsSourceFile, options);
writeFile(dest, copyright + built + compiler.toSource() + mapTag, "utf-8");

It looks like compile() is a blocking call, so there's only one call to compile() at a time? Perhaps I missed something.

Your idea about the incremental file copying might make sense. But I noticed https://bugs.dojotoolkit.org/ticket/17105 mentioned just 22 seconds for a build. If that's typical then it's probably not worth further optimization.

gitgrimbo commented 9 years ago

Yes Compiler.compile() is blocking from what I can gather, and as a consequence of this I don't think the ExecutorService can scale up its Threads unless there are multiple instances of Compiler to push work/tasks to it. And having multiple instances of Compiler would basically mean handling Threads yourself anyway (one Thread per Compiler). If this is the case then I think the spawn approach is probably still the best way to get closure to use more cores.
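To make that concrete, "handling Threads yourself" would look something like this hypothetical Rhino sketch (layers, externs, makeSourceFile and options are placeholders, not builder APIs):

// Hypothetical sketch: one Compiler instance (and one Thread) per layer.
var threads = layers.map(function (layer) {
    return new java.lang.Thread(new java.lang.Runnable({
        run: function () {
            var compiler = new Packages.com.google.javascript.jscomp.Compiler();
            compiler.compile(externs, makeSourceFile(layer), options);
            // ...write out compiler.toSource() for this layer
        }
    }));
});
threads.forEach(function (t) { t.start(); });
threads.forEach(function (t) { t.join(); });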

What I would say here is that the distribution of work amongst these spawned processes is basic round-robin, and so some processes finish their set of work before others. There's not a lot in it, but the 'slowest' process can finish several seconds after the 'quickest'.

For some quick and dirty timings, I built a bare-bones app with the dojo, dijit and dojox packages and these were the times on my i7 PC (4 physical cores, 2 logical cores per physical). The first number is the maxOptimizationProcesses setting (this used optimize=closure, no layer optimisation, Java 1.8.0_45 64-bit).

and for 'fun'

So the sweet spot (for me) was maxOptimizationProcesses=2, and the worst settings (apart from 10) were 8 and the default value of -1 (which in my case would be 8).

Yet the same test with Java 1.8.0_45 32-bit and maxOptimizationProcesses=-1 gave 24.815 seconds! And I consistently see the 32-bit JVMs do better with larger values for MOP, and the 64-bit JVMs do worse.

I think the 60 second saving by explicitly shutting down the ExecutorService is the easy boost to go for here.

The ticket linked above says my build takes 100 seconds now compared to 22 seconds before, so assuming 60 seconds of that is due to the timeout, that possibly leaves 18 seconds of additional performance regression to account for (in that particular case).

And with respect to file copies and general disk I/O, my work laptop has fairly aggressive anti-virus settings, and so there is quite a large penalty for disk I/O (some operations more than others, e.g. the cmd /c copy I mentioned earlier). I appreciate this might not be the common case, but I am feeling it acutely at the moment.

wkeese commented 9 years ago

> I think the 60 second saving by explicitly shutting down the ExecutorService is the easy boost to go for here.

OK... but it sounds like you can get the same exact effect by just calling disableThreads(). Did I miss something?

> So the sweet spot (for me) was maxOptimizationProcesses=2, and the worst settings (apart from 10) were 8 and the default value of -1 (which in my case would be 8).

Thanks for testing. Very strange results. Maybe it's better to set the default maxOptimizationProcesses to 1, so people using the 64-bit java don't get burned. But whatever.

> And with respect to file copies and general disk I/O, my work laptop has fairly aggressive anti-virus settings, and so there is quite a large penalty for disk I/O

Ah right, same with IBM laptops. Well, if you want to add in that feature to only copy when needed, that would be great.

> What I would say here is that the distribution of work amongst these spawned processes is basic round-robin, and so some processes finish their set of work before others. There's not a lot in it, but the 'slowest' process can finish several seconds after the 'quickest'.

OK, that's a bit disappointing. Also, doesn't splitting the work across multiple compilers preclude using closure's global optimization, where it minifies the names of Object properties in addition to local variables?

gitgrimbo commented 9 years ago

> OK... but it sounds like you can get the same exact effect by just calling disableThreads(). Did I miss something?

Yes, the 60 second saving can be achieved using either approach. However, if disableThreads() is used then there may be impacts for Java 6 builds (https://github.com/google/closure-compiler/blob/v20120917/src/com/google/javascript/jscomp/Compiler.java#L186):

  /**
   * Under JRE 1.6, the JS Compiler overflows the stack when running on some
   * large or complex JS code. When threads are available, we run all compile
   * jobs on a separate thread with a larger stack.
   *
   * That way, we don't have to increase the stack size for *every* thread
   * (which is what -Xss does).
   */

For this reason, the shutdown() approach is possibly safer. We'd just have to make a note of what we do and ensure it's still valid if Dojo's version of closure is upgraded.
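One way to make that note self-enforcing (a hypothetical hardening of the snippet above, not part of either branch): wrap the reflection in try/catch so a future closure upgrade degrades to the old 60-second wait instead of breaking the build.

if (jscomp) {
    try {
        var compilerClass = java.lang.Class.forName("com.google.javascript.jscomp.Compiler");
        var executorField = compilerClass.getDeclaredField("compilerExecutor");
        executorField.setAccessible(true);
        executorField.get(null).shutdown();
    } catch (e) {
        // Internals changed (field renamed/removed): fall back to the
        // old behaviour, i.e. wait out the 60-second keep-alive.
        print("Could not shut down closure's ExecutorService: " + e);
    }
}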

> OK, that's a bit disappointing. Also, doesn't splitting the work across multiple compilers preclude using closure's global optimization, where it minifies the names of Object properties in addition to local variables?

Not sure, but I don't think any of the changes mentioned in this PR (or https://github.com/gitgrimbo/util/tree/fix17105-closure-executor-service-shutdown) would alter the existing behaviour of the Dojo build with respect to closure's global optimization.

wkeese commented 9 years ago

OK, that all makes sense. I posted your other branch as PR #26.