sq / JSIL

CIL to Javascript Compiler
http://jsil.org/

Google Closure Compiler #522

Open Mike-E-angelo opened 10 years ago

Mike-E-angelo commented 10 years ago

Something to consider as part of the packaging process of a potential application written for JSIL is the Google Closure Compiler: https://developers.google.com/closure/compiler/

You can download the latest version here: http://dl.google.com/closure-compiler/compiler-latest.zip

I would like to look into incorporating this into the Corlib build process as part of the "smoke test" mentioned in issue #514, as it actually verifies JavaScript syntax in addition to minifying/obfuscating the code. It turns out that the JavaScript that JSIL emits is very close to being compliant. There were two issues that I discovered (and would be interested in fixing):

1) Arguments. Parameters in methods named "arguments" produce the exception "ERROR - Shadowing "arguments" is not allowed."
2) Some variable setters get emitted as getters. Some examples:

parentTypeCount.get() = ((parentTypeCount.get() + 1) | 0), 
...
count.get() = ((count.get() + 1) | 0), 

and

state.getOffset(8) = $T00().$Cast($T00().op_Addition(state.getOffset(8), num2));
state.getOffset(16) = $T00().$Cast($T00().op_Addition(state.getOffset(16), num3));
state.getOffset(24) = $T00().$Cast($T00().op_Addition(state.getOffset(24), num4));
state.getOffset(32) = $T00().$Cast($T00().op_Addition(state.getOffset(32), num5));
state.getOffset(40) = $T00().$Cast($T00().op_Addition(state.getOffset(40), num6));
state.getOffset(48) = $T00().$Cast($T00().op_Addition(state.getOffset(48), num7));
state.getOffset(56) = $T00().$Cast($T00().op_Addition(state.getOffset(56), num8));

Commenting these out (and renaming "arguments" parameters to "args"), however, allows the compiler to work, and reduces the code size by 25-33%. In case anyone is interested, here is the command I used from the command prompt to get things working (I ran into out-of-memory and ECMA compliance exceptions):

java -Xmx1024m -jar compiler.jar "C:\PATH\TO\mscorlib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089.js" --js_output_file "C:\PATH\TO\OUTPUT\mscorlib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089.js" --language_in=ECMASCRIPT5

If this is something we would like to incorporate into the Corlib build process (and JSIL in general), I can look into fixing these issues and update this thread accordingly.

kg commented 10 years ago

Those .get and .getOffset calls are actually pointer operations, so that's probably a pointer use case that doesn't work correctly. The MS BCL uses pointers a lot, and also uses some really obscure opcodes.
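For illustration, here is a minimal sketch of why the quoted lines fail validation and what a valid form could look like. The `Cell` class and its `set` method are hypothetical stand-ins, not JSIL's actual pointer runtime; the point is only that a call result cannot be an assignment target, so a write has to go through some explicit setter.

```javascript
// Hypothetical stand-in for a pointer/reference cell; not JSIL's real runtime type.
class Cell {
  constructor(value) { this.value = value; }
  get() { return this.value; }
  set(value) { this.value = value; return value; }
}

const count = new Cell(41);

// Invalid as emitted: a call result is not an assignment target.
//   count.get() = ((count.get() + 1) | 0);

// Valid equivalent: write through an explicit setter instead.
count.set((count.get() + 1) | 0);

console.log(count.get()); // 42
```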

Are you running closure in the mode that renames methods? I don't think that mode will actually produce working JSIL applications. Did they add a mode that's less extreme than that but still minifies? The last time I tried, the only mode they had that wouldn't break JSIL didn't produce much of a size savings.

Thanks for doing the investigation!

Mike-E-angelo commented 10 years ago

OK, that's good to know about the pointers. I will look into seeing if I can just JSIgnore those uses.

I haven't looked much into Closure's options, but I am pretty sure they are the same. Is there another minification process/application that you have thought of or had in mind as far as a packager goes? I do know there is the PNG-based compression which I ultimately hope to land on. Something like this: http://creativejs.com/2012/06/jsexe-javascript-compressor/ (though this also uses Closure).

In any case, this is not as much of a priority right now, but it's nice to think about in the scheme of things.

kg commented 10 years ago

I suspect most fancy compressors aren't going to be worth the effort; gzip js does a LOT and gzip minified (via closure etc) JS usually gets you the rest of the way. Last I checked gzipping unminified JSIL output was fairly competitive with a gzipped .net assembly. If we shave 100kb off the download but then decompression takes an extra 200ms, we've failed, especially if we're having to use canvas getImageData to do it.
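A rough way to check this kind of claim on a given file is to compare raw and gzipped sizes before and after minification. The sketch below uses Node's built-in `zlib` on a synthetic, repetitive input; the input string and the whitespace-only "minification" are assumptions made for the example, not real JSIL output or a real minifier.

```javascript
const zlib = require("zlib");

// Synthetic, repetitive input standing in for generated JavaScript.
const line = "count = ((count + 1) | 0);\n";
const source = line.repeat(500);

// Crude "minification": drop whitespace only (no renaming of any kind).
const stripped = source.replace(/\s+/g, "");

// Size in bytes of the gzip-compressed text.
function gzipSize(text) {
  return zlib.gzipSync(Buffer.from(text, "utf8")).length;
}

const report = {
  raw: Buffer.byteLength(source),
  rawGzip: gzipSize(source),
  stripped: Buffer.byteLength(stripped),
  strippedGzip: gzipSize(stripped),
};
console.log(report);
```

Running the same measurement on actual compiler output, with and without Closure, would show how much minification adds on top of gzip for that file.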

Most minifiers mangle variable, property and function names, which will break JSIL. If you can turn off property renaming, though, everything else should be fine - the problem is that last time I checked, closure's name mangling was all-or-nothing.
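As a sketch of how property renaming can break code, consider a member that is looked up by its string name at run time. Whether JSIL's runtime does exactly this is an assumption here (the thread does not say why renaming breaks it); the objects and the `invokeByName` helper are hypothetical.

```javascript
// Hypothetical type with a member that is looked up by its string name.
const original = {
  op_Addition(a, b) { return (a + b) | 0; },
};

// What an aggressive minifier might produce: the property is renamed to "a",
// while string literals elsewhere in the program still say "op_Addition".
const renamed = {
  a(x, y) { return (x + y) | 0; },
};

// Looks a member up by name and calls it, failing loudly if it is missing.
function invokeByName(target, name, ...args) {
  const member = target[name];
  if (typeof member !== "function") {
    throw new TypeError("No member named " + name);
  }
  return member.apply(target, args);
}

console.log(invokeByName(original, "op_Addition", 2, 3)); // 5

let failed = false;
try {
  invokeByName(renamed, "op_Addition", 2, 3);
} catch (e) {
  failed = true; // the lookup by the original name no longer resolves
}
console.log(failed); // true
```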

Re: pointers, I think if you disable unsafe code, any function using pointers will automatically be JSIgnored. I have pointers enabled for corlib in part because I wanted to make sure I wasn't generating completely broken code for their pointer functions (even if, as you've noticed, it's not CORRECT code.)

Mike-E-angelo commented 10 years ago

Yes, agreed on the straight-up gzip... so very impressive.

It appears there are 3 levels now: https://developers.google.com/closure/compiler/docs/compilation_levels

So if I am reading this correctly, SIMPLE_OPTIMIZATIONS is the default and is what is producing the extra 25-33% size reduction that I have been seeing.
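Rather than relying on the default, the level can be stated explicitly with Closure's `--compilation_level` flag. Below is a minimal Node sketch that builds the argument list for the command quoted earlier in the thread; the `closureArgs` helper and the `in.js`/`out.js` paths are placeholders invented for this example.

```javascript
// Builds the java argument list; the paths passed in are placeholders.
function closureArgs(inputPath, outputPath) {
  return [
    "-Xmx1024m",
    "-jar", "compiler.jar",
    "--js", inputPath,
    "--js_output_file", outputPath,
    "--language_in=ECMASCRIPT5",
    "--compilation_level", "SIMPLE_OPTIMIZATIONS",
  ];
}

const args = closureArgs("in.js", "out.js");
console.log(["java", ...args].join(" "));

// To actually run it (requires Java and compiler.jar in the working directory):
//   require("child_process").execFileSync("java", args, { stdio: "inherit" });
```

Passing the arguments as an array avoids shell quoting problems with assembly file names that contain spaces and commas.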

It should be noted that the mscorlib that I have compiled was with my own .jsilconfig, which used the default value of disabling unsafe code, so it appears there is additional JSIgnoring to do.

Additionally, there is still the matter of arguments -> args renaming. I can look into this if you would like.

kg commented 10 years ago

The new optimization level sounds like the ticket, glad to hear they added it. I'd be happy with adding support to stock JSILc for using closure if you put it in a particular location relative to JSILc.exe (much like how it finds ffmpeg, etc.)

Just add 'arguments' to the list of reserved keywords in Util.cs to fix that name collision.
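The idea behind that fix can be sketched as follows. This is written in JavaScript for illustration only: the actual list lives in JSIL's Util.cs (C#) and is not reproduced here, and both the `RESERVED` subset and the "$" prefix rule are assumptions, not JSIL's real escaping scheme.

```javascript
// Illustrative subset only; the real list is in Util.cs and is not reproduced here.
const RESERVED = new Set(["arguments", "eval", "this", "var", "function"]);

// Hypothetical escaping rule: prefix colliding names with "$".
function escapeIdentifier(name) {
  return RESERVED.has(name) ? "$" + name : name;
}

console.log(escapeIdentifier("arguments")); // $arguments
console.log(escapeIdentifier("count"));     // count
```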

Mike-E-angelo commented 10 years ago

Coooooool... will do!

sebastiang commented 10 years ago

Apologies for wandering in uninvited... just thought I'd point out that the Closure compiler does a lot more than minify and validate. It also applies a number of optimizations to the code, most notably dead code elimination.

kg commented 10 years ago

Yeah, but as mentioned before they break all sorts of ECMAScript spec'd behavior, so they can't be applied to output from a compiler like JSIL.

panesofglass commented 10 years ago

Have you considered updating the generated JavaScript output to align more closely with ECMAScript 6 syntax? It seems the class and module syntax enhancements, in particular, could make some of the generated JavaScript simpler. Google's Traceur compiler or even TypeScript might be good targets for something like this. I think Traceur recently added support for type annotations similar to those provided in the Closure compiler.

(NOTE: in suggesting this, I understand this to be a gargantuan task, and though I would like to, I do not have the time to help prove out this idea. Apologies if this was an unhelpful comment.)

kg commented 10 years ago

ES6 class/module will not be of any particular use for JSIL code. They don't align with how it works very well at all.

You could definitely generate .d.ts files that provide type information. That's something I would leave up to people who plan to mix typescript with jsil output.

(Sent by email on Thu, Jul 17, 2014 at 9:26 AM, in reply to Ryan Riley's comment above.)