pass AST trees between tasks for speed

dylang commented 10 years ago

Manifesto

Many of us left or avoided Java because we didn't want the pain of a compile step before seeing changes. Our blessed modern JavaScript tooling comes with the pain of potentially lengthy compiles. The ultimate goal of this ticket is to investigate the possibility of reducing the JavaScript compile/build time to near-zero.

Concept

We have multiple tasks in a row that must each re-build an AST tree. Can we remove this redundancy?

Example build steps:

grunt-jshint
grunt-jscs
grunt-browserify
grunt-ngmin
grunt-uglify

Today

Each of these tasks opens files, possibly based on a glob pattern, builds a JavaScript AST, does some work, converts the AST back to JavaScript, and writes the files to the file system (or passes a stream of the file contents to the next task). Then the next task re-reads the files, parses them, etc.

Hypothesis

If an AST tree could be passed from task to task it would cut down time considerably. No need to re-parse content that was machine-built in a previous step.

Possible implementation

I think plugins would need a standard way to tell the task runner what they prefer as input and what they can output.

{
  preferInput: [ 'ast', 'stream', 'glob' ],
  availableOutput: [ 'ast', 'stream', 'filesystem' ]
}

The task runner would look at the plugins and tell the plugin what to output based on what the next task can take as input.
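The negotiation described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the plugin names and their capabilities are made up, `negotiateOutputs` is a hypothetical helper, and pairing a `glob` input with a `filesystem` output is my reading of the proposal rather than anything specified.

```javascript
// Hypothetical plugin descriptors. The field names come from the proposal;
// the plugin names and capability lists are invented for illustration.
const plugins = [
  { name: 'lint', preferInput: ['ast', 'stream', 'glob'], availableOutput: ['ast', 'stream'] },
  { name: 'bundle', preferInput: ['ast', 'stream'], availableOutput: ['ast', 'stream', 'filesystem'] },
  { name: 'minify', preferInput: ['stream'], availableOutput: ['stream', 'filesystem'] }
];

// Assumption: a task that reads a glob is satisfied by a task that writes to disk.
const outputForInput = { ast: 'ast', stream: 'stream', glob: 'filesystem' };

// For each task, pick the output that matches the first input format the NEXT
// task prefers. The last task has no consumer, so it uses finalOutput.
function negotiateOutputs(pipeline, finalOutput = 'filesystem') {
  return pipeline.map((plugin, i) => {
    const next = pipeline[i + 1];
    const wanted = next
      ? next.preferInput.map((format) => outputForInput[format])
      : [finalOutput];
    const output = wanted.find((format) => plugin.availableOutput.includes(format));
    if (output === undefined) {
      throw new Error('No compatible format after task "' + plugin.name + '"');
    }
    return { name: plugin.name, output: output };
  });
}

const plan = negotiateOutputs(plugins);
console.log(plan.map((step) => step.name + ' -> ' + step.output).join(', '));
```

With these descriptors the first task hands an AST to the second, and the pipeline only falls back to a stream where the consumer cannot take an AST.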

When changes happen

When code changes, ideally only the files that were changed should need to be re-read, parsed, and the changes injected into the already-existing AST tree.
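A minimal sketch of the "only re-parse what changed" half of this idea, assuming a per-file cache keyed by a hash of the file contents. `AstCache` and `standInParse` are hypothetical names, and the parser is a stand-in so the example runs without any dependency. It does not attempt the harder part, injecting changes into an existing combined tree.

```javascript
const crypto = require('crypto');

// Hypothetical cache: one entry per file path, valid while the content hash matches.
class AstCache {
  constructor(parse) {
    this.parse = parse;       // any function that turns source text into an AST
    this.entries = new Map(); // path -> { hash, ast }
  }

  get(path, source) {
    const hash = crypto.createHash('sha1').update(source).digest('hex');
    const cached = this.entries.get(path);
    if (cached && cached.hash === hash) {
      return cached.ast; // unchanged file: skip the parse
    }
    const ast = this.parse(source);
    this.entries.set(path, { hash: hash, ast: ast });
    return ast;
  }
}

// Stand-in parser that counts how often it is called (not a real JavaScript parser).
let parseCount = 0;
const standInParse = (source) => {
  parseCount += 1;
  return { type: 'Program', length: source.length };
};

const cache = new AstCache(standInParse);
cache.get('a.js', 'var a = 1;');
cache.get('b.js', 'var b = 2;');
cache.get('a.js', 'var a = 1;'); // unchanged, served from the cache
cache.get('b.js', 'var b = 3;'); // changed, parsed again
console.log(parseCount); // 3
```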

Future

Tasks could understand what parts of the AST tree were modified and act accordingly. For example, Uglify could re-compress just the changed content.

Disclaimer

I have not tried any of this in code. I'm not even sure if AST trees can be shared from one tool to another. If they can't today, this might encourage standardizing the practice. I'm sure my team would be fine switching from browserify and uglify to alternative tools that did the same thing but were significantly faster because they could share AST trees.

I know AST tree is like saying "ATM machine", but it seemed more clear for a wider audience.

tkellen commented 10 years ago

That's a pretty interesting idea Dylan! We're definitely going to include an in-memory representation of files in the spec (most likely a combination of https://github.com/tkellen/record & http://github.com/wearefractal/vinyl). I could see task authors storing a file's AST as a property on a record--then supporting tasks could optionally try to read that before they parse the contents of the file again.
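As a rough sketch of the "AST as a property on a record" idea: the record shape (`path`, `contents`, `ast`) and the `getAst` helper below are assumptions for illustration, not the API of record or vinyl, and the parser is again a stand-in.

```javascript
// Hypothetical record shape: { path, contents, ast? }.
// A supporting task reads record.ast if present and parses only when it is missing.
function getAst(record, parse) {
  if (record.ast === undefined) {
    record.ast = parse(record.contents);
  }
  return record.ast;
}

// Stand-in parser that logs each call (not a real JavaScript parser).
const calls = [];
const standInParse = (source) => {
  calls.push(source);
  return { type: 'Program', source: source };
};

const record = { path: 'app.js', contents: 'var x = 1;' };
const first = getAst(record, standInParse);  // parses and stores the AST
const second = getAst(record, standInParse); // reuses the stored AST
console.log(first === second, calls.length); // true 1
```

A task that rewrites `contents` without updating the tree would need to clear `record.ast`, otherwise later tasks would read a stale AST.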

villadora commented 10 years ago

I was trying to write a tool to do obfuscation stuff on JavaScript, and there are many passes that need to be managed during processing. The pipeline in a compiler is very similar to tasks: a normal minify process includes translating into an AST, eliminating unused code, reordering declarations, and finally generating code. And I'm pretty sure many passes could be reused for different purposes.

I know node-task is not targeting passes at that level of detail, but is there any plan to provide some way to extend the input between tasks? It could be more widely useful.
