shapesecurity / shift-parser-js

ECMAScript parser that produces a Shift format AST
http://shift-ast.org/parser.html
Apache License 2.0

Proposal: named nodes #370

Closed bakkot closed 5 years ago

bakkot commented 6 years ago

Currently, if you want to refer to a particular node in an AST which is parsed from source text, you have to encode the path to it directly. This is awkward for several reasons: it requires writing down the names of all intermediate nodes, many of which you may not care about, and it is fragile to changes in source which may add or reorder nodes.

I propose that we add a way of naming some nodes which would allow one to address those nodes by name, rather than path.

Specifically, I propose we add a tool (possibly in this repository, possibly elsewhere) which would consume source text containing comments of the form `/* shift # ExpressionStatement # foobar */` (syntax bikesheddable), where `ExpressionStatement` is any terminal node type (i.e. not an abstract type like `Statement`) and `foobar` is any identifier. The tool would produce a map from those identifiers to their corresponding nodes in the AST, where "corresponding" means "immediately following".
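To make the idea concrete, here is a minimal sketch of what such a `getNames` could look like. Nothing here is an existing Shift API. It assumes each comment has the shape `{ text, start: { offset }, end: { offset } }`, that `locations` maps nodes to `{ start: { offset }, end: { offset } }`, and that "immediately following" can be approximated as "the nearest node of the requested type starting at or after the end of the comment". `MARKER` and `collectNodes` are names invented for illustration.

```js
// Hypothetical sketch of the proposed `getNames`; not an existing Shift API.
// Accepts the comment text with or without its /* */ delimiters.
const MARKER = /^\s*(?:\/\*)?\s*shift\s*#\s*(\w+)\s*#\s*([A-Za-z_$][\w$]*)\s*(?:\*\/)?\s*$/;

// Pre-order walk over own enumerable properties, collecting anything that
// looks like a node (an object with a string `type`). Assumes the AST is a tree.
function collectNodes(value, out = []) {
  if (Array.isArray(value)) {
    for (const child of value) collectNodes(child, out);
  } else if (value !== null && typeof value === 'object') {
    if (typeof value.type === 'string') out.push(value);
    for (const key of Object.keys(value)) collectNodes(value[key], out);
  }
  return out;
}

function getNames(tree, locations, comments) {
  const nodes = collectNodes(tree).filter(node => locations.has(node));
  const nodeMap = new Map();
  for (const comment of comments) {
    const match = MARKER.exec(comment.text);
    if (match === null) continue; // an ordinary comment, not a marker
    const [, typeName, name] = match;
    if (nodeMap.has(name)) {
      throw new Error(`Duplicate node name ${name}`);
    }
    // Pick the node of the requested type that starts closest after the comment.
    // On ties the strict comparison keeps the first one seen, i.e. the outermost.
    let best = null;
    for (const node of nodes) {
      if (node.type !== typeName) continue;
      const start = locations.get(node).start.offset;
      if (start < comment.end.offset) continue;
      if (best === null || start < locations.get(best).start.offset) best = node;
    }
    if (best === null) {
      throw new Error(`No ${typeName} node follows the marker for ${name}`);
    }
    nodeMap.set(name, best);
  }
  return nodeMap;
}
```

A stricter version would probably reject markers that are separated from their node by anything other than whitespace, rather than silently picking a later node.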

Some details:


Use cases:

Templating:

In the simplest sort of template, you would just have placeholder nodes which would be replaced (possibly in a way depending on the original node). If you're willing to have mutation, you can do something like

const template = `
function go() {
  /* shift # ArrayExpression # emptyArray */ [].forEach(doSomething);
}
`;

const {tree, locations, comments} = parseScriptWithLocation(template);
const nodeMap = getNames(tree, locations, comments); // `getNames` being the function I am proposing in this issue 

nodeMap.get('emptyArray').elements = [new Shift.LiteralNumericExpression({ value: 42 })];

In practice you'd probably like to be immutable and a bit more reusable; such a templater might look like this:

```js
// `reduce` and `LazyCloneReducer` are from shift-reducer; `Shift` is shift-ast
class SimpleTemplater {
  constructor(nodeMap /* Map */, replacers /*: { [name]: Node -> Node } */) {
    this.nodeReplacers = new WeakMap;
    for (let [name, node] of nodeMap) {
      if (!(name in replacers)) {
        throw new Error(`No replacer was provided for ${name}`);
      }
      this.nodeReplacers.set(node, replacers[name]);
    }
  }

  static apply(replacers, tree, nodeMap) {
    return reduce(new SimpleTemplater(nodeMap, replacers), tree);
  }
}

for (const typeName in require('shift-spec')) {
  SimpleTemplater.prototype['reduce' + typeName] = function(node, arg) {
    if (this.nodeReplacers.has(node)) {
      return this.nodeReplacers.get(node)(node, arg);
    }
    // it might be a good idea to do this ourselves, using the "checked" AST constructors,
    // so that problems in the template replacers would get caught here rather than
    // producing incorrectly typed ASTs
    return LazyCloneReducer.prototype['reduce' + typeName](node, arg);
  };
}
```

Usage would look like

```js
const template = `
function go /* shift # FormalParameters # params */() {
  /* shift # ArrayExpression # emptyArray */ [].forEach(doSomething);
}
`;

const templateReplacers = {
  emptyArray: () => new Shift.ArrayExpression({
    elements: [new Shift.LiteralNumericExpression({ value: 42 })],
  }),
  params: params => new Shift.FormalParameters({
    items: params.items.concat([new Shift.BindingIdentifier({ name: 'newParam' })]),
    rest: params.rest,
  }),
};

const { tree, locations, comments } = parseScriptWithLocation(template);
const nodeMap = getNames(tree, locations, comments);

const realizedTemplate = SimpleTemplater.apply(templateReplacers, tree, nodeMap);
```

which would result in the same AST as

```js
function go(newParam) {
  [42].forEach(doSomething);
}
```

A more advanced version of this could allow treating certain node names as positions in lists (for nodes which can have many children, like Block), and allow inserting or removing items at that position.
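As a rough illustration of the list-position idea, the sketch below treats a named node as an index into its parent's list (for example a `Block`'s `statements`) and returns a new list with items inserted around it, or with the node removed. The helper name `spliceAtNamedNode` and its options are hypothetical, not part of any Shift package.

```js
// Hypothetical helper: uses a named node as a position in its parent's list.
// Returns a new array; the original list is left untouched.
function spliceAtNamedNode(items, namedNode, { before = [], after = [], remove = false } = {}) {
  const index = items.indexOf(namedNode);
  if (index === -1) {
    throw new Error('Named node is not an element of this list');
  }
  return [
    ...items.slice(0, index),
    ...before,
    ...(remove ? [] : [namedNode]),
    ...after,
    ...items.slice(index + 1),
  ];
}
```

A templater built on this would still need to find the parent list of each named node, which the name map alone does not provide.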

Tests:

Often in our tests we want to make an assertion about some particular node in an AST which is produced from source text. We currently just write out the path to each of those nodes. Given this framework, you could instead have a small helper like

function assertNodeEquals(src, node) {
  const { tree, locations, comments } = parseScriptWithLocation(src);
  const nodeMap = getNames(tree, locations, comments);
  if (nodeMap.size !== 1 || !nodeMap.has('node')) {
    throw new Error('Test template is malformed');
  }
  assertDeepEquals(nodeMap.get('node'), node);
}

which could then be used like

assertNodeEquals('for (/* shift # VariableDeclaration # node */ let[a] in b);', {
  type: 'VariableDeclaration',
  kind: 'let',
  declarators: [{
    type: 'VariableDeclarator',
    binding: {
      type: 'ArrayBinding',
      elements: [{
        type: 'BindingIdentifier',
        name: 'a',
      }],
    },
    init: null,
  }],
});

bakkot commented 6 years ago

Prior art: https://github.com/babel/babel/tree/master/packages/babel-template

(Well, kind of. Towards the same end, anyway.)

One useful idea from that is that the magic syntax (in our case, the magic comment) should be configurable.
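One way the magic comment could be made configurable is sketched below: a small factory that builds a marker parser from a prefix and a separator. The names `makeMarkerParser` and `escapeRegExp` and the option names are assumptions made for illustration. The sketch only parses the comment text and does not touch the AST.

```js
// Hypothetical helper: escapes a string for literal use inside a RegExp.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical factory: returns a function that parses marker comment text
// into { type, name }, or returns null for comments that are not markers.
function makeMarkerParser({ prefix = 'shift', separator = '#' } = {}) {
  const sep = `\\s*${escapeRegExp(separator)}\\s*`;
  const pattern = new RegExp(
    `^\\s*${escapeRegExp(prefix)}${sep}(\\w+)${sep}([A-Za-z_$][\\w$]*)\\s*$`
  );
  return commentText => {
    const match = pattern.exec(commentText);
    return match === null ? null : { type: match[1], name: match[2] };
  };
}
```

With the defaults this accepts the `shift # Type # name` form proposed above; passing, say, `{ prefix: 'tpl', separator: '::' }` would accept `tpl :: Block :: body` instead.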

bakkot commented 5 years ago

Done.