benjamn / recast

JavaScript syntax tree transformer, nondestructive pretty-printer, and automatic source map generator
MIT License
5.01k stars 350 forks source link

Transforming IIFE results in extra parentheses #327

Open nene opened 8 years ago

nene commented 8 years ago

When transforming normal function to arrow function:

const recast = require('recast');

const ast = recast.parse('(function(){})()');
ast.program.body[0].expression.callee.type = 'ArrowFunctionExpression';

console.log(recast.print(ast).code);

the above outputs:

((() => {}))()

instead of the expected:

(() => {})()

Recast seems to correctly detect that this IIFE needs to be wrapped in parens, but fails to detect that there are already parenthesis present. When I just let Recast print a plain AST, it gets the parenthesis right:

console.log(recast.print({
    type: 'ExpressionStatement',
    expression: {
        type: 'CallExpression',
        callee: {
            type: 'ArrowFunctionExpression',
            params: [],
            body: {
                type: 'BlockStatement',
                body: [],
            }
        },
        arguments: [],
    }
}).code);

Seems to be similar to an old issue #81

benjamn commented 8 years ago

Recast seems to correctly detect that this IIFE needs to be wrapped in parens, but fails to detect that there are already parenthesis present.

This is exactly right. Since getting this right is tricky, Recast errs on the side of adding extra parens, instead of sometimes not parenthesizing when necessary.

Detecting whether parens are already present is tricky because most parsers do not preserve information about parentheses in the AST (Babylon is the exception). If you have any good ideas for how to detect reliably whether a given node is wrapped in parentheses, please let me know!

nene commented 8 years ago

Hmm... Does this mean that there won't be this problem when running Recast with Babylon?

I've been considering switching to Babylon for some time.

benjamn commented 8 years ago

Using Babylon won't help, because I haven't added any Babylon-specific logic to Recast. I believe Recast should work equally well with all parsers, rather than implicitly preferring the parser most willing to defy conventions. If everyone agreed that parentheses should be preserved in the AST, I would love to get ParenthesizedExpression added to some kind of AST spec, and then Recast could rely on it here.

nene commented 8 years ago

Ah... Thanks for clarification.

Phoenixmatrix commented 6 years ago

We just hit this issue too running some codemods across a few hundred repos. There's a lot of things that trigger the parenthesis issue. As @benjamn mentionned, the info is in the babylon ast but not in other ASTs.

The part that I don't understand is why this affects unchanged code. If I have a no-op transform and print (not pretty print), the parenthesis gets messed up (valid, but change the original code). Since usually recast ignores unchanged nodes, I'm a little confused as to why this is happening (reading the linked issues seems to say that this confuses recast into thinking there IS a change when there is not?).

recast seems to happily reprint anything else when I don't change the AST, no matter how wildly formatted (the ast doesn't contain every space I type either, but those aren't an issue). Why are these parenthesis special?

Phoenixmatrix commented 6 years ago

Hmm, edit to above, pasting the arrow function in the example above doesn't actually change anything (thus it works correctly) unless I modify the ast node, so I think I'm having a different issue.

benjamn commented 6 years ago

Although there are definitely larger issues of over-parenthesization at play here, which I am currently working to solve, I do not think the behavior reported in this issue should be considered a bug, since

(() => {}())

would not be valid syntax, and moving the () outside the larger parenthesized expression is a matter of taste and human judgment where Recast cannot safely take a position.

The ArrowFunctionExpression needs parentheses because it's the callee of a CallExpression, plain and simple. Achieving the ideal output of (() => {})() would require an additional step of actively removing an outer layer of parentheses that was already present. I think that's going "too far," since it means changing the formatting of unmodified ancestor nodes (something Recast strives not to do, in general).

With that said, if you want to make sure code like this gets re-parenthesized, you can explicitly force pretty-printing the CallExpression by setting callExpression.original = null:

const ast = recast.parse('(function(){})()');
const callExpression = ast.program.body[0].expression;
callExpression.callee.type = 'ArrowFunctionExpression';
callExpression.original = null;
console.log(recast.print(ast).code);

which will print

(() => {})();

as desired.

nene commented 6 years ago

Thanks @benjamn, resetting .original on parent node solves my issue.

I also agree that Recast behaves correctly here.