fstirlitz / luaparse

A Lua parser written in JavaScript
https://fstirlitz.github.io/luaparse/
MIT License
459 stars 91 forks source link

Reflect parentheses that discard extra values in the AST #79

Open fstirlitz opened 4 years ago

fstirlitz commented 4 years ago

Currently, the latest released version does not distinguish e.g. f() from (f()) in the AST. These expressions are not equivalent; the former evaluates to all the return values of f(), and the latter evaluates to only the first of them (the rest are discarded). Current git master distinguishes them by the inParens field, but that is just a workaround for #24 and I expect to remove it when that issue is fixed properly.

A more principled approach to distinguishing these expressions is desired. I may keep the inParens field, or I may introduce a new kind of node that expresses the operation of discarding extra values.

fstirlitz commented 4 years ago

There are two kinds of expressions which may evaluate to a value tuple: function calls and the vararg literal, .... There are also five places where discarding extra values matters, all of which are in a context where a multiple-value expression appears as the last one in a comma-delimited list of expressions:

In the last of these, parenthesising the only expression also has the effect of preventing tail-call optimisation. This suggests a representation as a new type of node, so that tail calls can be detected simply by checking whether a function-call node appears as a direct child of the return node. This will be a breaking change for users who are not prepared to deal with unknown node types; luamin by @mathiasbynens would be one. In this case it's arguably a good thing anyway, because as of now luamin will just perform minification incorrectly on code affected by this defect.

Alternatively, I might treat parentheses as a circumfix unary operator, and piggyback on the UnaryExpression node type to reflect it in the AST.

Lua manual links: 5.1 §2.5, 5.2 §3.4, 5.3 §3.4

arnoson commented 2 years ago

Im currently updating a fork of lua-fmt that uses on old version of this library. The needsParens method currenlty relies on node.inParens. Could you point me in the right direction of how to rewrite this methods using the new version of luaparse?

needsParens() {
  const parent = this.getParent() as luaparse.Node
  const value = this.getValue() as luaparse.Node

  let inParens = false
  switch (value.type) {
    case 'FunctionDeclaration':
    case 'Chunk':
    case 'Identifier':
    case 'BooleanLiteral':
    case 'NilLiteral':
    case 'NumericLiteral':
    case 'StringLiteral':
    case 'VarargLiteral':
    case 'TableConstructorExpression':
    case 'BinaryExpression':
    case 'LogicalExpression':
    case 'UnaryExpression':
    case 'MemberExpression':
    case 'IndexExpression':
    case 'CallExpression':
    case 'TableCallExpression':
    case 'StringCallExpression':
      inParens = value.inParens || false
  }

  if (parent) {
    /*
          If this UnaryExpression is nested below another UnaryExpression, wrap the nested expression in
          parens. This not only improves readability of complex expressions, but also prevents `- -1` from
          becoming `--1`, which would result in a comment.
      */
    if (value.type === 'UnaryExpression' && parent.type === 'UnaryExpression') {
      inParens = true
    }
  }

  return inParens
}
arnoson commented 2 years ago

I found a solution (mixing prettier's lua and php plugin approaches). The only thing that is missing for now is distinguishing fun() from (fun()) so I can keep the parens. Thanks for the great work!

dlbd commented 2 years ago

Any update on this? I was working on a custom Lua minifier with a somewhat better compression ratio than luamin, and am currently stuck with a luaparse version from 2019 (before .inParens was removed).

cohler commented 1 year ago

This is a MAJOR BUG giving a totally useless and INCORRECT parse. Why hasn't anyone fixed this yet?