fstirlitz / luaparse

A Lua parser written in JavaScript
https://fstirlitz.github.io/luaparse/
MIT License
459 stars 91 forks source link

Regression when parsing TableKey #86

Closed matthargett closed 4 years ago

matthargett commented 4 years ago

In prettier/plugin-lua, there is a test in the suite that processes fine with luaparse 0.2.1, but after upgrading to 0.3.0 is gets a null for the value instead.

Focused sample code that triggers the problem:

local function get_status(status)
  local smap = {
    ['success'] = 'success',
    ['pending'] = 'pending',
    ['failure'] = 'failure',
    ['error'] = 'error',
    ['true'] = 'success',
    ['false'] = 'failure',
    ['nil'] = 'error',
  }
  return smap[tostring(status)] or 'error'
end

Node output for 0.2.1, node that value is correctly the 'success' string key:

   {
      type: 'TableKey',
      key: {
        type: 'StringLiteral',
        value: 'success',
        raw: "'success'",
        range: [ 56, 65 ]
      },
      value: {
        type: 'StringLiteral',
        value: 'success',
        raw: "'success'",
        range: [ 69, 78 ]
      },
      range: [ 55, 78 ]
    }

After upgrading to 0.3.0, note that the value is now incorrectly null:

    {
      type: 'TableKey',
      key: {
        type: 'StringLiteral',
        value: null,
        raw: "'success'",
        range: [ 56, 65 ]
      },
      value: {
        type: 'StringLiteral',
        value: null,
        raw: "'success'",
        range: [ 69, 78 ]
      },
      range: [ 55, 78 ]
    }

The good news is that everything else seems to work great. Just due to this null value, there is a crash in just this one file in the core test suite.

matthargett commented 4 years ago

Did a search through the commits, and the last good commit is fb941d31e60a2600de34faf18be4f4dd5e21e8c7 -- it's the most recent commit to luaparse.js that introduces the problem. Will see if I can fix it.

matthargett commented 4 years ago

I see now. Defaulting the string encoding to 'none' sets discardStrings to true. I can set the encoding manually in prettier/plugin-lua, but I'd still recommend that the behavior not regress.

fstirlitz commented 4 years ago

Duplicate of #82. Also related to #68.

The previous behaviour was wrong too (and silently so), so it’s disputable if this is really a regression. But ultimately there are no good solutions here. I chose this one to make sure the user is aware there is a problem here at all instead of relying on a subtly wrong solution. (Previously I mangled strings into a WTF-8, which got me comments like https://github.com/fstirlitz/luaparse/issues/77#issuecomment-548989589; and further before, string literals weren’t even consistently decoded: https://github.com/fstirlitz/luaparse/pull/34#discussion_r72285922.)

I think a code reformatter should be able to do its work without understanding the contents of the string anyway; the .raw property contains the string literal as it appears in the input, which the reformatter can emit unchanged. It won’t be able to perform transformations that break strings apart or transform their particular representation in source code (like "\xf0\x9f\x92\xa9\x22""\u{1f4a9}"), but then those probably shouldn’t be performed anyway.