fstirlitz / luaparse

A Lua parser written in JavaScript
https://fstirlitz.github.io/luaparse/
MIT License

StringLiteral.value is sometimes null #110

Closed by Hexcede 3 years ago

Hexcede commented 3 years ago

I'm not sure whether this happens for all string literals, but the following code produces a StringLiteral node whose value is null:

local someValue = something:Something("abc")

The result is the following:

{
  type: 'StringLiteral',
  value: null,
  raw: '"abc"'
}

This is with Lua version 5.1.

As a workaround, I am currently using this (probably messy) regex replacement to strip the surrounding string delimiters: stringLiteral.raw.replace(/\[?\[?["']?([^"'\]]+)["']?\]?\]?/m, "$1")

fstirlitz commented 3 years ago

Duplicate of #82. This is intentional, and a result of the fact that JavaScript strings are wide, 16-bit strings (customarily UTF-16) while Lua strings are bytestrings. This mismatch creates a problem of what to do when there isn’t an obvious mapping between the two: e.g. what (JavaScript) wide string should correspond to the (Lua) bytestring '\255', or what (Lua) bytestring should correspond to the (JavaScript) wide string '\udead', if any.
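To make the mismatch concrete, here is a minimal sketch using only the standard TextDecoder and TextEncoder globals available in Node.js. It is not luaparse code; it assumes UTF-8 as the candidate encoding, and the variable names are made up for illustration. Neither direction round-trips, which is why there is no single obvious value to return:

```javascript
// The Lua bytestring '\255' is the single byte 0xFF, which is never valid UTF-8.
const luaBytes = Uint8Array.of(0xff);

// A non-fatal UTF-8 decoder substitutes U+FFFD (the replacement character),
// so the original byte cannot be recovered from the resulting string.
const decoded = new TextDecoder('utf-8').decode(luaBytes);

// The JavaScript string '\udead' is a lone surrogate, which has no UTF-8 form.
const loneSurrogate = '\udead';

// TextEncoder also substitutes U+FFFD, encoded as the bytes EF BF BD,
// so the original code unit is lost in this direction as well.
const encoded = new TextEncoder().encode(loneSurrogate);

console.log(decoded === '\ufffd');
console.log(Array.from(encoded, (b) => b.toString(16)).join(' '));
```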

If you want to access raw string values, with all escape sequences interpreted, you need to choose the encoding in which you read the Lua source code and set the encodingMode option accordingly. You will then get string literal values interpreted according to the same encoding.
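As a rough sketch of what this looks like on the consuming side: assume the encodingMode you pick maps every source byte to the code point with the same number (a Latin-1-style mapping; check the luaparse README for the exact option values it supports). Under that assumption, each UTF-16 code unit of a parsed value stands for exactly one Lua byte, and Node's 'latin1' Buffer encoding converts it back. The helper luaValueToBytes is hypothetical and not part of luaparse, and the parsed value is simulated here rather than produced by an actual parse:

```javascript
// Hypothetical helper, not part of luaparse. Assumes `value` was produced under
// a Latin-1-style mapping, i.e. every code unit is in the range 0x00..0xFF.
function luaValueToBytes(value) {
  for (let i = 0; i < value.length; i++) {
    if (value.charCodeAt(i) > 0xff) {
      throw new RangeError('code unit at index ' + i + ' does not fit in one byte');
    }
  }
  return Buffer.from(value, 'latin1');
}

// Bytes of the Lua literal "\255caf\195\169": 0xFF, then "caf", then UTF-8 for "é".
const originalBytes = Buffer.from([0xff, 0x63, 0x61, 0x66, 0xc3, 0xa9]);

// Simulated parser output under the assumed mapping: one code unit per byte.
const asParsedValue = originalBytes.toString('latin1');

// The bytes come back unchanged, including the 0xFF that UTF-8 cannot represent.
const recovered = luaValueToBytes(asParsedValue);

// Any part that really is UTF-8 text can then be decoded explicitly.
const text = recovered.subarray(1).toString('utf8');

console.log(recovered.equals(originalBytes), text);
```

The point of the design is that decoding to text becomes an explicit step taken by the caller, who knows which encoding the Lua program actually uses, instead of a guess made by the parser.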

I’m not too happy about this either, but this is the best I could do in this language. If this were Python, I would simply require using the bytes type instead of str for parser input. JavaScript has no such thing, though.