getify / literalizer

Specialized heuristic lexer for JS to identify complex literals

Multi-line string literals "lose" their multi-line'ness #23

Closed getify closed 11 years ago

getify commented 11 years ago

file.js:

var a = "this is \
a multi-line string";

and then

LIT.lex(file_js_contents);

results in:

[
    {
        "type": 0,
        "val": "var a = "
    },
    {
        "type": 2,
        "val": "\"this is a multi-line string\""
    },
    {
        "type": 0,
        "val": ";"
    }
]

Note that the string-literal value in this output contains no new-line character. That matches how a JS engine interprets the same code: a backslash immediately followed by a line break is a line continuation, so the engine never sees a new-line inside the string. The lexer output, however, no longer records that the literal spanned two lines in the source.
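To illustrate the engine behavior, here is a small sketch that evaluates the same literal and checks for a new-line. The variable names are only for illustration, and `eval` is used purely so the source text and the resulting value can be compared side by side.

```javascript
// Source text of the literal from file.js: a backslash followed by a real line break.
const literalSource = '"this is \\\na multi-line string"';

// Let the engine interpret the literal (eval is used only for this demonstration).
const engineValue = eval(literalSource);

// The source text contains a line break, but the evaluated string does not.
const sourceHasNewline = literalSource.includes("\n");
const valueHasNewline = engineValue.includes("\n");

console.log(sourceHasNewline, valueHasNewline, engineValue);
```

The line break exists only in the source text, which is exactly the information that a value-only token drops.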

It appears that esprima and acorn account for this (and possibly other "information loss") by also tracking a raw value on nodes. literalizer should do the same, at least for string-literal segments.
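A minimal sketch of what such a segment could look like, assuming a `raw` field is stored next to `val`. The helper `stringSegment` and the `raw` field name are hypothetical and are not part of literalizer's existing API. `type: 2` is taken from the output above.

```javascript
// Hypothetical helper (not literalizer's actual implementation): builds a
// string-literal segment that keeps both the processed value and the raw source text.
function stringSegment(rawText) {
    // Walk every backslash escape. Drop line continuations (a backslash followed
    // by a line terminator) and leave all other escapes untouched.
    const val = rawText.replace(
        /\\(\r\n|[\n\r\u2028\u2029]|[\s\S])/g,
        (match, next) => (/^[\n\r\u2028\u2029]/.test(next) ? "" : match)
    );
    return {
        type: 2,      // string-literal segment type, as in the output above
        val: val,     // value with line continuations removed
        raw: rawText  // assumption: original source text, preserved verbatim
    };
}

const segment = stringSegment('"this is \\\na multi-line string"');
console.log(JSON.stringify(segment, null, 4));
```

With both fields present, a consumer that needs to reproduce the original source can use `raw`, while one that only needs the literal's content can keep using `val`.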