RyanMarcus / dirty-json

A parser for invalid JSON
GNU Affero General Public License v3.0

Single quote is not escaped and totally breaks the export #16

Closed Benraay closed 5 years ago

Benraay commented 5 years ago

In this test I just removed the second quote after coolCSS: { "key": "<div class="coolCSS>some text</div>" }. When parsing, it exports this: "{"

It should only escape the quote, like this: { "key": "<div class=\"coolCSS>some text</div>" }

RyanMarcus commented 5 years ago

Sorry, the GitHub notification got sent to my spam folder -- I'll look into this a little later this week.

RyanMarcus commented 5 years ago

There's a lot of ambiguity here.

{ "key": "xxx"yyy", "key2": "zzz" }

Ideally, we want this to parse to {"key": "xxx\"yyy", "key2": "zzz" }, but we could also interpret it as {"key": "xxx\"yyy\", \"key2\": \"zzz\" }" } (i.e., everything was meant to be included in the value of key).

Depending on context, sometimes the parser will pick the first option, other times the second... this is the nature of ambiguous data. I think making the heuristics more intelligent would require a substantial rewrite of the parser.
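To make the ambiguity concrete, here is a minimal sketch that feeds the malformed input and both candidate repairs to the built-in JSON.parse. It does not call dirty-json at all; the names tryParse, readingOne, and readingTwo are hypothetical and exist only for this illustration.

```javascript
// The malformed input from the comment above, plus the two candidate repairs.
const malformed = '{ "key": "xxx"yyy", "key2": "zzz" }';

// Reading 1: the stray quote is escaped, and key2 stays a separate key.
const readingOne = '{"key": "xxx\\"yyy", "key2": "zzz" }';

// Reading 2: everything after the stray quote belongs to the value of "key".
const readingTwo = '{"key": "xxx\\"yyy\\", \\"key2\\": \\"zzz\\" }" }';

// Hypothetical helper: wraps JSON.parse so a failure is reported, not thrown.
function tryParse(text) {
  try {
    return { ok: true, value: JSON.parse(text) };
  } catch (err) {
    return { ok: false, error: err.message };
  }
}

const results = {
  malformed: tryParse(malformed),
  readingOne: tryParse(readingOne),
  readingTwo: tryParse(readingTwo),
};

console.log(results.malformed.ok);                  // strict JSON rejects the input
console.log(Object.keys(results.readingOne.value)); // two keys: key, key2
console.log(Object.keys(results.readingTwo.value)); // one key: key
```

Both repairs are valid JSON, yet they produce objects with different key sets, so nothing in the input alone says which one the author meant.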

This particular bug (parsing to {) is also due to the parser thinking the last lexeme is a quoted }... so the structure never appears closed. I'll fix this, but the behavior still might not be as expected.
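As a rough illustration of how a closing brace can end up "quoted", the toy scanner below toggles an in-string flag on every unescaped double quote and counts braces seen outside strings. This is an assumption-laden sketch, not dirty-json's actual lexer; scanBraces and its return shape are hypothetical.

```javascript
// Toy scanner (NOT dirty-json's lexer): flips inString on each unescaped
// double quote and tracks brace depth only while outside a string.
function scanBraces(text) {
  let inString = false;
  let depth = 0;
  for (let i = 0; i < text.length; i++) {
    const ch = text[i];
    if (inString && ch === '\\') {
      i++; // skip the escaped character
      continue;
    }
    if (ch === '"') {
      inString = !inString;
      continue;
    }
    if (inString) continue;
    if (ch === '{') depth++;
    else if (ch === '}') depth--;
  }
  return { depth, endsInsideString: inString };
}

// Input from the original report, and the escaped form the reporter expected.
const broken = '{ "key": "<div class="coolCSS>some text</div>" }';
const escaped = '{ "key": "<div class=\\"coolCSS>some text</div>" }';

const brokenScan = scanBraces(broken);
const escapedScan = scanBraces(escaped);

console.log(brokenScan);  // depth stays at 1 and the scan ends inside a string
console.log(escapedScan); // depth returns to 0 and the scan ends outside a string
```

With the broken input, the odd number of quotes means the final quote opens a new string that swallows the closing brace, so under this naive scheme the object never appears closed.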

RyanMarcus commented 5 years ago

The bug is "fixed" in 0.7.1. Getting the semantics of this particular parse correct will have to be put off until the next major version.