maciejhirsz / json-rust

JSON implementation in Rust
Apache License 2.0

Don't fail when parsing an escaped lone surrogate #149

Open staktrace opened 6 years ago

staktrace commented 6 years ago

Recently the SpiderMonkey JS engine implemented a change to JSON.stringify that emits lone Unicode surrogates as escape sequences rather than as unescaped characters. json-rust fails to parse the resulting output. The people behind that change feel that this should be fixed in the parsing library. See https://bugzilla.mozilla.org/show_bug.cgi?id=1496747 and https://github.com/tc39/proposal-well-formed-stringify/issues/13 for additional details. I have a test case in https://github.com/staktrace/jsontest that demonstrates the error on all three Rust JSON parsing libraries.

staktrace commented 6 years ago

Sorry, I got this wrong. The suggestion was to make everything fail consistently, which I believe already happens in json-rust.

staktrace commented 6 years ago

Although... one thing that would help my use case would be the ability to have json-rust automatically replace these lone surrogates with the U+FFFD replacement character, probably behind some sort of runtime option. By my reading, the JSON syntax on JSON.org doesn't actually prohibit lone surrogates, so a document containing them can still be valid JSON, but Rust as a language doesn't allow them in strings. As the library that bridges the two, json-rust might sensibly provide alternative ways to handle this incompatibility. Something to consider.
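
To make the suggestion concrete, here is a minimal sketch of what such an option could look like, using only the Rust standard library. It assumes the parser has already collected the UTF-16 code units from consecutive `\uXXXX` escapes. The `LoneSurrogate` enum and the `decode_escaped_units` function are hypothetical names for illustration and are not part of json-rust's API.

```rust
/// Hypothetical policy for lone surrogates; NOT part of json-rust's API.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum LoneSurrogate {
    /// Reject the input (the strict behaviour).
    Error,
    /// Substitute U+FFFD for each lone surrogate.
    Replace,
}

/// Convert UTF-16 code units (assumed to come from consecutive `\uXXXX`
/// escapes) into a Rust `String`. Returns `None` when a lone surrogate is
/// found and the policy is `Error`.
pub fn decode_escaped_units(units: &[u16], policy: LoneSurrogate) -> Option<String> {
    let mut out = String::with_capacity(units.len());
    // `char::decode_utf16` pairs up valid surrogates and yields `Err`
    // for every unpaired one.
    for decoded in char::decode_utf16(units.iter().copied()) {
        match (decoded, policy) {
            (Ok(c), _) => out.push(c),
            (Err(_), LoneSurrogate::Replace) => out.push(char::REPLACEMENT_CHARACTER),
            (Err(_), LoneSurrogate::Error) => return None,
        }
    }
    Some(out)
}
```

With `LoneSurrogate::Replace`, the units `[0x0061, 0xD83D, 0x0062]` decode to `"a\u{FFFD}b"`, while `LoneSurrogate::Error` keeps the current fail-on-lone-surrogate behaviour. A properly paired `[0xD83D, 0xDE00]` decodes to a single character under either policy.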