serde-rs / json

Strongly typed JSON library for Rust
Apache License 2.0

Unescaped tab, line feed, carriage return should not be accepted in strings #90

Closed by dtolnay 8 years ago

dtolnay commented 8 years ago

From the JSON standard (ECMA-404):

Insignificant whitespace is allowed before or after any token. The whitespace characters are: character tabulation (U+0009), line feed (U+000A), carriage return (U+000D), and space (U+0020). Whitespace is not allowed within any token, except that space is allowed in strings.
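To make the rule concrete, here is a minimal sketch of the check being asked for. The helper name `first_raw_control_in_string` is hypothetical and not part of serde_json; it only tracks string boundaries and escapes, and assumes (per the JSON grammar) that any raw byte below 0x20 inside a string is invalid.

```rust
/// Hypothetical helper (not a serde_json API): returns the byte offset of the
/// first raw control character (0x00..=0x1F) that appears inside a JSON string
/// literal. It tracks only quotes and backslash escapes; it does not validate
/// the rest of the JSON text.
fn first_raw_control_in_string(input: &str) -> Option<usize> {
    let mut in_string = false;
    let mut escaped = false;
    for (i, b) in input.bytes().enumerate() {
        if !in_string {
            // Outside strings, tab/LF/CR/space are insignificant whitespace.
            if b == b'"' {
                in_string = true;
            }
            continue;
        }
        if escaped {
            // A raw control character right after a backslash is still invalid.
            if b < 0x20 {
                return Some(i);
            }
            escaped = false;
            continue;
        }
        match b {
            b'\\' => escaped = true,
            b'"' => in_string = false,
            0x00..=0x1F => return Some(i),
            _ => {}
        }
    }
    None
}

fn main() {
    // A tab between tokens is fine.
    assert_eq!(first_raw_control_in_string("[\t\"a b\"]"), None);
    // An escaped tab (backslash followed by 't') inside a string is fine.
    assert_eq!(first_raw_control_in_string("[\"a\\tb\"]"), None);
    // A raw tab inside a string should be rejected.
    assert_eq!(first_raw_control_in_string("[\"a\tb\"]"), Some(3));
}
```

Multi-byte UTF-8 sequences never contain bytes below 0x80, so scanning bytes rather than chars does not produce false positives.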

dtolnay commented 8 years ago

These are the only checks from JSON_checker that we fail. Rustc-serialize correctly rejects unescaped whitespace in strings.

oli-obk commented 8 years ago

Is it a problem if our parser/deserializer is more lenient than the standard, as long as our serializer produces correct JSON?

StefanoD commented 8 years ago

@oli-obk So, you want to guess what the sender wanted to send you? Can be dangerous...

dtolnay commented 8 years ago

I think we should aim to accept valid JSON and reject invalid JSON. I would make one exception which is I think it is okay for us to accept types other than list and map at the root level.

oli-obk commented 8 years ago

But if accepting only valid JSON requires additional code and conditions, it's additional code we need to maintain and test + it slows down the regular path. If the correct way is faster/easier (like with forbidding trailing commas), then it's fine with me.

maciejhirsz commented 8 years ago

Kind of related, I've been looking at control characters:

The control characters U+0000–U+001F and U+007F come from ASCII

0x7F is not marked as U in the LUT.

maciejhirsz commented 8 years ago

I've been roaming around, since I'm looking for a more universal testing suite for myself.

@dtolnay

I would make one exception which is I think it is okay for us to accept types other than list and map at the root level.

That's not an exception: both ECMA-404 and RFC 7159 state that a JSON text has to conform to the grammar of a JSON value, which permits strings, numbers and the 3 literals.

@oli-obk

it's additional code we need to maintain and test + it slows down the regular path.

I've done this with a LUT and it didn't slow down the regular path at all. The logic is pretty trivial, there isn't much to maintain or test.
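A rough sketch of what such a lookup table could look like, for illustration only. The names `build_table`, `NEEDS_ATTENTION` and `scan_plain` are hypothetical, and this is not the actual table from either library. It assumes the RFC 7159 string grammar, where only `"`, `\` and bytes below 0x20 interrupt a run of plain characters, so 0x7F is left unmarked.

```rust
/// Builds a 256-entry table at compile time. An entry is `true` when the byte
/// cannot appear as-is in the plain part of a JSON string: control characters
/// below 0x20, the closing quote, and the backslash that starts an escape.
const fn build_table() -> [bool; 256] {
    let mut table = [false; 256];
    let mut i = 0;
    while i < 0x20 {
        table[i] = true;
        i += 1;
    }
    table[b'"' as usize] = true;
    table[b'\\' as usize] = true;
    // 0x7F (DEL) is deliberately left unmarked: the RFC 7159 grammar allows
    // it unescaped inside strings.
    table
}

static NEEDS_ATTENTION: [bool; 256] = build_table();

/// Returns how many leading bytes of `bytes` can be copied through untouched.
/// The hot loop is a single table lookup per byte; the caller then inspects
/// the byte at the returned index to decide between "end of string",
/// "escape sequence" and "error: raw control character".
fn scan_plain(bytes: &[u8]) -> usize {
    bytes
        .iter()
        .position(|&b| NEEDS_ATTENTION[b as usize])
        .unwrap_or(bytes.len())
}

fn main() {
    assert_eq!(scan_plain(b"plain"), 5);
    assert_eq!(scan_plain(b"abc\tdef"), 3); // stops at the raw tab
    assert_eq!(scan_plain(b"say \"hi\""), 4); // stops at the quote
    assert_eq!(scan_plain(b"a\x7Fb"), 3); // DEL passes through
}
```

Because the quote and backslash already need a check on every byte, folding the control characters into the same table adds no extra branch to the common path, which is consistent with the claim above that the regular path is not slowed down.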

dtolnay commented 8 years ago

This was fixed in #98/#100.