judofyr / timeless

A mixture of a blog, wiki and CMS, inspired by Christoffer Sawicki's Termos and James Adam's Vanilla.rb
http://timeless.judofyr.net/

“JSON isn’t a JavaScript subset” #57

Closed mathiasbynens closed 13 years ago

mathiasbynens commented 13 years ago

You should add a note saying that the post is based on ES3.

Compare:

And note how both U+2028 and U+2029 are valid in ES5 string literals as part of a LineContinuation.

JSON may not be an ES3 subset, but it is an ES5 subset.

Thanks to John-David Dalton for the info.

judofyr commented 13 years ago

You're absolutely right. However, the Norwegian Constitution Day is coming up tomorrow, so I'm kinda busy. Feel free to send a pull request :-)

jdalton commented 13 years ago

Though, I don't think a LineContinuation can be created in JSONText because of the restrictions on the \ backslash character.

The key is in step 3 of ES5 section 15.12.2:

  3. Let unfiltered be the result of parsing and evaluating JText as if it was the source text of an ECMAScript Program but using JSONString in place of StringLiteral.

In the description of JSON.parse ES5 states:

JSON uses a more limited set of white space characters than WhiteSpace and allows Unicode code points U+2028 and U+2029 to directly appear in JSONString literals without using an escape sequence.

So basically JSONText treats a literal U+2028 as if it were the \u2028 escape sequence. This means that parsing '"\u2028"' (a JSON text containing the string literal "<literal line terminator>") should not throw an error, because the literal character is treated like the "\u2028" escape-sequence form.
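A minimal sketch of that behaviour, assuming an environment with a native JSON.parse. The variable names are illustrative and do not come from the spec:

```javascript
// Build a JSON text whose string contains a raw (unescaped) U+2028.
const LS = String.fromCharCode(0x2028);
const rawText = '"' + LS + '"';     // quote, literal U+2028, quote
const escapedText = '"\\u2028"';    // quote, backslash-u-2-0-2-8, quote

// JSON.parse accepts both forms and decodes them to the same string.
const fromRaw = JSON.parse(rawText);
const fromEscaped = JSON.parse(escapedText);

console.log(fromRaw === fromEscaped);                            // true
console.log(fromRaw.length, fromRaw.charCodeAt(0).toString(16)); // 1 '2028'
```

This only exercises the JSON grammar. It says nothing about whether the same text is accepted as JavaScript source.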

judofyr commented 13 years ago

Hm… I don't understand how ES5 supposedly allows it:

DoubleStringCharacter :: (See 7.8.4)
  SourceCharacter but not double-quote " or backslash \ or LineTerminator
  \ EscapeSequence
  LineContinuation

I'm not really interested in how ES5 specifies how to parse JSON, just how to parse JavaScript. It still seems to me that U+2028/U+2029 are not allowed inside a regular JS string, which means JSON is still not a subset.

jdalton commented 13 years ago

I replied to you in this screencast.

judofyr commented 13 years ago

Woah, sweet response.

I'll skip the details about "based on" vs. "subset", since my point was that people generally assume it's a subset, and I wanted to show that it actually is not.

First of all: I'm not interested in how JSON is parsed by JavaScript tools (that is, a JSON parser available in a JavaScript environment), but how JavaScript parses strings. And I believe you've proved my point: They are not consistent. Example:

a = '["Hello\u2028"]'   // String which contains an (unescaped) U+2028 literal
eval(a)         // Unexpected token ILLEGAL
JSON.parse(a)   // ok

b = '["Hello\\\u2028"]' // String which contains an escaped U+2028 literal
eval(b)         // ok
JSON.parse(b)   // Unexpected token ILLEGAL

c = '["Hello\\u2028"]'  // String which contains a hex-encoded U+2028 character
eval(c)         // ok
JSON.parse(c)   // ok

If JSON is a subset of ES5, then eval should behave as JSON.parse in all of those examples.

That said, I agree with your point that U+2028 and U+2029 are valid in JavaScript as long as they are escaped.

jdalton commented 13 years ago

> If JSON is a subset of ES5, then eval should behave as JSON.parse in all of those examples.

It's based on a subset, not a true subset.

> That said, I agree with your point that U+2028 and U+2029 are valid in JavaScript as long as they are escaped.

They are valid as part of an escape sequence or a line continuation.

jdalton commented 13 years ago

Careful with your example b: because JSON is based on a subset of JavaScript, one of its limitations is on the use of the \ backslash. So the LineContinuation in b is not allowed in JSONString.

judofyr commented 13 years ago

> It's based on a subset, not a true subset.

Exactly. Hence my post: Many assume it is a subset, but it's not. That breaks JSONP.

My response was more related to the "JSON may not be an ES3 subset, but it is an ES5 subset" claim in the original issue, since I missed how your "I don't think a LineContinuation can be created in JSONText because of the restrictions on the \ backslash character" was a way of saying "it's not an ES5 subset".

EDIT: Yes, I expected b to fail in JSON.

jdalton commented 13 years ago

I don't think you should assume JSON and JSONP have the same rules & restrictions as they are different formats.

judofyr commented 13 years ago

> I don't think you should assume JSON and JSONP have the same rules & restrictions as they are different formats.

Yes, which is why I wrote this post. Many assume JSON is a subset of JavaScript (and therefore that JSONP is simply "callback(" + json + ")"). This is not true, and people need to escape their characters properly.
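One way such escaping could look, as a sketch only. The helper name toJsonp is hypothetical, and the callback name is assumed to be trusted and already validated:

```javascript
// Hypothetical helper: serialises a value and wraps it in a JSONP callback.
// Raw U+2028 and U+2029 are replaced with their \uXXXX escapes so the result
// is accepted both as JSON and as a JavaScript string literal.
function toJsonp(callbackName, value) {
  const json = JSON.stringify(value)
    .replace(/\u2028/g, '\\u2028')
    .replace(/\u2029/g, '\\u2029');
  return callbackName + '(' + json + ')';
}

console.log(toJsonp('handleData', ['Hello\u2028']));
// prints: handleData(["Hello\u2028"])  (a backslash escape, not the raw character)
```

The escaped form is the same one used in example c above, which both eval and JSON.parse accept.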

jdalton commented 13 years ago

They could do:

"callback(JSON.parse('" + json + "'))"

assuming json has single quotes escaped
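A rough sketch of what that could involve in practice. The helper name toJsonpViaParse is hypothetical. Beyond single quotes, it also assumes that backslashes and raw U+2028/U+2029 have to be escaped, since the JSON text now sits inside a JavaScript string literal:

```javascript
// Hypothetical helper: embeds JSON text in a single-quoted JavaScript string
// literal and leaves the decoding to JSON.parse on the receiving side.
function toJsonpViaParse(callbackName, value) {
  const literal = JSON.stringify(value)
    .replace(/\\/g, '\\\\')          // backslashes first, so JSON escapes survive
    .replace(/'/g, "\\'")            // a bare single quote would end the literal
    .replace(/\u2028/g, '\\u2028')   // raw line terminators are not allowed in
    .replace(/\u2029/g, '\\u2029');  // an ES5 string literal
  return callbackName + "(JSON.parse('" + literal + "'))";
}

const script = toJsonpViaParse('handleData', { msg: 'it\'s "ok"' });
console.log(script);
```

The order of the replacements matters: doubling backslashes after escaping the quotes would corrupt the quote escapes.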

judofyr commented 13 years ago

I've already provided a solution in the article.

What exactly are you trying to say? This is not a comment form, it's an issue tracker for improving the quality of the articles. This issue is related to ES3 vs ES5, and it seems to me that there are no differences when it comes to subset-ness or not. There does not exist a valid JSON string which is invalid in ES3, but valid in ES5.

I'll update the article to mention that U+2028/9 are valid as a part of a line continuation. If you have any other suggestions, please open a new issue or update this one (if it's related to ES3 vs ES5).

jdalton commented 13 years ago

Other suggestions would be to reference something that says JSON **is** a subset of JavaScript so your assertion that

> All these years we’ve heard it over and over again: “JSON is a JavaScript subset”.

doesn't seem trumped up for the purpose of pushing a blog post.

judofyr commented 13 years ago

Two out of the three quotes I presented (from Wikipedia) said that it was a subset. But it's mostly based on what I've heard and seen in other places (I don't have "hard" proof right here).

jdalton commented 13 years ago

Sounds like a bug in the Wikipedia post, you should correct it :)

laughinghan commented 10 years ago

The official JSON Website, www.json.org:

> JSON is a subset of the object literal notation of JavaScript. Since JSON is a subset of JavaScript, it can be used in the language with no muss or fuss.

> To convert a JSON text into an object, you can use the eval() function. eval() invokes the JavaScript compiler. Since JSON is a proper subset of JavaScript, the compiler will correctly parse the text and produce an object structure.

RFC 4627, "The application/json Media Type for JavaScript Object Notation (JSON)":

> JSON's design goals were for it to be minimal, portable, textual, and a subset of JavaScript.

> JSON is a subset of JavaScript, but it is a safe subset that excludes assignment and invocation.

> A JSON text can be safely passed into JavaScript's eval() function (which compiles and executes a string) if all the characters not enclosed in strings are in the set of characters that form JSON tokens.

Of course those are all from "informative" rather than "normative" sections of the specs being quoted, but the fact that this assertion is made in both of the only official specs for JSON is pretty much as hard as proof can get that it is widely believed to be true.

@judofyr: You wrote a stupendously useful article bringing up an important and 100% correct point, and I'm incredibly sorry that people who are wrong on the Internet don't think a little harder about what they're saying before wasting your time, and I'm amazed at your graciousness in dealing with them.