Marking as not a bug because I believe this is working as expected. Consider
the following test:
public function testDecodeStringWithEscapedBackslashAndNewLines():void
{
    // Contains escaped and unescaped LF
    var innerString:String = "line1\nline2\\nmore line2";
    // Decode an object that has the inner string inside of it
    var o:* = JSON.decode( JSON.encode( innerString ) );
    assertEquals( innerString, o );
}
This test case passes. The innerString value is exactly the same after going
through an encode/decode
process.
In your example, the reason you're seeing "strange" behavior is that the escape
sequences in your string literal are processed before the string has a chance to
be decoded by the JSON library. The string you're actually passing into
JSON.decode looks like this:
"line1{newline character}line2{backslash character}nline3"
When the JSON decoder runs, it finds the backslash character immediately
followed by an "n" and converts that pair into a {newline character}. So, after
the decode, the result ends up being:
"line1{newline}line2{newline}line3"
That explains why the result differs from the original string you specified.
Remember that a string must be properly encoded in order for it to be decoded
correctly.
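For example, here is a minimal sketch of the two levels of escaping involved
(the literal below is only illustrative, not the exact input from the report,
and the import assumes the as3corelib package path):

import com.adobe.serialization.json.JSON;

// What the compiler hands to JSON.decode after it processes the source-level
// escape sequences: a quoted JSON string containing a raw newline followed by
// a backslash-n escape.
var malformedJson:String = "\"line1\nline2\\nline3\"";

// The decoder passes the raw newline through and turns the \n escape into a
// second newline, so the decoded value is:
//   line1{newline}line2{newline}line3
var decoded:* = JSON.decode( malformedJson );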
Original comment by darron.schall on 8 Jul 2009 at 7:02
So what you're saying is that JSON.decode will accept malformed JSON? Is that
how it should work? I think it's important that the decoder reject bad JSON.
We had a process that was emitting bad JSON (just like the case described in
this bug). JSON.decode would happily accept it. It wasn't until we hooked up a
Java JSON decoder (which rejected the bad JSON) that we even realized there was
a problem.
Original comment by paleozogt on 8 Jul 2009 at 7:11
I understand what you're saying now. The JSON string "line1{newline
character}line2" is technically malformed
because the {newline character} is not escaped to "\n". The JSON decoder is
not throwing an error when
processing this malformed string.
I'll update the tokenizer to throw a parse error in strict mode when unescaped
control characters (\u0000
through \u001F) are found in strings.
In non-strict mode, I won't change the behavior, because the idea of non-strict
mode is to parse the JSON as best we can even if the string is malformed.
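Assuming the decoder exposes strict mode through a flag (for example a second
parameter to JSON.decode, as in later versions of the library; the exact entry
point may differ), the difference would look roughly like this:

import com.adobe.serialization.json.JSON;
import com.adobe.serialization.json.JSONParseError;

// A JSON document whose string contains a raw, unescaped newline.
var malformedJson:String = "\"line1\nline2\"";

// Strict mode: the unescaped control character becomes a parse error.
try
{
    JSON.decode( malformedJson, true );
}
catch ( e:JSONParseError )
{
    trace( "rejected: " + e.message );
}

// Non-strict mode: the control character is passed through to the result.
var lenient:* = JSON.decode( malformedJson, false );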
Original comment by darron.schall on 8 Jul 2009 at 7:22
Fixed in r91. Modified readString in the tokenizer to look for control
characters.
Commit message: In JSON strict mode, when a string contains an unescaped control
character (0x00-0x1F), a parse error is now thrown because the spec indicates
that strings cannot contain unescaped control characters.
In non-strict mode, the error is ignored and the control character is "passed
through" to the decoded string value.
Original comment by darron.schall on 8 Jul 2009 at 7:46
Original issue reported on code.google.com by paleozogt on 30 Apr 2009 at 5:39