duckfly-tw / json-simple

JSON.simple - A simple Java toolkit for JSON
https://code.google.com/p/json-simple/
Apache License 2.0

characters \u2000 through \u20FF are being escaped, but they are not control characters #25

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?
1. Run this code:
System.out.println(JSONValue.toJSONString("\u2013\u2019\u201c\u201d"));

What is the expected output? What do you see instead?
I think you should see: "–’“”"
(four punctuation characters in quotes)
But I get this: "\u2013\u2019\u201C\u201D"

What version of the product are you using? On what operating system?
json_simple 1.1, linux

Please provide any additional information below.
I looked in the unicode spec, and they are not listed as control characters.

Original issue reported on code.google.com by nathan.k...@gmail.com on 3 Jun 2010 at 4:27

GoogleCodeExporter commented 9 years ago
It's not incorrect to escape a character even if it is not necessary according 
to the JSON spec. The purpose is to avoid issues in certain circumstances. 
But I agree to review it. May I ask what trouble escaping these characters 
causes?

Original comment by fangyid...@gmail.com on 29 Nov 2011 at 3:25

GoogleCodeExporter commented 9 years ago
If you escape a character in the payload (content of any variable) you are 
changing the content.

Changing a text from "foo - bar" to "foo \u2013 bar" does not seem like a good 
solution.

Characters between \u2000 and \u206F are general punctuation symbols.

I can't see any reason for not fixing this. 

Original comment by pablo.ba...@gmail.com on 30 May 2012 at 9:33
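
For illustration, here is a minimal sketch of the behavior requested above. It assumes a hypothetical helper class, `MinimalJsonEscaper`, which is not part of json-simple and is not how the library is implemented. It escapes only what the JSON specification requires (the double quote, the backslash, and the control characters U+0000 through U+001F) and passes every other character through unchanged, including the general punctuation block.

```java
public class MinimalJsonEscaper {

    // Hypothetical helper, not json-simple API: wraps s in double quotes and
    // escapes only the characters that JSON requires to be escaped.
    public static String quote(String s) {
        StringBuilder sb = new StringBuilder(s.length() + 2);
        sb.append('"');
        for (int i = 0; i < s.length(); i++) {
            char ch = s.charAt(i);
            switch (ch) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\b': sb.append("\\b");  break;
                case '\f': sb.append("\\f");  break;
                case '\n': sb.append("\\n");  break;
                case '\r': sb.append("\\r");  break;
                case '\t': sb.append("\\t");  break;
                default:
                    if (ch < 0x20) {
                        // Remaining control characters get a 4-digit hex escape.
                        sb.append(String.format("\\u%04X", (int) ch));
                    } else {
                        // Everything else, general punctuation included, is kept as-is.
                        sb.append(ch);
                    }
            }
        }
        sb.append('"');
        return sb.toString();
    }

    public static void main(String[] args) {
        // Same input as in the original report.
        System.out.println(quote("\u2013\u2019\u201c\u201d"));
    }
}
```

One caveat that may be behind the current behavior: U+2028 and U+2029 are valid unescaped in JSON strings, but before ES2019 they were not allowed unescaped in JavaScript string literals. An escaper whose output is embedded directly in script might still want to escape those two characters.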

GoogleCodeExporter commented 9 years ago
For what it's worth, I suffer from the same problem. 

When the JSON library escapes a character, it lengthens the string.
And in Push Notification, you are limited by the size of the payload (256 bytes 
total), which isn't much. 

So characters in the range \u2000 - \u206F are represented with 6 bytes and not 
2.

Original comment by meirda...@gmail.com on 31 Jul 2013 at 8:50
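
To put numbers on the size concern, here is a rough sketch that compares encoded lengths, assuming the payload is sent as UTF-8. The class name `EscapeSizeDemo` and the method `utf8Length` are made up for this example. In UTF-8 each character in this range takes 3 bytes (it would be 2 in UTF-16), while the escaped form always takes 6 bytes.

```java
import java.nio.charset.StandardCharsets;

public class EscapeSizeDemo {

    // Number of bytes the string occupies when encoded as UTF-8.
    public static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        // The four punctuation characters from the original report.
        String raw = "\u2013\u2019\u201c\u201d";
        // The same characters written as literal backslash-u escape sequences.
        String escaped = "\\u2013\\u2019\\u201C\\u201D";

        System.out.println("raw bytes:     " + utf8Length(raw));
        System.out.println("escaped bytes: " + utf8Length(escaped));
    }
}
```

Under that UTF-8 assumption, the escaped form is twice the size of the raw form for characters in this block, which adds up quickly against a fixed payload limit.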