Closed 10 years ago
The JSON spec (http://www.json.org/) does not allow unescaped control characters. (See the railroad diagram for strings and the grammar on the right.) If json.dumps is called with ensure_ascii=False, it fails to escape control codes in the range U+007F to U+009F. Here's an example:
>>> import json
>>> import unicodedata
>>> for i in range(256):
...     jsonstring = json.dumps(chr(i), ensure_ascii=False)
...     if any(unicodedata.category(ch) == 'Cc' for ch in jsonstring):
...         print("Fail:", repr(chr(i)))
Fail: '\x7f'
Fail: '\x80'
Fail: '\x81'
Fail: '\x82'
Fail: '\x83'
Fail: '\x84'
Fail: '\x85'
Fail: '\x86'
Fail: '\x87'
Fail: '\x88'
Fail: '\x89'
Fail: '\x8a'
Fail: '\x8b'
Fail: '\x8c'
Fail: '\x8d'
Fail: '\x8e'
Fail: '\x8f'
Fail: '\x90'
Fail: '\x91'
Fail: '\x92'
Fail: '\x93'
Fail: '\x94'
Fail: '\x95'
Fail: '\x96'
Fail: '\x97'
Fail: '\x98'
Fail: '\x99'
Fail: '\x9a'
Fail: '\x9b'
Fail: '\x9c'
Fail: '\x9d'
Fail: '\x9e'
Fail: '\x9f'
json.dumps works correctly in this case.
Both the application/json RFC [1] and the ECMA JSON standard [2] say:
All characters may be placed within the quotation marks, except for the characters that must be escaped: quotation mark (U+0022), reverse solidus (U+005C), and the control characters (U+0000 through U+001F).
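That is, only a subset of the control characters (U+0000 through U+001F) must be escaped in a JSON string. The characters reported above (U+007F through U+009F) fall outside that range, so leaving them unescaped is valid.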
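A quick way to check this against the quoted rule is sketched below. It is only an illustration: the helper name `is_escaped` and the variable names are made up for this example, and it assumes a Python 3 interpreter with the standard `json` module.

```python
import json

# Characters the quoted rule says must be escaped inside a JSON string:
# U+0000 through U+001F, the quotation mark, and the reverse solidus.
MUST_ESCAPE = [chr(i) for i in range(0x20)] + ['"', '\\']

def is_escaped(ch):
    """Hypothetical helper: True if json.dumps does not emit ch verbatim."""
    encoded = json.dumps(ch, ensure_ascii=False)
    body = encoded[1:-1]  # strip the surrounding quotation marks
    return body != ch

# Required escapes that json.dumps failed to apply (expected to be empty).
unescaped_required = [ch for ch in MUST_ESCAPE if not is_escaped(ch)]

# U+007F through U+009F are written raw, which the quoted rule permits.
raw_high_controls = [chr(i) for i in range(0x7F, 0xA0) if not is_escaped(chr(i))]

# The raw output is still accepted by the parser and round-trips unchanged.
round_trip_ok = all(
    json.loads(json.dumps(chr(i), ensure_ascii=False)) == chr(i)
    for i in range(0x7F, 0xA0)
)

# With the default ensure_ascii=True, every non-ASCII-printable character
# is escaped, including U+007F.
default_escaped = json.dumps('\x7f')

print(unescaped_required, len(raw_high_controls), round_trip_ok, default_escaped)
```

If escaped output is wanted for these characters anyway, leaving `ensure_ascii` at its default of `True` produces it, at the cost of also escaping all other non-ASCII text.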
Ah, sorry for the confusion.
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
GitHub fields:
```python
assignee = None
closed_at =
created_at =
labels = ['invalid', 'type-bug', 'library']
title = "json.dumps with ensure_ascii=False doesn't escape control characters"
updated_at =
user = 'https://bugs.python.org/weeble'
```
bugs.python.org fields:
```python
activity =
actor = 'ned.deily'
assignee = 'none'
closed = True
closed_date =
closer = 'ned.deily'
components = ['Library (Lib)']
creation =
creator = 'weeble'
dependencies = []
files = []
hgrepos = []
issue_num = 21194
keywords = []
message_count = 3.0
messages = ['215868', '215898', '215923']
nosy_count = 5.0
nosy_names = ['rhettinger', 'pitrou', 'ezio.melotti', 'weeble', 'akira']
pr_nums = []
priority = 'normal'
resolution = 'not a bug'
stage = 'resolved'
status = 'closed'
superseder = None
type = 'behavior'
url = 'https://bugs.python.org/issue21194'
versions = ['Python 3.4']
```