Open 64fe4661-1a60-4325-9464-32eca8026b44 opened 11 years ago
According to documentation of json.dump(), concerning its 'default' option:
default(obj) is a function that should return a serializable version of obj or raise TypeError. The default simply raises TypeError.
But this function is actually never applied to serialized dictionary keys:
>>> def default(obj):
... if isinstance(obj, bytes):
... return obj.decode('ascii')
... raise ValueError
...
>>> json.dumps(b'asdf')
Traceback (most recent call last):
...
TypeError: b'asdf' is not JSON serializable
>>> json.dumps(b'asdf', default=default)
'"asdf"'
>>> json.dumps({b'asdf' : 1}, default=default)
Traceback (most recent call last):
...
TypeError: keys must be a string
>>> json.dumps({1 : b'asdf'}, default=default)
'{"1": "asdf"}'
(bytes are used purely for the purpose of example) Such behavior should be either documented or corrected. Patch correcting python implementation of json attached.
Oops, my patch disables 'skipkeys' argument of dump. Another version attached.
Are there already tests that cover the changes in your patch? If not, could you add them?
Proposed tests attached.
Will this be merged? I also believe it is an unexpected behavior to not serialize dictionary keys when the default option is used.
See also
https://github.com/simplejson/simplejson/issues/42 https://github.com/simplejson/simplejson/issues/100 https://github.com/simplejson/simplejson/issues/103
And allowing non-string keys leads to other issue:
https://github.com/simplejson/simplejson/issues/77
Thus there is an argument for disallowing serializing non-string keys.
Regarding the issues mentioned in https://github.com/simplejson/simplejson/issues/77, they already apply with the current implementation anyway (true is serialized as 'true'), so users must already be careful.
The JSONEncoder with default parameters could definitely keep rejecting complex keys, but an optional 'encode_complexkeys=False' parameter could be added to \_init__ to provide this functionality. There would be zero performance impact as the parameter check would only be done if all else has failed instead of raising the TypeError (the same way _skipkeys does).
In that respect, the patch of this issue would need to be amended (and the C version would also need to be changed).
I am trying to fork the json core module to achieve this (to not have to wait for the next Python release, assuming it gets implemented), if you are familiar with the packaging process of core modules into standalone modules, your advice would be much appreciated (and could well help others contribute to Python): http://stackoverflow.com/questions/36498527/how-to-create-a-customized-version-of-a-core-python-module-that-includes-c-code
Reproduced on 3.11.
Is still the view of the developers the one described in https://github.com/simplejson/simplejson/issues/100? (That is, that non-strings keys in dictionaries should be disavowed for ambiguity concerns).
Would a patch to document this exception in the documentation would be welcomed?
I understand the concerns regarding default behavior. But the restriction of string-only keys is on the JSON side, whereas complex object keys are perfectly valid in Python dictionaries. That is the very reason why we would need an encoder.
Probably the encoder should not automatically try to stringify everything it encounters (as the example of 1 and "1"), but if a user explicitly provides a specific custom type in default
, the intention seems quite clear and we should assume the user is aware of and responsible for the consequences?
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields: ```python assignee = None closed_at = None created_at =
labels = ['type-bug', 'library', '3.9', '3.10', '3.11']
title = "json.dump() ignores its 'default' option when serializing dictionary keys"
updated_at =
user = 'https://github.com/tifv'
```
bugs.python.org fields:
```python
activity =
actor = 'iritkatriel'
assignee = 'none'
closed = False
closed_date = None
closer = None
components = ['Library (Lib)']
creation =
creator = 'july'
dependencies = []
files = ['31437', '31450']
hgrepos = []
issue_num = 18820
keywords = ['patch']
message_count = 8.0
messages = ['195957', '195958', '196022', '196036', '263013', '263019', '263025', '415093']
nosy_count = 11.0
nosy_names = ['rhettinger', 'bob.ippolito', 'pitrou', 'ezio.melotti', 'cvrebert', 'july', 'docs@python', 'berker.peksag', 'serhiy.storchaka', 'Erwin Mayer', 'iritkatriel']
pr_nums = []
priority = 'normal'
resolution = None
stage = 'patch review'
status = 'open'
superseder = None
type = 'behavior'
url = 'https://bugs.python.org/issue18820'
versions = ['Python 3.9', 'Python 3.10', 'Python 3.11']
```
Linked PRs