python / cpython

The Python programming language
https://www.python.org

MappingProxy objects should JSON serialize just like a dictionary #79039

Open ca4c21ce-6a55-4640-b321-e45d865e7301 opened 6 years ago

ca4c21ce-6a55-4640-b321-e45d865e7301 commented 6 years ago
BPO 34858
Nosy @rhettinger, @etrepum, @merwok, @serhiy-storchaka, @nathanielmanistaatgoogle

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


GitHub fields:

```python
assignee = 'https://github.com/etrepum'
closed_at = None
created_at =
labels = ['3.8', 'type-bug', 'library']
title = 'MappingProxy objects should JSON serialize just like a dictionary'
updated_at =
user = 'https://bugs.python.org/MichaelSmith2'
```

bugs.python.org fields:

```python
activity =
actor = 'eric.araujo'
assignee = 'bob.ippolito'
closed = False
closed_date = None
closer = None
components = ['Library (Lib)']
creation =
creator = 'Michael Smith2'
dependencies = []
files = []
hgrepos = []
issue_num = 34858
keywords = []
message_count = 10.0
messages = ['326756', '326761', '326845', '326849', '343617', '370826', '370828', '370830', '370844', '379041']
pr_nums = []
priority = 'normal'
resolution = None
stage = None
status = 'open'
superseder = None
type = 'behavior'
url = 'https://bugs.python.org/issue34858'
versions = ['Python 3.8']
nosy_count = 7.0
nosy_names = ['rhettinger', 'bob.ippolito', 'eric.araujo', 'serhiy.storchaka', 'Nathaniel Manista', 'Michael Smith2', 'zgoda.mobica']
```

ca4c21ce-6a55-4640-b321-e45d865e7301 commented 6 years ago

If a mappingproxy object is a read-only proxy to a mapping, would it make sense for them to JSON serialize just like the mapping they come from? Currently, json.dumps throws the "I don't know how to serialize this" error:

$ python -c 'import json
> import types
> json.dumps(types.MappingProxyType({}))'
Traceback (most recent call last):
  File "<string>", line 3, in <module>
  File "/usr/lib64/python3.6/json/__init__.py", line 231, in dumps
    return _default_encoder.encode(obj)
  File "/usr/lib64/python3.6/json/encoder.py", line 199, in encode
    chunks = self.iterencode(o, _one_shot=True)
  File "/usr/lib64/python3.6/json/encoder.py", line 257, in iterencode
    return _iterencode(o, 0)
  File "/usr/lib64/python3.6/json/encoder.py", line 180, in default
    o.__class__.__name__)
TypeError: Object of type 'mappingproxy' is not JSON serializable
rhettinger commented 6 years ago

IIRC, there was an early decision to handle only exact types and their subclasses (plus tuples). For example, os.environ and instances of collections.ChainMap are not JSON serializable.

I believe the reason was that it made encoding faster, more predictable, and more likely to match round-trip expectations. For those wanting more generality, there are at least two options. The simplest option is to coerce the input to a supported type. A more complete solution is to write a subclass of JSONEncoder to pass into json.dump() as the *cls* argument (there are examples of how to do this in the docs).
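A minimal sketch of those two routes, assuming nothing beyond the stdlib (the names below are illustrative): coerce the proxy to a plain dict up front, or hand json.dumps() a fallback via its *default* argument (the JSONEncoder subclass variant appears in a later comment).

```python
import json
import types

proxy = types.MappingProxyType({"answer": 42})

# Option 1: coerce the proxy to a supported type before encoding.
print(json.dumps(dict(proxy)))                     # {"answer": 42}

# Option 2: give the encoder a fallback for types it does not know.
def encode_unknown(obj):
    if isinstance(obj, types.MappingProxyType):
        return dict(obj)
    raise TypeError(f"Object of type {type(obj).__name__} "
                    f"is not JSON serializable")

print(json.dumps({"nested": proxy}, default=encode_unknown))
# {"nested": {"answer": 42}}
```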

For the specific case of mappingproxy, there is another issue. Multiple components of a class dict are not all JSON serializable, so you have the same problem yet again with getset_descriptor objects, member objects, and various slot wrappers.
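A quick illustration of that second issue, using a throwaway class (not from the report): coercing the class namespace to a plain dict does not help, because the descriptor values inside it still refuse to encode.

```python
import json

class Widget:
    __slots__ = ("size",)

# vars(Widget) is a mappingproxy; copying it only fixes the outer layer.
namespace = dict(vars(Widget))
print(type(namespace["size"]))   # <class 'member_descriptor'>

try:
    json.dumps(namespace)
except TypeError as exc:
    print(exc)                   # Object of type member_descriptor is not JSON serializable
```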

ca4c21ce-6a55-4640-b321-e45d865e7301 commented 6 years ago

OK, I appreciate the response. The MappingProxy objects I am working with are in a nested data structure, so accessing them to coerce them directly isn't feasible. I created

from json import JSONEncoder
from types import MappingProxyType

class MappingProxyEncoder(JSONEncoder):
    def default(self, obj):
        # Unwrap the read-only proxy into a plain (shallow-copied) dict.
        if isinstance(obj, MappingProxyType):
            return obj.copy()
        return JSONEncoder.default(self, obj)

and using that works fine for json.dumps (it passes all my tests, at least). But I suspect that obj.copy() is more expensive than the code I'm replacing, which is a local implementation of ImmutableDict that I was trying to do away with after a read-through of PEP 416. I wonder if there's a cheaper way to get at the underlying dictionary.
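For what it's worth, a small usage sketch of the encoder above (the nested `data` value here is made up). As far as I know there is no public way to reach the wrapped dict without copying: both obj.copy() and dict(obj) make one shallow copy, so their cost should be comparable.

```python
import json
from types import MappingProxyType

# Hypothetical nested structure containing a mappingproxy.
data = {"config": MappingProxyType({"retries": 3, "timeout": 1.5})}

# MappingProxyEncoder is the subclass defined in the comment above.
print(json.dumps(data, cls=MappingProxyEncoder))
# {"config": {"retries": 3, "timeout": 1.5}}
```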

I suspect I'm not the only python user who foolishly expected MappingProxy to be a drop-in replacement for local ImmutableDict implementations. Maybe this ticket will be useful for others to read.

serhiy-storchaka commented 6 years ago

Using the default method or argument is a proper way of serializing custom types to JSON. There are a number of issues about adding support for serializing different types by default. I am working on a patch that will allow making different types serializable in a more convenient way.

b197a89f-a0be-482b-857a-1e1f9aedc70f commented 5 years ago

In the particular case of lazy proxy objects, the result of dump() may differ from that of dumps(), because dump() runs iterencode() over the data and can trigger evaluation of the proxy object. In such a case dumps() will raise TypeError but dump() will not.

d05aef1e-abc2-4a6e-b042-f2fd671dd9f2 commented 4 years ago

There's a great deal of additional discussion about this over at https://discuss.python.org/t/json-does-not-support-mapping-and-mutablemapping/2829, but I would like to heartily "hear hear" this issue for the way it is weird that

json.dumps(frozendict.frozendict({'3':5}))

does not evaluate to the string '{"3": 5}', and it's something that makes writing according to and teaching others to code according to the make-your-code-safer-by-using-immutability-in-all-possible-things principle.

d05aef1e-abc2-4a6e-b042-f2fd671dd9f2 commented 4 years ago

Err, "... it's something that complicates writing code according to...", sorry.

rhettinger commented 4 years ago

+1 for reconsidering whether the json module should support mappings that don't inherit from dict. The decision not to support general mappings is ancient and predates a lot of modern tooling.

The only worry is that the objects won't round-trip — you could store a ChainMap but you wouldn't get a ChainMap back :-(
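A small sketch of that concern with made-up data: even if the encoder flattened a ChainMap for you, decoding would hand back a plain dict.

```python
import json
from collections import ChainMap

cm = ChainMap({"a": 1}, {"b": 2})

# Encode by flattening to a plain dict (the ChainMap itself is rejected today).
text = json.dumps(dict(cm))
restored = json.loads(text)

print(type(restored))         # <class 'dict'> -- the ChainMap type is lost
print(restored == dict(cm))   # True -- only the flattened data survives
```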

8bb2fe13-d245-4fef-8aa2-70d7108db3c3 commented 4 years ago

I would certainly reconsider it at this point; I think a bona fide ABC specific to JSON encoding would be a good way to do it. simplejson has two ways to do this: the for_json parameter, which will use a for_json() method on any object as its JSON representation if present, and the namedtuple_as_object parameter, which will do the same for objects with an _asdict() method. As part of the standard library it might make more sense to rename for_json() to __json__() or similar.

It is a bit unfortunate that you can't guarantee round-trip on deserialization, but that has always been the case. To get round-tripping (without tagging everything that has been encoded in some way like pickle does), you really should be working with a schema of some kind.
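As a rough sketch of what a for_json()/__json__()-style hook could look like on top of today's stdlib (the method name and the FrozenConfig class below are hypothetical, not an agreed protocol), a *default* callable can dispatch on the presence of such a method:

```python
import json
from types import MappingProxyType

class FrozenConfig:
    """Example immutable wrapper that opts in to JSON encoding."""
    def __init__(self, **kwargs):
        self._data = MappingProxyType(dict(kwargs))

    def for_json(self):
        # Hypothetical hook, mirroring simplejson's for_json=True behaviour.
        return dict(self._data)

def default(obj):
    hook = getattr(obj, "for_json", None)
    if callable(hook):
        return hook()
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")

print(json.dumps({"cfg": FrozenConfig(debug=True)}, default=default))
# {"cfg": {"debug": true}}
```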

merwok commented 4 years ago

See also bpo-30343