lidatong / dataclasses-json

Easily serialize Data Classes to and from JSON
MIT License

[BUG] Generic Collections fields - not respecting `default_factory` `field` specifications #505

Open cagantomer opened 6 months ago

cagantomer commented 6 months ago

Description

At times, a dataclass field may be type-annotated with an abstract generic type (e.g. MutableMapping) and defined with a default_factory. In such a case, when deserializing, the code tries to create an instance of the annotated type. An abstract type cannot be instantiated, so this raises an error.

Code snippet that reproduces the issue

from typing import MutableMapping

from dataclasses import dataclass, field
from dataclasses_json import DataClassJsonMixin

@dataclass
class MyClass(DataClassJsonMixin):
    field1: MutableMapping[str, str] = field(default_factory=dict)

if __name__ == '__main__':
    c = MyClass()
    data = c.to_dict()
    MyClass.from_dict(data)

This will produce the following error:

Traceback (most recent call last):
  File "/Users/tomercagan/dev/gen2projection/./bin/error_json.py", line 14, in <module>
    MyClass.from_dict(data)
  File "/Users/tomercagan/dev/venvs/json-bug/lib/python3.10/site-packages/dataclasses_json/api.py", line 70, in from_dict
    return _decode_dataclass(cls, kvs, infer_missing)
  File "/Users/tomercagan/dev/venvs/json-bug/lib/python3.10/site-packages/dataclasses_json/core.py", line 220, in _decode_dataclass
    init_kwargs[field.name] = _decode_generic(field_type,
  File "/Users/tomercagan/dev/venvs/json-bug/lib/python3.10/site-packages/dataclasses_json/core.py", line 300, in _decode_generic
    res = materialize_type(xs)
TypeError: MutableMapping() takes no arguments
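The same TypeError seems reproducible with the standard library alone, which suggests the root cause is simply that the annotation's origin type is abstract. The snippet below is a minimal sketch; it only mimics what materialize_type appears to do judging by the traceback, and the variable names are illustrative rather than taken from dataclasses-json:

```python
from typing import MutableMapping, get_origin

annotation = MutableMapping[str, str]

# get_origin strips the type parameters and returns the runtime class,
# which here is the abstract collections.abc.MutableMapping.
origin = get_origin(annotation)

# Calling the abstract class with the decoded data fails with a TypeError.
try:
    origin({"a": "b"})
except TypeError as exc:
    error = exc
else:
    error = None

# A concrete mapping type accepts the same data without complaint.
concrete = dict({"a": "b"})
```

Any concrete type that satisfies the annotation (dict in this case) would work, which is why the default_factory looks like a natural hint for the decoder.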

P.S. There is a workaround: define a decoder that returns the expected concrete type (i.e. dict above), so the library never tries to instantiate the annotated type. It works, but having to add it to every such field is kind of a drag:

from typing import MutableMapping

from dataclasses import dataclass, field
from dataclasses_json import DataClassJsonMixin, config

@dataclass
class MyClass(DataClassJsonMixin):
    field1: MutableMapping[str, str] = field(
        default_factory=dict,
        # Identity decoder: the underlying value is already a dict, so return it unchanged.
        metadata=config(decoder=lambda x: x),
    )

if __name__ == '__main__':
    c = MyClass()
    data = c.to_dict()
    MyClass.from_dict(data)

Describe the results you expected

Ideally, deserialization should create the instance using the specified default_factory instead of trying to instantiate the generic type from the annotation.

From some playing around with the code, it seems that core.py:220 (inside _decode_dataclass) is a place where the code could check whether the field defines a default_factory and either pass it into _decode_generic or call an alternative function that respects the default_factory.
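To make the idea more concrete, here is a rough standalone sketch of the selection logic, using only the standard library. The helper name pick_constructor is hypothetical and does not exist in dataclasses-json; it also assumes that field annotations are real types rather than strings (i.e. no `from __future__ import annotations`), and that the type returned by the default_factory can be constructed directly from the decoded data:

```python
import inspect
from dataclasses import MISSING, dataclass, field, fields
from typing import MutableMapping, get_origin


def pick_constructor(dc_field):
    """Hypothetical helper: choose a callable that builds the decoded value."""
    # Strip type parameters, e.g. MutableMapping[str, str] -> MutableMapping.
    origin = get_origin(dc_field.type) or dc_field.type
    # A concrete class (dict, list, ...) can be instantiated directly.
    if inspect.isclass(origin) and not inspect.isabstract(origin):
        return origin
    # Otherwise fall back to the type produced by the field's default_factory.
    if dc_field.default_factory is not MISSING:
        return type(dc_field.default_factory())
    raise TypeError(f"cannot instantiate abstract type {origin!r}")


@dataclass
class MyClass:
    field1: MutableMapping[str, str] = field(default_factory=dict)


(field1,) = fields(MyClass)
constructor = pick_constructor(field1)  # falls back to dict
rebuilt = constructor({"a": "b"})
```

One limitation of this sketch: a factory such as `lambda: defaultdict(list)` yields a type whose constructor does not accept the decoded data as its first argument, so a real fix would probably need to handle such cases separately.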

I am not familiar with the codebase, its standards, etc., but once an approach has been discussed and agreed on, I can possibly contribute a fix.

Python version you are using

Python 3.10.9

Environment description

dataclasses-json==0.6.3
marshmallow==3.20.1
mypy-extensions==1.0.0
packaging==23.2
typing-inspect==0.9.0
typing_extensions==4.9.0