lidatong / dataclasses-json

Easily serialize Data Classes to and from JSON
MIT License

Dataclasses containing variables that reference themselves in list/dict fail to re-create as original type #459

Closed NiroHaim closed 9 months ago

NiroHaim commented 11 months ago

Description

When a dataclass field is typed as a list/dict of the dataclass itself, converting an object with to_dict and back with from_dict leaves plain dicts in the self-typed field instead of instances of the class.

Code snippet that reproduces the issue

from dataclasses import dataclass
from dataclasses_json import dataclass_json

@dataclass_json
@dataclass
class SpecialLinkedList:
    val: ...
    nexts: list['SpecialLinkedList'] = None

my_list = SpecialLinkedList(val=1, nexts=[SpecialLinkedList(val=2)])

print(my_list == SpecialLinkedList.from_dict(my_list.to_dict()))  # False

Describe the results you expected

The code snippet above outputs False. The values are:

SpecialLinkedList.from_dict(my_list.to_dict()) == SpecialLinkedList(val=1, nexts={'1': {'val': 2, 'nexts': None}})
my_list == SpecialLinkedList(val=1, nexts={'1': SpecialLinkedList(val=2, nexts=None)})

Python version you are using

3.10

Environment description

clean project, only dataclass and dataclasses-json.

george-zubrienko commented 11 months ago

When I do

SpecialLinkedList.from_dict(my_list.to_dict())

/lib/python3.11/site-packages/dataclasses_json/core.py:184: RuntimeWarning: `NoneType` object value of non-optional type nexts detected when decoding SpecialLinkedList.
  warnings.warn(

SpecialLinkedList(val=1, nexts=[SpecialLinkedList(val=2, nexts=None)])

Which is correct? In your snippet, you are comparing class references, which will always output false since those are different instances.

NiroHaim commented 11 months ago

Which is correct? In your snippet, you are comparing class references, which will always output false since those are different instances.

Yeah, of course. In my example it's more of a pseudo-code comparison. If you look at the result value of SpecialLinkedList.from_dict(my_list.to_dict()), you'll see that the nexts attribute points to a dict instead of a SpecialLinkedList, as expected and as type hinted in the SpecialLinkedList class. That's the problem: the inner self reference is not being converted to the class itself; it remains a dict.
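As a side note on the comparison itself: `@dataclass` generates an `__eq__` by default that compares field values, so `==` between two separately constructed instances is not an identity check. A minimal stdlib-only sketch follows; the `Node` class is hypothetical and only stands in for the class from the report.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    # Hypothetical stand-in for SpecialLinkedList, stdlib only.
    val: int
    nexts: list = field(default_factory=list)


a = Node(1, [Node(2)])
b = Node(1, [Node(2)])                   # distinct object, same field values
c = Node(1, [{"val": 2, "nexts": []}])   # inner item left as a plain dict

print(a is b)  # False: different objects
print(a == b)  # True: the generated __eq__ compares fields
print(a == c)  # False: a Node never equals a dict
```

So a False result from the round-trip comparison points at a real difference in the field values, such as an inner dict where an instance was expected.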

NiroHaim commented 11 months ago

On a second look at your output, it seems the bug is not reproducing for you. Do you actually get nexts as a list of SpecialLinkedList objects? I just tried it again and the bug still reproduces for me.

EDIT: I just created another clean environment to test this out. For reference, I'm using Python 3.10. This is my pip3 freeze output:

dataclasses-json==0.5.14
marshmallow==3.20.1
mypy-extensions==1.0.0
packaging==23.1
typing-inspect==0.9.0
typing_extensions==4.7.1

The bug replicates

george-zubrienko commented 11 months ago

Interesting! I tested on 3.11. I will re-test using your env as described and circle back here.

NiroHaim commented 10 months ago

Interesting! I tested on 3.11. I will re-test using your env as described and circle back here.

Hey! 😄 any updates?

george-zubrienko commented 10 months ago

hi @NiroHaim Sorry we have a bit of backlog, but I have this on my list and will look into it hopefully this week, worst case next week :)

nirh-cye commented 9 months ago

Hey! Any news on this?

george-zubrienko commented 9 months ago

hi @NiroHaim, not yet, but it is on the todo list. Sorry to keep you waiting, but all team members have been a bit swamped for the past 2 months with both internal and OSS contributions, plus our ability to release to PyPI is severely impaired until GitHub fixes env protection in October. My current expectation is that I'll be able to identify the issue and send a PR, but the fix will see an actual release around October :(

george-zubrienko commented 9 months ago

So, confirmed: on 3.10 the behaviour is different:

import sys
from dataclasses import dataclass
from dataclasses_json import dataclass_json

@dataclass_json
@dataclass
class SpecialLinkedList:
    val: int
    nexts: list['SpecialLinkedList'] = None

my_list = SpecialLinkedList(val=1, nexts=[SpecialLinkedList(val=2)])

print(my_list)
# SpecialLinkedList(val=1, nexts=[SpecialLinkedList(val=2, nexts=None)])

print(sys.version_info)
# sys.version_info(major=3, minor=10, micro=12, releaselevel='final', serial=0)

print(SpecialLinkedList.from_dict(my_list.to_dict()))
# SpecialLinkedList(val=1, nexts=[{'val': 2, 'nexts': None}])

george-zubrienko commented 9 months ago

The issue is that in 3.10 the self-reference hint is a string, lol, which causes this method to fail:

def _decode_items(type_args, xs, infer_missing):
    """
    This is a tricky situation where we need to check both the annotated
    type info (which is usually a type from `typing`) and check the
    value's type directly using `type()`.

    If the type_arg is a generic we can use the annotated type, but if the
    type_arg is a typevar we need to extract the reified type information
    hence the check of `is_dataclass(vs)`
    """
    def _decode_item(type_arg, x):
        if is_dataclass(type_arg) or is_dataclass(xs):
            return _decode_dataclass(type_arg, x, infer_missing)
        if _is_supported_generic(type_arg):
            return _decode_generic(type_arg, x, infer_missing)
        return x

    if _isinstance_safe(type_args, Collection) and not _issubclass_safe(type_args, Enum):
        return list(_decode_item(type_arg, x) for type_arg, x in zip(type_args, xs))
    return list(_decode_item(type_args, x) for x in xs)

george-zubrienko commented 9 months ago

This is the reason: https://peps.python.org/pep-0673/ - in 3.11 they finally added a proper Self type
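To illustrate the string-hint point, here is a minimal stdlib-only sketch. It inspects the raw annotation as stored by `dataclasses`, which is not necessarily what the library sees after its own hint resolution (that part may vary by Python version). The helper `resolve_item_type` and its explicit `namespace` argument are hypothetical, not part of dataclasses-json and not the linked PR's fix.

```python
from dataclasses import dataclass, fields, is_dataclass
from typing import get_args


@dataclass
class SpecialLinkedList:
    val: int
    nexts: list['SpecialLinkedList'] = None


# Pull the single type argument out of list['SpecialLinkedList'].
nexts_type = next(f.type for f in fields(SpecialLinkedList) if f.name == "nexts")
(item_type,) = get_args(nexts_type)

print(repr(item_type))         # 'SpecialLinkedList' (a plain str)
print(is_dataclass(item_type)) # False, so a dataclass check on it is skipped


def resolve_item_type(type_arg, namespace):
    """Hypothetical helper: map a string type argument to the class it names."""
    if isinstance(type_arg, str):
        return namespace[type_arg]
    return type_arg


resolved = resolve_item_type(item_type, {"SpecialLinkedList": SpecialLinkedList})
print(is_dataclass(resolved))  # True once the string is resolved to the class
```

With an unresolved string as the type argument, a check like `is_dataclass(type_arg)` is False, which matches the inner items falling through to `return x` in the `_decode_item` function quoted earlier in the thread.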

george-zubrienko commented 9 months ago

Linked a PR to fix this, will finalize a bit later

george-zubrienko commented 9 months ago

@NiroHaim please take a look at the linked PR - should fix this issue