konradhalas / dacite

Simple creation of data classes from dictionaries.
MIT License

Type hooks applied to deserialized values when hook for union type is defined #229

Open Feuermurmel opened 1 year ago

Feuermurmel commented 1 year ago

Describe the bug

It seems that custom type hooks are applied to already-deserialized values when a hook for a union type is defined. I think this is a bug.

To Reproduce

The following is a contrived example demonstrating the behavior; the code base I'm observing this in is more complex. In the example, it would probably work to simply omit the type hook for Union[Foo, Bar], but I don't think that would work in the real code base.

I'm declaring two types Foo and Bar with custom type hooks to get the following mapping:

For the type Union[Foo, Bar], I'm declaring another custom type hook that checks for a leading # to decide which type to return.

from typing import Union
from dataclasses import dataclass
import dacite

@dataclass
class Foo:
    id: int

@dataclass
class Bar:
    name: str

@dataclass
class X:
    foo: Foo
    bar: Bar
    foo_or_bar: Union[Foo, Bar]

def _read_foo(data):
    print(f'_read_foo(): {data}')

    return Foo(int(data[1:]))

def _read_bar(data):
    print(f'_read_bar(): {data}')

    return Bar(data)

def _read_foo_or_bar(data):
    print(f'_read_foo_or_bar(): {data}')

    if data and data[0] == '#':
        return Foo(int(data[1:]))
    else:
        return Bar(data)

_dacite_type_hooks = {
    Foo: _read_foo,
    Bar: _read_bar,
    Union[Foo, Bar]: _read_foo_or_bar}

dacite_config = dacite.Config(type_hooks=_dacite_type_hooks)

data = {
    'foo': '#123',
    'bar': 'hello',
    'foo_or_bar': '#123'
}

print(dacite.from_dict(X, data, dacite_config))

When running this, I get the following output:

_read_foo(): #123
_read_bar(): hello
_read_foo_or_bar(): #123
_read_foo(): Foo(id=123)
_read_bar(): Foo(id=123)
X(foo=Foo(id=123), bar=Bar(name='hello'), foo_or_bar=Bar(name=Foo(id=123)))

As you can see, after _read_foo_or_bar() has deserialized the value, the result is passed to both _read_foo() and _read_bar(). This leads to the result Bar(name=Foo(id=123)), which doesn't match the declared types and is surprising. This looks like a bug to me.

Expected behavior

IMHO, after the type hook _read_foo_or_bar() has been applied, the value should be used as-is and not processed further. That is, I would expect the following output:

_read_foo(): #123
_read_bar(): hello
_read_foo_or_bar(): #123
X(foo=Foo(id=123), bar=Bar(name='hello'), foo_or_bar=Foo(id=123))

Environment

$ pip list
Package    Version
---------- -------
dacite     1.8.0
pip        23.0.1
setuptools 67.3.2
$ python -V
Python 3.8.16