konradhalas / dacite

Simple creation of data classes from dictionaries.
MIT License

Feature proposal: add support for explicit union selection #105

Open eddawley opened 4 years ago

eddawley commented 4 years ago

We use dacite pretty heavily for some very complex dataclass hierarchies, and the one pain point we keep running into is the potentially nondeterministic Union selection. The naive approach is very good for what it is, but it fundamentally cannot work when multiple options have the same interface. For the corresponding marshmallow schemas we use marshmallow-polyfield to explicitly define polymorphism based on another attribute, so I'm hoping we could add a similar feature in dacite.

Proposed API:

Add a new explicit_unions option to dacite.Config: a Dict[str, Callable] that maps an attribute name to a callable which receives the raw data and returns the class to use for that attribute. This will allow for explicit union matching without any modification to your dataclasses.

For larger hierarchies, setting config options at lower levels would be much more usable. Since dataclasses specifically ignore class variables, add a class variable holding a custom dacite config for that class. Dacite would override any parent config with the current class's config.
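A quick stdlib-only check of the claim that dataclasses ignore class variables, which is what would make a per-class config attribute workable. The attribute name `dacite_config` is the one proposed here; the plain dict standing in for a config object is an assumption for illustration, and nothing in this sketch is existing dacite behaviour.

```python
from dataclasses import dataclass, fields
from typing import Any, ClassVar, Dict


@dataclass
class Foo:
    a: int
    a_type: str
    # Annotated with ClassVar, so the dataclass machinery skips it:
    # it is neither a field nor an __init__ parameter.
    # The dict is a hypothetical stand-in for a per-class config object.
    dacite_config: ClassVar[Dict[str, Any]] = {"explicit_unions": {}}


field_names = [f.name for f in fields(Foo)]
foo = Foo(a=1, a_type="a")
print(field_names)        # ['a', 'a_type']
print(Foo.dacite_config)  # still reachable on the class
```

Note that the annotation has to be `ClassVar[...]` for this to hold; a plain annotation would turn the attribute into a regular field.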

Thus, explicit union handling would look like this:

@dataclass
class Foo:
  a: Union[A,B]
  a_type: str
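  # explicit_unions is the proposed option, not an existing dacite.Config parameter.
  # ClassVar keeps dacite_config out of the dataclass fields; a lambda body is an expression, so no "return".
  dacite_config: ClassVar[dacite.Config] = dacite.Config(explicit_unions={"a": lambda data: A if data["a_type"] == "a" else B})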
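To make the intended semantics concrete, here is a minimal stdlib-only sketch of resolver-based selection. The `build` helper and the classes `A` and `B` (given identical fields on purpose) are hypothetical; this is not how dacite is implemented, only one way the proposed `explicit_unions` mapping could be interpreted.

```python
from dataclasses import dataclass, fields
from typing import Any, Callable, Dict, Union


@dataclass
class A:
    value: int


@dataclass
class B:
    value: int


@dataclass
class Foo:
    a: Union[A, B]
    a_type: str


def build(cls: type, data: Dict[str, Any],
          explicit_unions: Dict[str, Callable[[Dict[str, Any]], type]]) -> Any:
    """Hypothetical helper: build `cls` from `data`, asking a resolver
    which class to use for each field listed in `explicit_unions`."""
    kwargs = {}
    for f in fields(cls):
        value = data[f.name]
        resolver = explicit_unions.get(f.name)
        if resolver is not None:
            # The resolver sees the parent's raw data and returns the target class.
            target = resolver(data)
            value = target(**value)
        kwargs[f.name] = value
    return cls(**kwargs)


resolvers = {"a": lambda data: A if data["a_type"] == "a" else B}

first = build(Foo, {"a": {"value": 1}, "a_type": "a"}, resolvers)
second = build(Foo, {"a": {"value": 1}, "a_type": "b"}, resolvers)
print(first)   # Foo(a=A(value=1), a_type='a')
print(second)  # Foo(a=B(value=1), a_type='b')
```

Because the resolver is keyed by attribute name and receives the parent's data, the choice between `A` and `B` no longer depends on the order of the Union members.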

I wanted to open a discussion here before opening a PR as the API could be very opinionated. Thank you for a great project.

konradhalas commented 4 years ago

Hi @eddawley - thank you for sharing this very interesting idea.

So as I understand you are proposing 2 new features:

1. explicit_unions
2. dataclass-level configuration

Let's start with the first one. Is it possible to use Literal in your case? You can use it in the following way:

from dataclasses import dataclass
from typing import Literal, Union

import dacite

@dataclass
class A:
    t: Literal["a"]

@dataclass
class B:
    t: Literal["b"]

@dataclass
class C:
    u: Union[A, B]

print(dacite.from_dict(C, {"u": {"t": "a"}}))  # C(u=A(t='a'))
print(dacite.from_dict(C, {"u": {"t": "b"}}))  # C(u=B(t='b'))

eddawley commented 4 years ago

You are correct that this is actually 2 new features. I realized that after I submitted. Sorry for any confusion.

As for using Literal, that only works if the relationship is defined on an attribute of the child. With marshmallow-polyfield you define the relationship on one or more attributes of the parent.

Here's a simple example explaining the difference:

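from dataclasses import dataclass
from typing import Literal, Optional, Union

@dataclass
class Cat:
  name: str
  greeting: Optional[str] = None

@dataclass
class Dog:
  name: str
  greeting: Optional[str] = None

@dataclass
class Person:
  pet_type: Literal["cat", "dog"]  # discriminator lives on the parent
  pet: Union[Cat, Dog]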

vs

@dataclass
class Cat:
  name: str
  pet_type: Literal["cat"]  # discriminator lives on the child
  greeting: Optional[str] = None

@dataclass
class Dog:
  name: str
  pet_type: Literal["dog"]  # discriminator lives on the child
  greeting: Optional[str] = None

@dataclass
class Person:
  pet: Union[Cat, Dog]

When no greeting is supplied, dacite has no way to get an instantiation error in the former, because Cat and Dog accept exactly the same data. Thus it will accept the first option, Cat, every time.
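The effect can be reproduced without dacite. The `first_match` function below is a deliberately simplified stand-in for first-match union selection (an assumption for illustration, not dacite's actual code, which also checks value types): it returns the first option that can be instantiated from the data.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional, Sequence


@dataclass
class Cat:
    name: str
    greeting: Optional[str] = None


@dataclass
class Dog:
    name: str
    greeting: Optional[str] = None


def first_match(options: Sequence[type], data: Dict[str, Any]) -> Any:
    """Simplified stand-in for first-match union selection:
    return the first option that can be instantiated from `data`."""
    for option in options:
        try:
            return option(**data)
        except TypeError:
            continue  # unexpected or missing keys: try the next option
    raise ValueError("no option matched")


person_data = {"pet_type": "dog", "pet": {"name": "Rex"}}
pet = first_match([Cat, Dog], person_data["pet"])
print(type(pet).__name__)  # Cat, even though pet_type says "dog"
```

Nothing in the pet's own data distinguishes the two classes, so the parent's `pet_type` is the only place the intended choice is recorded.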

The latter might work for some cases, but things like SQLAlchemy polymorphic mappings require the former.