Open albertino87 opened 1 year ago
You could consider the following approach:
from dataclasses import astuple, dataclass, fields

from cattrs import Converter

c = Converter()


def get_converter() -> Converter:
    return c


@dataclass
class MyDataClass:
    a: str
    b: int

    def __post_init__(self):
        self.validate_types()

    def validate_types(self) -> None:
        c.structure(astuple(self), tuple[tuple(f.type for f in fields(self.__class__))])

    @classmethod
    def from_dict(
        cls, data: dict, converter: Converter = get_converter()
    ) -> "MyDataClass":
        return converter.structure(data, cls)


MyDataClass(a="a", b="a")  # Will raise since `b` isn't an int
The idea is you run the structuring path, not for your class but for a tuple[str, int].
Note that this approach will be somewhat slow, especially if you use MyDataClass.from_dict (it will actually validate twice, once when structuring in from_dict and once in validate_types). Maybe that's ok for your use case!
I guess this could also work :), thanks!
Unfortunately this method doesn't work when you have such dataclasses nested one inside the other all with:
@dataclass
class GenericDataClass:
    def __post_init__(self):
        self.validate_types()

    def validate_types(self) -> None:
        c.structure(astuple(self), tuple[tuple(f.type for f in fields(self.__class__))])
I guess my example from before was too simplistic :(
Your GenericDataClass has no fields, I assume it should have a field of type DataClass?
This is getting a little tricky. We can set up a separate converter with unstruct_strat=UnstructureStrategy.AS_TUPLE (that means it'll un/structure classes to tuples instead of dictionaries) and use that for this validation.
from dataclasses import astuple, dataclass, fields

from cattrs import Converter, UnstructureStrategy

c = Converter()
tuple_c = Converter(unstruct_strat=UnstructureStrategy.AS_TUPLE)


def get_converter() -> Converter:
    return c


@dataclass
class MyDataClass:
    a: str
    b: int

    def __post_init__(self):
        self.validate_types()

    def validate_types(self) -> None:
        tuple_c.structure(
            astuple(self), tuple[tuple(f.type for f in fields(self.__class__))]
        )

    @classmethod
    def from_dict(
        cls, data: dict, converter: Converter = get_converter()
    ) -> "MyDataClass":
        return converter.structure(data, cls)


@dataclass
class GenericDataClass:
    a: MyDataClass

    def __post_init__(self):
        self.validate_types()

    def validate_types(self) -> None:
        tuple_c.structure(
            astuple(self), tuple[tuple(f.type for f in fields(self.__class__))]
        )


GenericDataClass(MyDataClass("a", 1))
OK, sorry for the incomplete example. The full one would be something like this (with whatever works in validate_types):
from abc import ABC
from dataclasses import dataclass, astuple, fields

from cattrs import Converter

DEFAULT_STRUCTURE_HOOKS = {
}


def get_converter(
    custom_hooks: StructureHookMap | None = None
) -> Converter:
    if custom_hooks is None:
        custom_hooks = {}
    structure_hooks = DEFAULT_STRUCTURE_HOOKS | custom_hooks
    converter = Converter(forbid_extra_keys=True)
    for hook_type, hook in structure_hooks.items():
        converter.register_structure_hook(hook_type, hook)
    return converter


class Entity(ABC):
    def __init_subclass__(cls, frozen=True):
        return dataclass(frozen=frozen)(cls)

    def __post_init__(self):
        self.validate_types()

    def validate_types(self, converter: Converter = get_converter()) -> None:
        converter.structure(astuple(self), tuple[tuple(f.type for f in fields(self.__class__))])

    @classmethod
    def from_dict(
        cls, data: dict, converter: Converter = get_converter()
    ):
        return converter.structure(data, cls)


class Child(Entity):
    beta: float
    population_name: str


class Parent(Entity):
    failure_mode_name: str
    project: str
    populations: list[Child]
    id: str


valid_entity = Parent(
    id="123456789",
    failure_mode_name="failure mode A",
    project="Myproj",
    populations=[
        Child(beta=1.0, population_name="pop A"),
        Child(beta=2.0, population_name="pop B"),
    ],
)
I know that with from_dict it will validate the types twice, but for now that's ok.
Your proposal seems to work if I change get_converter to:
def get_converter(
    custom_hooks: StructureHookMap | None = None
) -> tuple[Converter, Converter]:
    if custom_hooks is None:
        custom_hooks = {}
    structure_hooks = DEFAULT_STRUCTURE_HOOKS | custom_hooks
    converter1 = Converter()
    converter2 = Converter(unstruct_strat=UnstructureStrategy.AS_TUPLE)
    for hook_type, hook in structure_hooks.items():
        converter1.register_structure_hook(hook_type, hook)
        converter2.register_structure_hook(hook_type, hook)
    return converter1, converter2
and call the respective converters in from_dict and validate_types :)
However, it would be nice to have something directly from cattrs :)
Hm, what you're really asking for is runtime validation of types in __init__, right? Maybe a way for a cattrs converter to wrap __init__ and apply checks?
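To make the "wrap __init__" idea concrete, here is a minimal stdlib-only sketch. It is not a cattrs feature: check_simple_types is a hypothetical decorator name, and it only does isinstance checks on fields annotated with plain classes such as str or int, skipping generics like list[int].

```python
import functools
from dataclasses import dataclass, fields


def check_simple_types(cls):
    """Hypothetical class decorator: wrap __init__ and check plain-class fields."""
    original_init = cls.__init__

    @functools.wraps(original_init)
    def checked_init(self, *args, **kwargs):
        original_init(self, *args, **kwargs)
        for f in fields(cls):
            # Only annotations that are plain classes are checked here;
            # generics such as list[int] are skipped by this sketch.
            if type(f.type) is type and not isinstance(getattr(self, f.name), f.type):
                raise TypeError(f"{f.name} must be {f.type.__name__}")

    cls.__init__ = checked_init
    return cls


@check_simple_types
@dataclass
class MyDataClass:
    a: str
    b: int


MyDataClass(a="a", b=1)  # passes the check
```

Because the check runs on the already-built instance, nothing is re-instantiated, so there is no recursion through __post_init__.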
I think what I'd like to have is already built into cattrs, since it structures the data and checks the types while doing so. What I'd like is just the type check, something like:
converter.type_check(data, dataclass/tuple[...]/list[....]/or any other python structure)
The thing is that the method you proposed might fail, I think, if the nested dictionaries in data have their keys in a different order from the fields of the nested dataclasses in the structure. It doesn't really affect me in this case, since the data comes from asdict(self), so the order will be the same.
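The ordering concern can be illustrated with the standard library alone. In this small sketch (Point is a made-up class, not from the thread), astuple() follows the field declaration order, while the values of an arbitrary dict follow insertion order, so a positional comparison against tuple[int, str] would only line up in the first case.

```python
from dataclasses import astuple, dataclass


@dataclass
class Point:
    x: int
    y: str


# astuple() always follows the field declaration order.
from_instance = astuple(Point(x=1, y="a"))

# A plain dict keeps insertion order, which may differ from the field order.
data = {"y": "a", "x": 1}
from_dict_values = tuple(data.values())

print(from_instance)     # (1, 'a')
print(from_dict_values)  # ('a', 1)
```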
Description
I have a dataclass that can be instantiated (structured) with a from_dict method, in this way I know all the typings are correct:
It also has a validate_types method that can be run after instantiation
What I Did
I would like to add the validate_types in the post_init:
Unfortunately this ends in an infinite loop: from_dict generates a new instance of the class, whose post_init calls validate_types, which calls from_dict, which creates another new instance whose post_init runs again, and so on.
Would it be possible for you to expose a function/method of the converter that only checks if the data can be structured without actually structuring/creating the new instance?
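The loop can be reproduced without cattrs. In this rough sketch, Looping is a made-up class and from_dict simply calls cls(**data) as a stand-in for converter.structure(data, cls), which also has to instantiate the class.

```python
from dataclasses import asdict, dataclass


@dataclass
class Looping:
    a: str
    b: int

    def __post_init__(self):
        self.validate_types()

    def validate_types(self) -> None:
        # Re-building the instance runs __post_init__ again, and so on.
        self.from_dict(asdict(self))

    @classmethod
    def from_dict(cls, data: dict) -> "Looping":
        # Stand-in for converter.structure(data, cls); it also calls cls(...).
        return cls(**data)


def hits_recursion_limit() -> bool:
    """Return True if creating an instance exhausts the recursion limit."""
    try:
        Looping(a="a", b=1)
    except RecursionError:
        return True
    return False


print(hits_recursion_limit())  # True
```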
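For illustration only, a check-without-instantiating helper could look roughly like the sketch below. This is not a cattrs API: type_check is a hypothetical stdlib-only function, it does strict isinstance checks with no coercion, and it only handles plain classes, dataclasses, list[T] and fixed-length tuple[...] (no Any, unions, or tuple[T, ...]).

```python
from dataclasses import dataclass, fields, is_dataclass
from typing import Any, get_args, get_origin, get_type_hints


def type_check(value: Any, expected: Any) -> bool:
    """Return True if value matches expected, without building anything."""
    origin = get_origin(expected)
    if origin is list:
        (item_type,) = get_args(expected)
        return isinstance(value, list) and all(
            type_check(item, item_type) for item in value
        )
    if origin is tuple:
        args = get_args(expected)
        return (
            isinstance(value, tuple)
            and len(value) == len(args)
            and all(type_check(v, t) for v, t in zip(value, args))
        )
    if is_dataclass(expected):
        if not isinstance(value, expected):
            return False
        hints = get_type_hints(expected)
        return all(
            type_check(getattr(value, f.name), hints[f.name])
            for f in fields(expected)
        )
    return isinstance(value, expected)


@dataclass
class Child:
    beta: float
    population_name: str


@dataclass
class Parent:
    id: str
    populations: list[Child]


good = Parent(id="1", populations=[Child(beta=1.0, population_name="pop A")])
bad = Parent(id="1", populations=[Child(beta="oops", population_name="pop A")])

print(type_check(good, Parent))  # True
print(type_check(bad, Parent))   # False
```

Since it only inspects an existing object, calling something like this from __post_init__ would not create a new instance and so would not recurse.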