Closed dhirschfeld closed 3 months ago
Without providing a comprehensive answer, I'll just say that I have a system implemented very similar to what you are proposing. I actually have one converter which requires the type tags (even without unions) and another that only puts them in unions.
You need to register structure/unstructure hooks for each union you want to require or inject type tags for. You can use the cattrs strategy for handling unions to do this for you (see the docs), or you can write the hook yourself: code that reads the type tag and dispatches to the correct type to structure the data as.
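As a rough illustration of the hand-written route, the hook body only needs to read the tag and pick the target class. The sketch below is plain Python, not cattrs API: the dataclasses `Circle` and `Square` and the functions `structure_shape` / `unstructure_shape` are hypothetical names, and the `_type` key follows the tagged-dict format used in this thread. A function of this shape (value, target type) is roughly what one would register as the structure hook for the union.

```python
from dataclasses import dataclass
from typing import Any, Mapping, Union


@dataclass
class Circle:
    radius: float


@dataclass
class Square:
    side: float


Shape = Union[Circle, Square]

# Tag -> class table; "_type" is the tag key assumed throughout this thread.
SHAPES_BY_TAG = {"Circle": Circle, "Square": Square}


def structure_shape(payload: Mapping[str, Any], _target: Any = None) -> Shape:
    """Read the type tag and build the matching class from the remaining keys."""
    fields = dict(payload)
    tag = fields.pop("_type")  # raises KeyError if the tag is missing
    try:
        cls = SHAPES_BY_TAG[tag]
    except KeyError:
        raise ValueError(f"unknown type tag: {tag!r}") from None
    return cls(**fields)


def unstructure_shape(obj: Shape) -> dict:
    """Inverse direction: dump the fields and inject the type tag."""
    return {"_type": type(obj).__name__, **vars(obj)}
```

Writing one such pair per union is what becomes tedious as the number of unions grows.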
I will say that in practice writing hooks for every possible union is really annoying and not practical, so I developed a hook factory that automatically handles unions over certain types, e.g. attrs classes and any standard builtin type. For types that have specialized conversion (like dates <-> timestamps) you need some extra support for those.
I will say that it is quite a bit of work to write these factories.
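To give a flavour of what such a factory involves, here is a much-reduced sketch in plain Python. It is not the implementation described above and not cattrs API: the helper `make_union_decoder` and the example classes `Point` and `Label` are hypothetical. Given a `Union[...]` of dataclasses and builtins, it returns a decoder that dispatches tagged dicts on the class name and lets values through only if they are instances of one of the union's builtin members.

```python
from dataclasses import dataclass, is_dataclass
from typing import Any, Callable, Union, get_args


def make_union_decoder(union: Any, tag_name: str = "_type") -> Callable[[Any], Any]:
    """Build a decoder for one union type (hypothetical helper, not cattrs API)."""
    members = get_args(union)
    # Dataclass members are looked up by class name via the tag.
    tagged = {cls.__name__: cls for cls in members if is_dataclass(cls)}
    # Remaining members (int, str, ...) are accepted as-is.
    plain = tuple(cls for cls in members if not is_dataclass(cls))

    def decode(value: Any) -> Any:
        if isinstance(value, dict) and tag_name in value:
            fields = {k: v for k, v in value.items() if k != tag_name}
            return tagged[value[tag_name]](**fields)
        if isinstance(value, plain):
            return value
        raise TypeError(f"{value!r} does not match {union}")

    return decode


@dataclass
class Point:
    x: int
    y: int


@dataclass
class Label:
    text: str


# One call per union replaces one hand-written hook per union.
decode_item = make_union_decoder(Union[Point, Label, int])
```

The effort mentioned above lies mostly in what this sketch leaves out: nested fields, specialized conversions such as dates, and reusing one generated decoder per union.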
I couldn't figure out any way to get `cattrs` to do this, so I just wrote my own recursive `decode` function:
```python
from collections import abc
from typing import Any

# `decoder` (defined elsewhere) maps each "_type" tag to a callable.


def decode(arg: Any) -> Any:
    match arg:
        # A tagged dict: decode the payload, then dispatch on the tag.
        case {"_type": cls, "value": value}:
            return decoder[cls](decode(value))
        case arg if isinstance(arg, abc.Mapping):
            return {
                key: decode(value) if isinstance(value, abc.Mapping) else value
                for key, value
                in arg.items()
            }
        case list():
            return [
                decode(value) if isinstance(value, abc.Mapping) else value
                for value in arg
            ]
        case tuple():
            return tuple(
                decode(value) if isinstance(value, abc.Mapping) else value
                for value in arg
            )
        # Any other iterable (except strings) is returned as a lazy generator.
        case arg if isinstance(arg, abc.Iterable) and not isinstance(arg, (str, bytes)):
            return (
                decode(value) if isinstance(value, abc.Mapping) else value
                for value in arg
            )
        case _:
            return arg
```
```python
>>> decode(decoded)
{'seed': 42,
 'method': 'test',
 'effective_date': datetime.datetime(2024, 8, 20, 8, 51, 16, 707612),
 'mytuple': MyTuple(values=(1, 2, 3))}
```
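The `decoder` registry is not shown in the snippet. A minimal sketch of the same idea follows, assuming `decoder` maps each tag to a callable that receives the already-decoded `value`; the registry contents and the `MyTuple` definition here are assumptions. The function above only recurses into values that are themselves mappings, so a tagged dict sitting inside a list value is not reached. This variant recurses into every element and returns concrete lists and tuples rather than a generator. It also treats a dict as tagged only when it has exactly the two keys `_type` and `value`.

```python
from collections import abc
from dataclasses import dataclass
from datetime import datetime
from typing import Any, Callable


@dataclass
class MyTuple:
    values: tuple[int, ...]


# Assumed registry: tag -> callable applied to the decoded "value" payload.
decoder: dict[str, Callable[[Any], Any]] = {
    "datetime": datetime.fromisoformat,
    "MyTuple": lambda value: MyTuple(values=tuple(value)),
}


def decode(arg: Any) -> Any:
    """Recursively replace {"_type": ..., "value": ...} dicts with real objects."""
    if isinstance(arg, abc.Mapping):
        if set(arg) == {"_type", "value"}:
            return decoder[arg["_type"]](decode(arg["value"]))
        return {key: decode(value) for key, value in arg.items()}
    if isinstance(arg, tuple):
        return tuple(decode(value) for value in arg)
    if isinstance(arg, list):
        return [decode(value) for value in arg]
    return arg  # str, int, float, bool, None, ...
```

An unknown tag surfaces as a `KeyError` from the registry lookup, which is usually preferable to silently passing the tagged dict through.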
Description
I'm trying to understand if `cattrs` is a good fit for my problem. I think it might be, but I'm not sure how to actually use it, so I would greatly appreciate some advice from an expert!

I have a Python dictionary which represents partially deserialized data, i.e. primitive types have been deserialized into the corresponding Python objects, but complex types are embedded in the output, each represented by a tagged dict:
`{"_type": str, "value": Any}`
Details
```python
>>> import json
>>> from dataclasses import dataclass
>>> from datetime import datetime
>>> import cattrs
>>>
>>> @dataclass
... class MyTuple:
...     values: tuple[int, ...]
...
>>> kwargs = dict(seed=42, method='test', effective_date=datetime.now(), mytuple=MyTuple(values=(1,2,3)))
>>> kwargs
{'seed': 42, 'method': 'test', 'effective_date': datetime.datetime(2024, 8, 1, 23, 18, 45, 674566), 'mytuple': MyTuple(values=(1, 2, 3))}
>>> def encode_datetime(obj: datetime) -> dict:
...     return dict(_type='datetime', value=obj.isoformat())
...
>>> def encode_mytuple(obj: MyTuple) -> dict:
...     return dict(_type='MyTuple', value=obj.values)
...
>>> def default(obj):
...     if isinstance(obj, datetime):
...         return encode_datetime(obj)
...     elif isinstance(obj, MyTuple):
...         return encode_mytuple(obj)
...     else:
...         return obj
...
>>> encoded = json.dumps(kwargs, default=default)
>>> encoded
'{"seed": 42, "method": "test", "effective_date": {"_type": "datetime", "value": "2024-08-01T23:18:45.674566"}, "mytuple": {"_type": "MyTuple", "value": [1, 2, 3]}}'
>>> decoded = json.loads(encoded)
>>> decoded
{'seed': 42, 'method': 'test', 'effective_date': {'_type': 'datetime', 'value': '2024-08-01T23:18:45.674566'}, 'mytuple': {'_type': 'MyTuple', 'value': [1, 2, 3]}}
```

I think it should be possible to use `cattrs` to convert the tagged dicts back into the Python objects they represent, but I'm not sure how to define the structure hook to do so.

Defining the decoders is straightforward:
...but how can I tell `cattrs` to, e.g., apply `decode_mytuple` when it encounters a nested dict with a `_type` key which equals `MyTuple`?
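For concreteness, decoders mirroring the two encoders in the session above might look like the sketch below. Only the name `decode_mytuple` appears in the question; `decode_datetime`, the `DECODERS` table, and `decode_tagged` are assumed names, and this is plain Python rather than cattrs API.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Any


@dataclass
class MyTuple:
    values: tuple[int, ...]


def decode_datetime(value: str) -> datetime:
    # Inverse of encode_datetime, which stored obj.isoformat().
    return datetime.fromisoformat(value)


def decode_mytuple(value: list) -> MyTuple:
    # JSON has no tuple type, so the values come back as a list.
    return MyTuple(values=tuple(value))


# Assumed lookup table from the "_type" tag to its decoder.
DECODERS = {"datetime": decode_datetime, "MyTuple": decode_mytuple}


def decode_tagged(tagged: dict) -> Any:
    """Decode one tagged dict of the form {"_type": str, "value": Any}."""
    return DECODERS[tagged["_type"]](tagged["value"])
```

The open question is then only where the dispatch in `decode_tagged` should live: inside a cattrs hook, or in a separate pass over the parsed JSON.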