agausmann closed this 3 months ago
As you've noticed, cattrs will deal with the field before it reaches attrs, and attrs will then apply the converter. This can't be changed for a number of reasons, like backwards compatibility and the fact that it interferes with other uses of attrs converters.
The immediate fix would be to make a converter with prefer_attrib_converters=True and use it.
c = cattrs.Converter(prefer_attrib_converters=True)
print(Foo(2.0))
print(Foo("None"))
print(c.structure({"a": 2.0}, Foo))
print(c.structure({"a": "None"}, Foo))
But I think a better approach would be to remove the attrs converter and have cattrs deal with this. That way the rest of your code base can be isolated from the details of data structuring, which is one of the points of cattrs.
Like this:
import attrs
import cattrs
from cattrs.gen import make_dict_structure_fn
def converter(x: str | float | None, _) -> float | None:
    if x is None or x == "None":
        return None
    return float(x)

@attrs.define
class Foo:
    a: float | None
c = cattrs.Converter()
c.register_structure_hook(
Foo, make_dict_structure_fn(Foo, c, a=cattrs.override(struct_hook=converter))
)
print(c.structure({"a": 2.0}, Foo))
print(c.structure({"a": "None"}, Foo))
Let me know what you think! I'm going to close this in the meantime.
Thank you very much for your insight!
I agree that it's better to isolate the nuances of the data format, and a custom Converter is the right tool for that. I was just coming at the problem from the wrong angle.
Another solution using newtypes: you can register the hook just once for that newtype, instead of for every field in every class:
import typing
import attrs
import cattrs

ParsedFloat = typing.NewType("ParsedFloat", float)

@attrs.define
class Foo:
    a: ParsedFloat | None = None

def converter(x: str | float | None, _) -> ParsedFloat | None:
    if x is None or x == "None":
        return None
    return ParsedFloat(float(x))

c = cattrs.Converter()
c.register_structure_hook(ParsedFloat, converter)
The signature of the converter is a little unorthodox, and this still leaks parsing information into the class definition. In this case I think it's a worthy tradeoff:
- I am parsing a complex structure with a lot of classes and a lot of fields to clean up in this way. The newtype approach is more maintainable than listing the classes and fields in two different places (the class definition and the converter creation).
- The classes (Foo in this example) are used only for ingesting data; they are not used by the main application. The output will be aggregated with data from other sources in a separate data structure that is used by the application.
Description
I want to structure a type with a float | None field where the unstructured data may be a float, None, or a string (either an unparsed float or the keyword "None").
The default conversion provided by cattrs would handle the unparsed float, but it will not handle the keyword, so it is not helpful here. However, attrs has a converter option for field definitions which seems to be applicable.

When using the class constructor with a field converter set, the value gets passed to the converter function with no issue. However, when using cattrs, cattrs seems to still try the default conversion with the unstructured data, which causes an exception when it encounters the keyword.
I expected cattrs to not do this and just pass through the unstructured value in the case where there is a user-provided converter function.
What I Did