python-attrs / cattrs

Composable custom class converters for attrs, dataclasses and friends.
https://catt.rs
MIT License
779 stars 108 forks source link

cattrs effectively disables attrs validation #518

Closed znicholls closed 3 months ago

znicholls commented 4 months ago

Description

I was using cattrs to serialise some classes. I put some incorrect data in, and rather than an error being raised, the incorrect data was silently passed. It is easiest to illustrate with an example, see below.

What I Did

Fully reproducible example

```python """ Example of cattrs disabling attrs validation """ import traceback from typing import Optional import attr.validators from attrs import define, field from cattrs.preconf.pyyaml import make_converter converter = make_converter() @define class Example: """ Example with an optional field that must be a string """ name: Optional[str] = field( default=None, validator=attr.validators.optional(attr.validators.instance_of(str)), ) print("Standard initialisation via attrs works") res = Example("a string here") # No exception print(f"Intialised with string name: {res=}") print() print("If we try to instantiate with attrs, we get the errors we expect") print() print("Initialising with int") try: Example(1) except: print("Exception raised when using attrs directly") traceback.print_exc() print() print("Initialising with list") try: Example([1, 2, 3]) except: print("Exception raised when using attrs directly") traceback.print_exc() print() print("cattrs structures correctly when the input is a string") res = converter.structure( {"name": "a string"}, Example, ) print(f"Structured with cattrs with string name: {res=}") print() print( "However, if we structure with cattrs with int input, no error is raised. " "Instead the int is simply converted to a string." ) try: res = converter.structure( {"name": 12}, Example, ) print(f"No error raised. {res=}") except: print("Exception raised when using cattrs directly") traceback.print_exc() print() print("Similar issue to above with list input") try: res = converter.structure( {"name": ["a", "list"]}, Example, ) print(f"No error raised. {res=}") except: print("Exception raised when using cattrs directly") traceback.print_exc() print() print(res) print( "Same issue arises if we load from string rather than structuring " "(although it is much harder to know that the 12 isn't a string in this case)" ) try: res = converter.loads( "name: 12", Example, ) print(f"No error raised. {res=}") except: print("Exception raised when using cattrs directly") traceback.print_exc() print() print("Same idea as above, except for a list") try: res = converter.loads( """name: - name - other""", Example, ) print(f"No error raised. {res=}") except: print("Exception raised when using cattrs directly") traceback.print_exc() print() res = converter.loads("""name: name""", Example) ```

There is a workaround, basically you have to put a noop as the converter for the attribute and tell the cattrs converter to prefer attrs converters. However, this feels like a non-obvious hack to me. Having said that, I can completely appreciate why this behaviour occurs and there isn't an obvious solution (at least not to me).

Workaround

```python """ Example of workaround for cattrs disabling attrs validation """ import traceback from typing import Optional import attr.validators from attrs import define, field from cattrs.preconf.pyyaml import make_converter converter = make_converter( detailed_validation=False, forbid_extra_keys=True, prefer_attrib_converters=True ) def no_conversion(inp): """ No conversion, returns what it gets """ return inp @define class Example: """ Example with an optional field that must be a string """ name: Optional[str] = field( default=None, converter=no_conversion, validator=attr.validators.optional(attr.validators.instance_of(str)), ) print("Standard initialisation via attrs works") res = Example("a string here") # No exception print(f"Intialised with string name: {res=}") print() print("If we try to instantiate with attrs, we get the errors we expect") print() print("Initialising with int") try: Example(1) except: print("Exception raised when using attrs directly") traceback.print_exc() print() print("Initialising with list") try: Example([1, 2, 3]) except: print("Exception raised when using attrs directly") traceback.print_exc() print() print("cattrs structures correctly when the input is a string") res = converter.structure( {"name": "a string"}, Example, ) print(f"Structured with cattrs with string name: {res=}") print() print("With the workaround, now an error is raised with cattrs") try: res = converter.structure( {"name": 12}, Example, ) print(f"No error raised. {res=}") except: print("Exception raised when using cattrs directly") traceback.print_exc() print() print("Also with list input") try: res = converter.structure( {"name": ["a", "list"]}, Example, ) print(f"No error raised. {res=}") except: print("Exception raised when using cattrs directly") traceback.print_exc() print() print(res) print("The workaround works for :meth:`loads` too") try: res = converter.loads( "name: 12", Example, ) print(f"No error raised. {res=}") except: print("Exception raised when using cattrs directly") traceback.print_exc() print() print("Same idea as above, except for a list") try: res = converter.loads( """name: - name - other""", Example, ) print(f"No error raised. {res=}") except: print("Exception raised when using cattrs directly") traceback.print_exc() print() res = converter.loads("""name: name""", Example) ```

I would be interested in a maintainer's thoughts about this, particularly whether it is a bug, or expected. If expected, I am happy to contribute docs about this edge case and how to workaround it either here or in attrs (it requires changes on both sides so the best location isn't super clear to me either).

Thanks for a great project too, by the way, such a great initiative!

Tinche commented 4 months ago

Hello!

So let me explain what exactly is happening here.

When you structure an attrs class using cattrs, cattrs looks at the type annotations. Then, cattrs uses its internal machinery to find and use hooks associated to those types.

Now, for strings (and ints, floats, bools...), the default structuring hook is just str. This is a good default for reasons of performance, compatibility (for example, large ints are commonly stringified in JSON to avoid precision loss) and simplicity.

So when cattrs sees 1 (int), it'll do str(1), end up with '1' (note the quotes, it's a string now) and pass that along to attrs. The attrs validator will still run (cattrs doesn't stop this), it just won't complain.

I imagine you were expecting a different default hook for strings, something like what you have in your attrs validator. You can override this fairly easily:

c = Converter()

def validate_str(value: Any) -> str:
    if not isinstance(value, str):
        raise ValueError(f"{value} is not a string!")
    return input

c.register_structure_hook(str, validate_string)

Now all the strings will get validated instead of coerced.

Another workaround you already mentioned, which is to not bother with strings and just attrs handle it. We optimize for the case of no runtime validation in the code (you'd rely on a typechecker like Mypy), and doing runtime validation only at the edges (when data enters your system).

We already have some docs for this here: https://catt.rs/en/latest/defaulthooks.html#int-float-str-bytes, but these are new docs, so under latest and not stable ;)

Let me know your thoughts!

znicholls commented 3 months ago

We optimize for the case of no runtime validation in the code (you'd rely on a typechecker like Mypy), and doing runtime validation only at the edges (when data enters your system).

Ok got it, all makes sense.

Docs are very helpful and explain the situation.

Happy to close this unless there is something else you want to do as follow up. Thanks for the quick and thorough reply!

Tinche commented 3 months ago

Let me know if you have other questions!