python-attrs / cattrs

Composable custom class converters for attrs, dataclasses and friends.
https://catt.rs
MIT License
779 stars 108 forks source link

TYPE_CHECKING and init=False #531

Open AdrianSosic opened 2 months ago

AdrianSosic commented 2 months ago

Description

Hi, I think I'm facing a similar situation as described in #160 but with a few subtleties that potentially motivate refining the internal type resolution of cattrs.

What I Did

Consider the following two files:

# inner.py
# --------

from attrs import define

@define
class Inner:
    pass
# outer.py
# --------

from typing import TYPE_CHECKING, Optional

import cattrs
from attrs import define, field

if TYPE_CHECKING:
    from inner import Inner

@define
class Outer:
    inner: Optional[Inner] = field(init=False, default=None)

outer = Outer()
cattrs.unstructure(outer)

Running the latter gives NameError: name 'Inner' is not defined because the inner class is not available in outer.py.

While one could fix this by informing cattrs about the class before attempting to unstructure, I don't think it's a convenient solution in my case because I don't even see a reason why it should be necessary in the first place: the corresponding attribute is init=False and with the default settings where such fields are ignored, cattrs shouldn't even need to bother about the field at all ... assuming of course that I do not overlook some other reason.

The motivation for my example: in my case, Inner relies on some heavy external dependencies (such as torch) and many execution paths of my code don't actually require loading it, so I instead use lazy-loading for improved startup time. That said, I wouldn't want to load it just to satisfy cattrs to perform an unstructuring operation that actually does not require the class due to init=False.

For now, I can simply remove that type annotation, which solves the problem, but this is of course suboptimal. Any thoughts would be highly appreciated.

Tinche commented 2 months ago

I assume you're using from __future__ import annotations?

cattrs calls attrs.resolve_types on every class it sees, which calls into typing.get_type_hints which explodes. I don't think there's an easy way for us to be selective about which fields we want resolved, unfortunately. cattrs only wants init=True fields in this case, but I don't think there's an easy way to communicate that to attrs.

AdrianSosic commented 2 months ago

Hi @Tinche, as always: thanks for connecting so quickly 🙃 🥇

I assume you're using from __future__ import annotations?

Yes, absolutely. Somehow got lost when copy-pasting.

cattrs calls attrs.resolve_types on every class it sees, which calls into typing.get_type_hints which explodes. I don't think there's an easy way for us to be selective about which fields we want resolved, unfortunately. cattrs only wants init=True fields in this case, but I don't think there's an easy way to communicate that to attrs.

The sequence of calls sounds logical and is what I would have expected to happen in the back. However – and perhaps this is a stupid question because I don't know the cattrs internals in depth – why is this an attrs issue? In my naive opinion: couldn't this problem be avoided by "simply" calling attrs.resolve_types only on those types that actually need to be resolved? That is, for any class in a given code, it's either "class needs to be serialized -> cattrs/converter needs to bother" or "class needs no serialization, e.g. because it only appears with init=False -> cattrs/converter can ignore it". So if resolve_types was only called on the former, the problem wouldn't arise, right?

Tinche commented 2 months ago

Here's the actual piece of code that calls attrs.resolve_types: https://github.com/python-attrs/cattrs/blob/898e59cc4076328b985d67ae989bc0f96578a9b5/src/cattrs/gen/__init__.py#L102. So we don't do it always, we only do it if there is a string annotation in one of the fields.

To help with clarity, let's distinguish between classes and fields. We call attrs.resolve_types on classes, which resolves their fields so we can use them.

So in this case, cattrs will call attrs.resolve_types(Outer) because that will ensure the following call of attrs.fields(Outer) will return the proper metadata (that's just how attrs introspection works in general).

Now, we could make this condition a little more sophisticated and account for init=False fields, but in your example, presumably Outer would have some other attributes you would want serialized. Those annotations would need to be de-stringified to work. And we can't call resolve_types and tell it to just resolve init fields.

AdrianSosic commented 2 months ago

Ah ok, now I get it, thanks for the clarification 👍🏼 So it is an attrs issue, indeed ... Given that cattrs is the companion package of attrs, do you think there is a chance to tackle it from that side to enable such use cases for cattrs? After all, I've understood that both packages deeply care about performance and this really is a performance-related issue ;)

AdrianSosic commented 2 months ago

Just had a quick look at the source code of resolve_types and noticed that it takes an optional attribs parameter that in fact let's control which fields to loop over: https://github.com/python-attrs/attrs/blob/5ce5d8278f162ec7542a251091427fd88e538554/src/attr/_funcs.py#L463C9-L463C66 In principle, that would offer an easy solution, no? So instead of calling resolve_types with the class only, one could filter down to init=True attributes only and thus solve it from the cattrs side 🤔