python-attrs / cattrs

Composable custom class converters for attrs, dataclasses and friends.
https://catt.rs
MIT License
818 stars 113 forks source link

ForwardRef and GenConverter do not seem to work together #201

Open aha79 opened 2 years ago

aha79 commented 2 years ago

Description

When using recursive types (see example below), structure with GenConverter raises a StructureHandlerNotFoundError with the message "Unsupported type: ForwardRef('Y'). Register a structure hook for it" . The error does not happen with PEP 563 enabled. That means when we add 'from future import annotations' to the program and replace y: typing.Optional['Y'] with y: typing.Optional[Y] everything works as expected.

What I Did

import typing
import cattr
import attr

@attr.define(auto_attribs=True)
class X:
    a: int
    b: float
    y: typing.Optional['Y']   #with PEP563 enabled:   y: typing.Optional[Y]   (that does not raise an exception) 

@attr.define(auto_attribs=True)
class Y:
    a: typing.List[int]
    x: X

converter = cattr.GenConverter()

data = { 'a': [1,2,3], 'x': {'a':1, 'b':2.000001, 'y': {'a': [], 'x': {'a': 5, 'b': 3.1, 'y': None}}} }
d = converter.structure( data, Y )  # <---- raises StructureHandlerNotFoundError    Unsupported type: ForwardRef('Y'). Register a structure hook for it
Tinche commented 2 years ago

Just FYI you don't need auto_attribs=True when using define, that's the default behavior :)

Tinche commented 2 years ago

I was bitten by this lately too. Wrap the entire type in quotes, instead of just 'Y':

import typing

import attr

import cattr

@attr.define
class X:
    a: int
    b: float
    y: "typing.Optional[Y]"

@attr.define
class Y:
    a: typing.List[int]
    x: X

converter = cattr.GenConverter()

data = {
    "a": [1, 2, 3],
    "x": {
        "a": 1,
        "b": 2.000001,
        "y": {"a": [], "x": {"a": 5, "b": 3.1, "y": None}},
    },
}
d = converter.structure(data, Y)

It has to do with how typing.get_type_hints works.

aha79 commented 2 years ago

Okay, I see. I confirm, this works.

However now I try to get this one-line recursive type

YY = typing.List[ typing.Tuple[typing.Optional[YY],int] ]

to work.

Unfortunately, this generates a "NameError" even if PEP563 is enabled (I know this is not a cattrs issue. perhaps this is not supported by PEP563).

But with PEP563 disabled, putting quotes around it completely

YY = "typing.List[ typing.Tuple[typing.Optional[YY],int] ]" 

gives a StructureHandlerNotFoundError error in structure.

data2 = [ (None, 1), (None, 2), ([(None, 11), ([],22)], 3) ]  
d2 = converter.structure( data2, YY )   # <--- raises StructureHandlerNotFoundError 

Also putting the quote somewhere in between, or around "YY" does not avoid the exception.

YY = typing.List[ typing.Tuple["typing.Optional[YY]",int] ]

Some background: I am writing a code generator which generates type-hinted code automatically. So I need a strategy to generate the type-hints which always works.

I am not really sure what the correct way to specify type "YY" is, so I cannot really tell if there is a cattr issue here or not.

However, https://stackoverflow.com/a/53845083/1739884 seems to indicate that

YY = typing.List[ typing.Tuple[typing.Optional["YY"],int] ]

should work.

Tinche commented 2 years ago

That's an interesting problem.

I don't think PEP 563 is in play here directly, that PEP deals with annotations (think a type of a class field or a function argument), you're dealing with raw types. For example, if you call typing.get_type_hints on the string "typing.List[ typing.Tuple[typing.Optional[YY],int] ]" it'll fail because it doesn't know what to do with it. If this annotation was on a field in a class, it'd be easier to handle.

The typing.List[ typing.Tuple[typing.Optional["YY"],int] ] might work for static type analyzers, but cattrs cannot turn the string "YY" into the YY type - there's not enough information at runtime.

I think in order to support this use case cattrs would need to be able to handle ForwardRefs natively. Then you'd write:

YY = typing.List[typing.Tuple[typing.Optional[typing.ForwardRef("YY")], int]]

I'm not exactly sure if it's possible though. In order to resolve the forward reference, we'd need to get a reference to the globals() dictionary of the type somehow.

Note that you can hack it yourself with:

YY = typing.List[typing.Tuple[typing.Optional[typing.ForwardRef("YY")], int]]
YY.__args__[0].__args__[0].__args__ = (YY, type(None))

but you'll essentially create a recursive loop, and you won't be able to even print out the type. Also cattrs won't be able to handle it since it can't hash it, due to the circular reference.

aha79 commented 2 years ago

My preliminary conclusions after I thought about this issue:

After thinking about this, probably the best solution is to take the cattr suggestion in the exception literally and to register a custom hook for each ForwardRef that is used. The Example looks like

YY = typing.List[ typing.Tuple[typing.Optional['YY'],int] ]

converter.register_structure_hook(typing.ForwardRef('YY'), lambda inst, cls: converter.structure(inst, YY) )

data2 = [ (None, 1), (None, 2), ([(None, 11), ([],22)], 3) ]
d2 = converter.structure( data2, YY )   # <<-- now works!
print(d2) 
# --> [(None, 1), (None, 2), ([(None, 11), ([], 22)], 3)]

Observations:

Do you agree to this?

Tinche commented 2 years ago

I'm looking at the source code of ForwardRef in Python 3.9.7, and assuming it's essentially the same in all other versions of Python we support.

I see ForwardRefs can be evaluated manually before use by calling ForwardRef._evaluate(globals(), locals()) on it. Since this evaluation requires the globals and locals it can't really be done by cattrs. But if we say you need to evaluate your forward references yourself before cattrs sees them, maybe we can have a generic hook for handling ForwardRefs inside cattrs. Then you wouldn't need to register a hook for every ForwardRef, you'd just need to evaluate all your ForwardRefs.

aha79 commented 2 years ago

I am not sure if we could really evaluate the ForwardRefs, as it would probably trigger the hashing issue with recursive types. So I guess we need to keep the ForwardRefs unevaluated.

Actually we have two issues, one which I did not think about when I wrote my last comment. There may be two types of same name (say YY) in two modules. Now depending in which module the ForwardRef is created, it refers to different types. This is clear, but when we cannot evaluate the Refs we need to keep the text and find another way to resolve the ambiguity. And registering hooks for the ForwardRef may generate clashes, as same object may semantically refer to different types. One solution could be to change the ForwardRef to include module Information. So not 'YY' but 'package.module.YY'. Then ForwardRefs could be resolved automatically inside cattrs, perhaps inside a generator factory hook. Also ForwardRefs would also be unique and there would not be any clashes wihen registering hooks for ForwardRefs.

I also have an idea how to automatically add the module Information to forwardrefs but that I need to try out first. I will report in the next days.

Tinche commented 2 years ago

@aha79 I think we can evaluate ForwardRefs without causing the recursive issue. An example:

import typing

from cattr import GenConverter

YY = typing.List[typing.Tuple[typing.Optional[typing.ForwardRef("YY")], int]]
YY.__args__[0].__args__[0].__args__[0]._evaluate(globals(), locals(), set())

c = GenConverter()
c.register_structure_hook_func(
    lambda t: t.__class__ is typing.ForwardRef,
    lambda v, t: c.structure(v, t.__forward_value__),
)

print(c.structure([([(None, 1)], 0)], YY))
[([(None, 1)], 0)]
aha79 commented 2 years ago

@Tinche I think this is a workable solution.

Sidenote: You do not need to explicitly construct ForwardRef, this is done automatically :-). Just write Optional['YY']

However if we accept explicitly calling ForwardRef, we could also pass the module parameter (e.g. ForwardRef('YY', module=__name__)) .

Then we can pass None to the globals and locals and it still works.

YY = typing.List[typing.Tuple[typing.Optional[typing.ForwardRef("YY", module=__name__)], int]] 
YY.__args__[0].__args__[0].__args__[0]._evaluate(None, None, set())   # <- this could be done inside of cattr

Evaluation could be done inside of cattr. No explicit evaluation by user is necessary, just the module parameter must be given!

However, there two minor drawbacks. These are:

Nevertheless I think it would be a great thing to support ForwardRefs in catts. First, those which are evaluated (in the manner you described), and, second, those which have the forwardmodule__ parameter set. Not sure how this could be done with _structuring_hook, especially since the evaluation is only done once. What do you think?

aha79 commented 2 years ago

This seems to do the job

c = cattr.GenConverter()

def _gen_forwardref_hook(type):
    type._evaluate(None, None, set())
    return lambda v, t: c.structure(v, t.__forward_value__)

c.register_structure_hook_factory(
    lambda t: t.__class__ is typing.ForwardRef,
    _gen_forwardref_hook )

The following tests assume the previous code is in different module module (so that we get NameErrors when we should).

The same results are obtained if ZZ is used as a dataclass member.

aha79 commented 2 years ago

I just found a better way:

In Python 3.10 +, cattrs could evaluate ForwardRefs without user intervention.

We would just need to require that the user must use NewType (which probably is a reasonable thing for type aliases).

The Hooks now look like this

# module.py
c = cattr.GenConverter()
def _gen_forwardref_hook(type):
    try:
        type._evaluate(None, None, set())
    except NameError as e:
        raise ValueError(f"ForwardRef({type.__forward_arg__!r}) cannot be resolved.")
    if not type.__forward_evaluated__:
        raise ValueError(f"ForwardRef({type.__forward_arg__!r}) is not resolved.")
    return lambda v, t: c.structure(v, t.__forward_value__)

c.register_structure_hook_factory(
    lambda t: t.__class__ is typing.ForwardRef,
    _gen_forwardref_hook )

def _gen_newtype_hook(type):
    if type.__module__ != 'typing': # pre 3.10 did not set the __module__ properly
        globalns = sys.modules[type.__module__].__dict__
        typing._eval_type(type.__supertype__, globalns, None)
    return lambda v, t: c.structure(v, t.__supertype__)
    #return c._structure_func.dispatch(type.__supertype__)  <-- does not work

c.register_structure_hook_factory(
    lambda t: t.__class__ is typing.NewType,
    _gen_newtype_hook )

With Python 3.10+ everything works out-of-the-box! (this also works when YY is imported from different module!)

c = module.c

@dataclass
class X:
    a: int

YY = NewType('YY', List[ Tuple[ Optional[ 'YY'], X] ] )

print(c.structure([([(None, {'a':2})], {'a':1})], YY) )
# -> [([(None, X(a=2))], X(a=1))]

The nice thing of NewType is that in 3.10+ the __module__ member contains the module from where NewType was called. Previously (<= 3.9, it contained typing, so is of no help)

Nevertheless:

Python 3.9+: one can use the module parameter

YY = NewType('YY', List[ Tuple[ Optional[ ForwardRef('YY', module=__name__)], X] ] )

Python <3.9: explicit evaluation

YY = NewType('YY', List[ Tuple[ Optional[ ForwardRef('YY', module=__name__)], X] ] )
typing._eval_type(YY.__supertype__, globals(), locals()) 
Tinche commented 2 years ago

Hm, interesting.

We don't support NewTypes currently (there's an open issue for that), so I don't know if I'm OK with having a NewType hook as part of this thing. I think NewTypes are potentially more important than ForwardRefs, so I cannot comment on this approach until I've looked at NewTypes in depth, which I don't have time to do now.

The first approach seemed simpler.

aha79 commented 2 years ago

@Tinche Allright. Then lets split the issue, and put the NewType handling aside. It can be added later, and, as a side-effect, improve the usability for ForwardRefs.

Should I prepare a PR for the ForwardRef handling?

Tinche commented 2 years ago

Yep, that sounds like a good start.