microsoft / pyright

Static Type Checker for Python
Other
13.14k stars 1.4k forks source link

Is it possible to support dataclass init signature inference with descriptor-typed fields? #8750

Closed alwaysmpe closed 1 month ago

alwaysmpe commented 1 month ago

Pyright currently supports descriptor types but the init signature of dataclasses with descriptor fields aren't always consistent with the documented behaviour. In particular, if a descriptor's __get__ method raises an AttributeError if the passed object is None, the dataclass's constructor requires a value, but pyright doesn't currently support this. A simple example:

from typing import overload, Self, NoReturn, reveal_type
from dataclasses import dataclass
import inspect

class Int[SelfT]:
    attrname: str = "unknown"
    def __init__(self) -> None:
        pass

    @overload
    def __get__(self, obj: None, objtype: type[SelfT] | None = None) -> NoReturn:
        ...

    @overload
    def __get__(self, obj: SelfT, objtype: type[SelfT] | None = None) -> int:
        ...

    def __get__(self, obj: SelfT | None, objtype: type[SelfT] | None = None) -> int:
        if obj is None:
            raise AttributeError
        return getattr(obj, f"_{self.attrname}")

    def __set_name__(self, owner: SelfT, name: str):
        self.attrname = name

    def __set__(self, obj: SelfT, value: int):
        return setattr(obj, f"_{self.attrname}", value)

class DefaultInt[SelfT]:
    attrname: str = "unknown"
    val: int
    def __init__(self, default: int = 0) -> None:
        self.val = default

    def __get__(self, obj: SelfT | None, objtype: type[SelfT] | None = None) -> int:
        return getattr(obj, f"_{self.attrname}", self.val)

    def __set_name__(self, owner: SelfT, name: str):
        self.attrname = name

    def __set__(self, obj: SelfT, value: int):
        self.val = value

@dataclass
class HasInt:
    foo: Int[Self] = Int()
    bar: DefaultInt[Self] = DefaultInt()

# actual pyright message: Type of "HasInt.__init__" is "(self: HasInt, foo: int = ..., bar: int = ...) -> None"
# ideal pyright message: Type of "HasInt.__init__" is "(self: HasInt, foo: int, bar: int = 0) -> None"
reveal_type(HasInt.__init__)
# runtime signature: (self, foo: __main__.Int[typing.Self], bar: __main__.DefaultInt[typing.Self] = 0) -> None
print(inspect.signature(HasInt.__init__))

# no type check error - false negative
# runtime error: TypeError: HasInt.__init__() missing 1 required positional argument: 'foo'
baz = HasInt()

Is your feature request related to a problem? Please describe.

What I'm actually using this for is a parser that's streaming data and optionally caching values parsed from that data and clearing the cache when an index is incremented. The index is the descriptor-typed field, other property functions are wrapped using a decorator provided by the descriptor-typed field. This is a different use case to functools.lru_cache.

erictraut commented 1 month ago

This seems like a pretty obscure edge case, and this is a false negative rather than a false positive, so that makes it less important from my perspective.

For what it's worth, mypy doesn't support this either.

I'm also not convinced that your sample would be the right way to code this. You've provided an overload for Int.__get__ that accepts None for the obj parameter. This overload has an annotated type of NoReturn, but that doesn't mean this isn't callable. It simply means that the call doesn't return control back to the caller. I don't think there are any other places in the typing spec or in pyright's logic where a NoReturn type specifically means "raises an exception". It would be better to not provide this overload if you want to indicate that this method isn't callable with a None argument.

For now, I'm going to reject the enhancement request, but I'd reconsider this decision in the future if there's signal from other pyright users that it's something they'd like to see.