microsoft / pyright

Static Type Checker for Python

Inconsistent syntax-dependent incompatible override error when implementing callable property on ABC/Protocol with callable descriptor #8342

Closed DylanLukes closed 1 month ago

DylanLukes commented 1 month ago

Overview

This is perhaps more than a little bit esoteric, as it involves the intersection of ABC/Protocol, Callable, and descriptors. The original motivating case is a bit more complicated and I'll include it in a follow-up comment to explain how I even got to finding this odd edge case.

In short, it is not possible (and, I think, for good reason) to implement an @abstractmethod with a descriptor, even if that descriptor is Callable. However, it seems like it should be possible to implement a Callable field on a Protocol using a matching Callable descriptor. It turns out this is possible, but only if one avoids using decorator syntax on the method.

In spite of reveal_type disclosing exactly the same typing, Pyright behaves differently depending on the syntax used.

Example

First, let's define a sneaky class-based decorator:

```py
import functools
from typing import Callable, Concatenate

class boundmethod[T, **P, R]:
    method: Callable[Concatenate[T, P], R]
    instance: T | None

    def __init__(self, method: Callable[Concatenate[T, P], R], instance: T | None = None):
        self.method = method
        self.instance = instance

    def __get__(self, instance: T | None, owner: type[T]) -> Callable[P, R]:
        callable = self.__class__(self.method, instance)
        functools.update_wrapper(callable, self.method)
        return callable

    def __call__(self, *args: P.args, **kwargs: P.kwargs) -> R:
        if self.instance is None:
            raise TypeError("boundmethod called directly, not on instance")
        return self.method(self.instance, *args, **kwargs)
```

In practice this has no immediate effect when applied to a method, except that the method can no longer be called by looking it up in __dict__ and passing the instance in as the first argument. So it's very close to behaving identically to the original method, but it is not a method.

```py
class Bar:
    @boundmethod
    def foo(self) -> str:
        return "hello from Foo"
    reveal_type(foo)
```

```
> pyright --strict scratch.py
scratch.py:xx:x - information: Type of "foo" is "boundmethod[Self@Bar, (), str]"
```

```
>>> b = Bar()
>>> b.foo()
'hello from Foo'
>>> Bar.__dict__['foo'].__call__(b)
TypeError: boundmethod called directly, not on instance
```

Inheritance: Case A

Given that this callable descriptor is not actually a method, and that this would fail at runtime, it makes sense that attempting to implement a Protocol (or ABC) method with a boundmethod descriptor should fail. The signatures (even taken just as callables) don't match: they differ in their positional-only/bound parameters.

```py
class BaseA(Protocol):
    def foo(self) -> str: ...

class DerivedA(BaseA):
    @boundmethod
    def foo(self) -> str:
        return "hello from DerivedA"
    reveal_type(foo)
```

```
> pyright --strict scratch.py
scratch.py:xx:x - error: "foo" overrides method of same name in class "BaseA" with incompatible type "boundmethod[DerivedA, (), str]" (reportIncompatibleMethodOverride)
scratch.py:xx:x - information: Type of "foo" is "boundmethod[Self@DerivedA, (), str]"
```

So far so good. This is the correct judgement on the part of the type checker.

Inheritance: Case B

Now, what if the base protocol does not define a method but instead a Callable field, which our boundmethod descriptor does properly implement? Well, oddly enough in this case it's syntax-dependent, even though the resulting classes are effectively identical.

Let's try it two ways, as a decorated method (DerivedB1) and using an extra private method and initializing the descriptor directly (DerivedB2).

```py
class BaseB(Protocol):
    foo: Callable[[], str]

class DerivedB1(BaseB):
    @boundmethod
    def foo(self) -> str:
        return "hello from DerivedB1"
    reveal_type(foo)  # (1)

class DerivedB2(BaseB):
    def _foo(self) -> str:
        return "hello from DerivedB2"

    foo = boundmethod(_foo)
    reveal_type(foo)  # (2)

if __name__ == "__main__":
    reveal_type(BaseB.foo)  # (3)
    reveal_type(DerivedB1.foo)  # (4)
    reveal_type(DerivedB2.foo)  # (5)
    assert DerivedB1().foo() == "hello from DerivedB1"
    assert DerivedB2().foo() == "hello from DerivedB2"
```

And here we get the following type-checking output:

```
> pyright --strict scratch.py
scratch.py:xx:x - error: "foo" overrides method of same name in class "BaseB" with incompatible type "boundmethod[DerivedB1, (), str]" (reportIncompatibleMethodOverride)
scratch.py:xx:x - information: Type of "foo" is "boundmethod[Self@DerivedB1, (), str]" (1)
scratch.py:xx:x - information: Type of "foo" is "boundmethod[Self@DerivedB2, (), str]" (2)
scratch.py:xx:x - information: Type of "foo" is "() -> str" (3)
scratch.py:xx:x - information: Type of "foo" is "() -> str" (4)
scratch.py:xx:x - information: Type of "foo" is "() -> str" (5)
```

Notably, we do not get a type error for DerivedB2, despite the fact that the types of foo, as revealed both inside the class definition and outside at the point of use, are identical between DerivedB1 and DerivedB2.

This seems a bit odd to me. Intuitively, it shouldn't matter whether decorator syntax is used or the descriptor is initialized directly in the class body. That is, `@decorator; def foo(): ...` and `def _foo(): ...; foo = decorator(_foo)` should be typed equivalently, no?
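For what it's worth, the two spellings do leave identical objects in the class at runtime. Here is a minimal sketch using a trivial `wrap` descriptor (a hypothetical stand-in for boundmethod, not part of the code above):

```python
# Minimal sketch: the same trivial descriptor applied two ways.
# `wrap` is a stand-in for boundmethod, not part of the original code.
class wrap:
    def __init__(self, fn):
        self.fn = fn

    def __get__(self, instance, owner):
        # Bind the instance, mirroring what boundmethod.__get__ does.
        return lambda *args, **kwargs: self.fn(instance, *args, **kwargs)

class ViaDecorator:
    @wrap
    def foo(self) -> str:
        return "hi"

class ViaAssignment:
    def _foo(self) -> str:
        return "hi"

    foo = wrap(_foo)

# Both class bodies end up with a `wrap` instance stored under "foo",
# and both behave identically at call sites.
assert type(ViaDecorator.__dict__["foo"]) is wrap
assert type(ViaAssignment.__dict__["foo"]) is wrap
assert ViaDecorator().foo() == ViaAssignment().foo() == "hi"
```

So whatever distinction is being drawn here is purely a matter of how the declaration is spelled, not of what exists at runtime.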

System Details

DylanLukes commented 1 month ago

As for the original motivating case... which is admittedly a pretty odd thing to do, the goal was to define a well-typed Singleton metaclass/base-class which supported a @singletonmethod decorator on methods. This would behave much like @classmethod except the resulting method would always be called on the singleton instance, regardless of whether it's called on the class or an instance.

But the wrinkle came in making this work for Singletons that inherit from non-Singleton abstract base classes...

There are probably all kinds of reasons this should not be done, and I wouldn't be surprised if the typechecker shouldn't be allowing this...

The (frankly somewhat evil) code looks something like this:

```py
class _SingletonMeta(ABCMeta):
    _instances: dict[type, Any] = {}

    def __call__(cls, *args: Any, **kwargs: Any):
        if cls not in cls._instances:
            cls._instances[cls] = super(_SingletonMeta, cls).__call__(*args, **kwargs)
        return cls._instances[cls]

class Singleton(metaclass=_SingletonMeta):
    @classmethod
    def get_instance(cls: type[Singleton]) -> Singleton:
        return cls()

class singletonmethod[T: Singleton, **P, R]:
    """Descriptor for a method which, when called on the class, delegates to the singleton instance."""
    method: Callable[Concatenate[T, P], R]
    instance: T | None

    def __init__(self, method: Callable[Concatenate[T, P], R], instance: T | None = None):
        self.method = method
        self.instance = instance

    def __get__(self, instance: T | None, owner: type[T]) -> Callable[P, R]:
        instance = instance or owner.get_instance()
        callable = self.__class__(self.method, instance)
        functools.update_wrapper(callable, self.method)
        return callable

    def __call__(self, *args: P.args, **kwargs: P.kwargs) -> R:
        if self.instance is None:
            raise RuntimeError("singletonmethod instance is not set")
        return self.method(self.instance, *args, **kwargs)

    @property
    def __isabstractmethod__(self):
        return getattr(self.method, '__isabstractmethod__', False)

class instancemethod[T, **P, R]:
    # Matching descriptor/callable as singletonmethod, but
    # identical to boundmethod in the original post, more or less
    def __get__(self, instance: T | None, owner: type[T]) -> Callable[P, R]: ...
    def __call__(self, *args: P.args, **kwargs: P.kwargs) -> R: ...
```

And then the following typechecks and runs without issue:

```py
class Base(ABC):
    foo: Callable[[], str]

class DerivedNormal(Base):
    def _foo(self) -> str:
        return "foo"

    foo = instancemethod(_foo)

class DerivedSingleton(Base, Singleton):
    def _foo(self) -> str:
        return "foo"

    foo = singletonmethod(_foo)

def test_do_evil_deeds():
    assert "foo" == DerivedNormal().foo()  # notably, DerivedNormal.foo() is now a runtime-only error (!!)
    assert "foo" == DerivedSingleton.foo() == DerivedSingleton().foo()
```

My kingdom to avoid a # pyright: ignore[reportIncompatibleMethodOverride]! 😅

erictraut commented 1 month ago

You're wandering into very murky areas of the typing spec here.

A few observations:

  1. In class BaseB, you're declaring foo as an instance variable. I suspect that you intended to declare it as a ClassVar? A def statement (which you use to define foo in DerivedA) describes a class variable, not an instance variable. So if you want these two cases to be even close to equivalent, you'd need to at least mark BaseB.foo as a ClassVar.
  2. The typing spec currently doesn't define the rules for override consistency checks. For example, can a class variable be overridden by an instance variable or vice versa? This is one of many issues on a list that I'd like to eventually specify and standardize. For now, it's left up to each type checker to decide on the rules they want to follow.
  3. The typing spec is also currently murky on how to distinguish a "pure instance variable" from a "pure class variable" from a "regular class variable" that can be shadowed by an instance variable of the same type. Here is how pyright treats them currently, but future updates to the typing spec could redefine how this works.
  4. The typing spec also doesn't currently define most of the rules regarding instance methods, class methods, and static methods and how they should be interpreted when passed through a ParamSpec. This is another thing I'd like to see specified.
  5. The typing spec also doesn't currently describe how a Callable variable in a Protocol or a non-protocol class (whether it's defined as a ClassVar or not) should be treated. These are tricky because they are treated differently depending on whether they are bound to a class or a class instance (an object). This is another thing that I'd like to see specified at some point in the future.
  6. Looking at mypy's behavior on your code sample, it does not generate an "incompatible override" error for DerivedA.foo or DerivedB2.foo, but it does generate one for DerivedB1.foo. As you've noted, pyright generates an error for both DerivedA.foo and DerivedB1.foo but not for DerivedB2.foo. The two type checkers differ in how they behave here, which is not surprising given the lack of specification.
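Point 5 can be seen even without custom descriptors. A small sketch (the names here are illustrative) of why a Callable variable behaves differently depending on whether it's bound to the class or to an instance:

```python
# A plain function assigned in a class body is itself a descriptor,
# so a "Callable class variable" changes shape depending on access.
def f(x: object) -> str:
    return "called"

class C:
    foo = f  # intended as a Callable[[object], str] class variable

assert C.foo is f             # class access yields the raw function
assert C().foo() == "called"  # instance access yields a bound method;
                              # the instance is absorbed as `x`
```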

In general, if you start to intermix Callable variables and def statements, you're going to see some oddities. I recommend avoiding this if possible. This gets especially hairy if you throw ParamSpec-based decorators into the mix.

Unless and until some of the above gray areas are clarified in the typing spec, I'm reluctant to change pyright's logic in these areas. Doing so would likely just create churn for existing pyright users, and the logic would likely need to change again once the typing spec is updated.

I'm therefore going to close this issue for the time being. I'm willing to revisit it if and when the above rules are clarified in the typing spec. If you would like to participate in that spec'ing process, the Python typing forum is a good place to start. This page provides information about the process we use to consider changes and additions to the typing spec.

DylanLukes commented 1 month ago

Thank you for the detailed reply. It confirms my suspicions but also offers some food for thought.

I'm reluctant to change the logic in pyright in these areas.

Yeah, absolutely. I'm clearly wandering off the well-charted areas of the map here so locking anything in seems unwise.

The typing spec also doesn't currently define most of the rules regarding instance methods, class methods, and static methods and how they should be interpreted when passed through a ParamSpec. This is another thing I'd like to see specified.

I wonder if it might (eventually) be possible to formalize a set of descriptor protocols and then, if not make absolute decisions, at least formalize which component specifications are included. That is, to say "this is the protocol for a descriptor representing a class/instance variable/method" and then be able to specify the behavior of the type checker in terms of those primitives. For example: absent an explicit ClassVar annotation, a field of type T is implicitly "promoted" to ClassVar[T] | InstanceVar[T], where both types have overlapping subsets of overloads on __get__/__set__/__delete__, or something like that.

Maybe I'm off base here, but it seems like the kind of behavior I was trying to achieve would be more achievable if such descriptor protocols could be explicitly stated in Protocols. It might also lead to some clarity on issues such as "when can a field implement a protocol property, or vice versa?", since while the two look the same in many use cases, they differ in the behavior of the underlying descriptors. For example, it would help if I could, in a type-safe manner, specify "I really just want a thing that I can __get__ with this type when accessed on an instance, that type when accessed on the class, and some other type when accessed independently through __dict__ or the C API."[1]

I might also just be a bit out of my depth in the above. But, it seems to me that there ought to be some sort of formalization that works, though it's not clear to me if the intersection of the semantics of that formalization and subtyping in Python are easily reconcilable (my guess is they're not).

The discussion about PEP 746 and __supports_type__ seems potentially relevant as it pertains to Callable types (or even MethodType and its ilk).

[1] One obstacle I've run into here is that it's not possible to use Concatenate with just the P.args of a ParamSpec P, so one can't write an overload for a method that might take a self parameter (Concatenate[T, P]) or not (just P). If this were possible, it might be feasible to type callable descriptors like the ones above with different __call__ signatures depending on their usage.

erictraut commented 1 month ago

I think these ideas are worth exploring. However, the Python typing forum would be a better place to have this discussion. This will give it the attention and visibility of the full Python typing community.