microsoft / pyright

Static Type Checker for Python

Type bound methods to something equivalent to a `Protocol` with `__call__`, `__self__` and `__name__` #2524

Closed Tomaz-Vieira closed 3 years ago

Tomaz-Vieira commented 3 years ago

Is your feature request related to a problem? Please describe.

I would like to be able to type-safely access the bound __self__ object from a bound method.

Describe the solution you'd like

Ideally, accessing a bound method like:

class MyClass:
    def do_something(self, a: int) -> str:
        return "something"

MyClass().do_something

would return something like BoundMethodOf[MyClass], looking something like the following protocol:

from typing import Optional, Protocol, TypeVar

T = TypeVar("T")

class BoundMethodOf(Protocol[T]):
    __self__: T
    __name__: str
    __doc__: Optional[str]

    def __call__(self, a: int) -> str:
        ...

Revealing the type of a function currently does not show a Callable[...], so I imagine that functions are already somewhat special (I don't think it's possible to annotate a variable so that it accepts only actual functions rather than arbitrary callable objects). I was wondering whether they could be made even more special with the extra attributes, without breaking everything else. I understand, though, that this might be infeasible, require cooperation from typeshed, violate some typing PEP, or at least lead to code that type-checks under pyright but not under mypy.

I'm circumventing this right now by using an ugly (and probably unsound) mixture of descriptors and decorators, but it feels like I could simplify a lot if bound methods knew that they belonged to an object.

erictraut commented 3 years ago

I wouldn't want to introduce a custom BoundMethodOf type that appears in text form when revealing the type of a bound method. That would be non-standard and confusing for users.

You can already access the __name__ and __doc__ fields without pyright complaining. That works for any callable. Actually, you can access any field included in the function or object classes defined in builtins.pyi.
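As a quick runtime illustration of the point above, here is a minimal sketch. The `Greeter` class and `shout` function are made-up names, not from this thread. It shows `__name__` and `__doc__` being available on a plain function and on a bound method alike:

```python
class Greeter:
    def greet(self, name: str) -> str:
        """Return a greeting."""
        return f"hello {name}"


def shout(text: str) -> str:
    """Upper-case the text."""
    return text.upper()


bound = Greeter().greet

# Both attributes exist on plain functions and on bound methods.
function_info = (shout.__name__, shout.__doc__)
method_info = (bound.__name__, bound.__doc__)
```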

It sounds like the main functionality you're looking for is the addition of a __self__ field? What about class methods that have been bound to a class? Do they have a __cls__ field?

Tomaz-Vieira commented 3 years ago

> That would be non-standard and confusing for users

I agree. I'm not very familiar with typeshed, but now that you mention builtins.pyi, I see that there's a MethodType in types.pyi, which looks like what I had in mind, except that it types __self__ as object instead of a generic T. Could we use that (and even function), or is it supposed to be internal to type checkers?

> It sounds like the main functionality you're looking for is the addition of a __self__ field?

Pretty much, but I wanted it to be correctly typed to the instance type, and only accessible for bound methods (not for any callable). Here are some situations where I'd want more type awareness:

1. One can access __self__ from non-method functions. This may just be a bug, since I can also access any dunder attribute on any function or method without errors, but ideally there should be no __self__ for standalone functions:

```python3
def myfunction(a: int) -> str:
    return "asd"

x = myfunction.__self__  # no reported error, but fails at runtime
y = myfunction.this_could_be_a_bug  # still no reported error
```

2. I was hoping a method's `__self__` could be type-safe:

```python3
class MyClass:
    def do_something(self) -> int:
        return 123

# `bound_self` is Any instead of MyClass
bound_self = MyClass().do_something.__self__
```

3. I also wanted to be able to express that a function's parameter should be a method from a particular type:

```python3
class Producer:
    def produce(self) -> int:
        return 123

class Consumer:
    # Desired syntax only: MethodType is not subscriptable today
    def __init__(self, source: MethodType[Producer, int]):
        self.producer: Producer = source.__self__
```

> What about class methods that have been bound to a class? Do they have a __cls__ field?

It seems class methods also have a __self__ attribute whose value is the class itself. That seems consistent to me, and the type would be something like MethodType[Type[MyClass]].
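The runtime behaviour described here can be checked with a short sketch. The `create` class method is an assumed addition for illustration, not code from the thread:

```python
class MyClass:
    def do_something(self) -> int:
        return 123

    @classmethod
    def create(cls) -> "MyClass":
        return cls()


instance = MyClass()

# A bound instance method points back at the instance...
instance_owner = instance.do_something.__self__
# ...while a bound class method points at the class itself.
class_owner = MyClass.create.__self__
```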

erictraut commented 3 years ago

You can enable reportFunctionMemberAccess if you want pyright to emit an error for unknown attributes on a function.
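A minimal sketch of what this setting is meant to catch, assuming the rule is switched on with a file-level `# pyright:` comment (it can also be set in the project configuration). The attribute name is deliberately bogus:

```python
# pyright: reportFunctionMemberAccess=true


def myfunction(a: int) -> str:
    return "asd"


try:
    # With the rule enabled, pyright is expected to flag this access;
    # at runtime it raises AttributeError.
    value = myfunction.this_could_be_a_bug
except AttributeError:
    value = None
```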

MethodType isn't appropriate in this case. It's a type that represents the internal representation in the Python interpreter. Typical Python developers would be confused if they saw it show up in hover text or error messages. In general, I tell users that if they ever import a symbol from types.pyi, they're probably doing something wrong. Only tools like mypy and pylint that are written in Python and introspect the internals of the Python AST should use those classes.

I think it's reasonable to special-case the __self__ attribute on a bound function. Pyright already maintains the information about what type is bound to a method, and it already special-cases the __defaults__ attribute for functions, since that attribute is not currently included in the function class in builtins.pyi. Note that if I make this change, it will likely be specific to pyright. If you want this to also work within mypy and other type checkers, you would need to lobby for its inclusion there as well.

I don't see a way to express the function's parameter in an explicit type annotation (like you've attempted to do above with MethodType[Producer, int]). That would require an extension to the Python type system, necessitating a discussion in python/typing or the typing-sig mailing list and likely a new PEP or an extension to an existing PEP.
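Until the type system can express this, one possible workaround is to pass the owner and the method together in a small generic wrapper. The sketch below is only an illustration: `BoundCall` and its fields are hypothetical names, not part of pyright, typeshed, or the standard library, and it only covers methods that take no arguments besides `self`.

```python
from dataclasses import dataclass
from typing import Callable, Generic, TypeVar

S = TypeVar("S")
R = TypeVar("R")


@dataclass(frozen=True)
class BoundCall(Generic[S, R]):
    """Hypothetical helper: an owner plus an unbound zero-argument method."""

    owner: S
    func: Callable[[S], R]

    def __call__(self) -> R:
        # Call the unbound method with the stored owner as `self`.
        return self.func(self.owner)


class Producer:
    def produce(self) -> int:
        return 123


class Consumer:
    def __init__(self, source: BoundCall[Producer, int]) -> None:
        # The owner is typed as Producer by any checker, no special-casing needed.
        self.producer: Producer = source.owner
        self.value: int = source()


consumer = Consumer(BoundCall(Producer(), Producer.produce))
```

The trade-off is that callers must wrap the method explicitly instead of passing `Producer().produce` directly.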

erictraut commented 3 years ago

Support for __self__ will be included in the next release. It generates a reportFunctionMemberAccess error if you try to access it with an unbound method (reflecting the runtime behavior). If the method is bound, the type of the __self__ reflects the type of the object or class that it is bound to.
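The runtime behaviour that this change mirrors can be sketched as follows. The comments about what pyright infers restate the description above and are not verified output:

```python
class MyClass:
    def do_something(self) -> int:
        return 123


obj = MyClass()

# Bound: __self__ is the instance. Per the description above, pyright
# should now treat this as MyClass rather than Any.
bound_self = obj.do_something.__self__

# Unbound: accessing the method through the class yields a plain function,
# which has no __self__ at runtime.
try:
    MyClass.do_something.__self__
    unbound_has_self = True
except AttributeError:
    unbound_has_self = False
```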

Tomaz-Vieira commented 3 years ago

> You can enable reportFunctionMemberAccess if you want pyright to emit an error for unknown attributes on a function.

Indeed, I forgot to add strict. I see that only __self__ is missing now.

> Support for __self__ will be included in the next release.

Awesome!

Will this special-casing of __self__ be enough to match methods with my own Protocol like:

from typing import Protocol, TypeVar

_SELF = TypeVar("_SELF")

class MyMethod(Protocol[_SELF]):
    __self__: _SELF

    def __call__(self, a: int) -> str:
        ...

or would it just make it so that pyright knows that MyClass().do_something.__self__ exists and is of type MyClass?

erictraut commented 3 years ago

No, this won't work for protocol matching. This special-case handling applies only to member access expressions of the form <function>.__self__.
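Since protocol matching is not available, one fallback is to recover a typed owner with a runtime check. This is only a sketch under assumptions: `self_of` is a hypothetical helper, not a pyright or standard-library function, and it trades static guarantees for an `isinstance` check at call time.

```python
from typing import Callable, Type, TypeVar

T = TypeVar("T")


def self_of(method: Callable[..., object], expected: Type[T]) -> T:
    """Hypothetical helper: return method.__self__ if it is an `expected`."""
    owner = getattr(method, "__self__", None)
    if not isinstance(owner, expected):
        raise TypeError(f"{method!r} is not bound to a {expected.__name__}")
    return owner


class MyClass:
    def do_something(self, a: int) -> str:
        return "something"


obj = MyClass()
owner = self_of(obj.do_something, MyClass)  # returns obj, typed as MyClass
```

Passing an unbound function such as `MyClass.do_something` raises `TypeError`, because it has no `__self__` to return.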

Tomaz-Vieira commented 3 years ago

Do you think there is a good way to achieve that? If this is unrepresentable now, then maybe I should just bring it up on the python/typing repo or on the mailing list, as you mentioned.

erictraut commented 3 years ago

This is included in pyright 1.1.185, which I just published. It will also be included in the next release of pylance.