microsoft / pyright

Static Type Checker for Python
Other
13.12k stars 1.4k forks source link

Unsoundness with protocols for polymorphic functions #8685

Closed jonathan-laurent closed 1 month ago

jonathan-laurent commented 1 month ago

Problem

In the example below, the second call to apply_twice causes a runtime error but it is accepted by pyright 1.1.375. In contrast, it is rejected by pyright 1.1.360 and by mypy 1.11.1 (using the alternative version below that does not use PEP 695 features).

from typing import Protocol

class PolyIdentity(Protocol):
    def __call__[T](self, arg: T, /) -> T: ...

def apply_twice(f: PolyIdentity, /) -> tuple[int, str]:
    return f(1), f("a")

def id[T](x: T) -> T:
    return x

def incr(x: int) -> int:
    return x + 1

apply_twice(id)
apply_twice(incr)  # no error with pyright 1.1.375 (but error with 1.1.360)
Version without PEP 695 features ```python from typing import Protocol, TypeVar T = TypeVar("T") class PolyIdentity(Protocol): def __call__(self, arg: T, /) -> T: ... def apply_twice(f: PolyIdentity, /) -> tuple[int, str]: return f(1), f("a") def id(x: T) -> T: return x def incr(x: int) -> int: return x + 1 apply_twice(id) apply_twice(incr) # mypy error ```

Comments

I am aware that this part of the typing spec is currently ambiguous and undergoing some work, so maybe this issue should be seen as asking a question rather than reporting a bug.

In my opinion, the most natural semantics for the PolyIdentity protocol above (by far) is to describe a polymorphic function of one argument. Thus, id is an acceptable implementation but incr is not since it can only accept integers. Having such a semantics would bring some form of rank-2 polymorphism to the language.

So my questions would be:

erictraut commented 1 month ago

You're asking a good question. This would be good to clarify in the spec, probably in the protocols chapter.

Consider these two callback protocols:

class Proto1(Protocol):
    def __call__[T](self, arg: T, /) -> T: ...

class Proto2[T](Protocol):
    def __call__(self, arg: T, /) -> T: ...

The first uses a method-scoped type variable, and the second uses a class-scoped type variable. Should incr in your code sample be compatible with both of these callback protocols? Mypy says that incr is compatible with Proto2 but not Proto1. Pyright says that incr is compatible with both. The spec doesn't say one way or another. I'm not sure if this was intentional on the part of mypy or an unintended byproduct of its implementation. As you've shown, there is utility in being able to distinguish between these two cases, so it probably makes sense to formally specify mypy's behavior here. I think it makes sense to have a discussion on this topic in the Python typing forum. If there is consensus, we can then update the typing spec accordingly. Would you like to get that discussion going?

jonathan-laurent commented 1 month ago

Thanks for your answer! I will start a discussion on the typing forum.

Edit: the post is here.

erictraut commented 1 month ago

After listening to the feedback and thinking about this more, I'm going to move ahead with the proposed change. I'd still like to see this formalized in the typing spec, but it appears that there's general agreement that it's the right direction. I also think it makes sense for pyright to match mypy's current behavior in this case.

This will be included in the next release of pyright.

erictraut commented 1 month ago

This is addressed in pyright 1.1.376.