microsoft / pyright

Static Type Checker for Python
Other
13.3k stars 1.45k forks source link

Incorrect narrowing in `isinstance` check of union of protocol and parametrized concrete type #8907

Closed qexat closed 1 month ago

qexat commented 1 month ago

Disclaimer-ish, I think this is a bug, but I'm not sure if it is a coincidence that MyPy also shares it with Pyright, or the consequence of something incorrect/under-specified in the Typing Spec.

Description

Given a variable v which type is the union of a protocol P (e.g. SupportsIndex) and a parametrized concrete type C[T] (e.g. tuple, list), on an instance check isinstance(v, C), v gets incorrectly narrowed to C[T] inside the branch, which can lead to a contradiction - a more concrete explanation is following.

Explanation

For the following explanation, typing is assumed to be imported.

Let's define a function foo which has a parameter value which type is the union of SupportsIndex (the protocol) and tuple[str, str] (the parametrized concrete type). Note that the tuple type parameters do not have to be concrete themselves, they can also be protocols.

def foo(value: typing.SupportsIndex | tuple[str, str]) -> None:
    if isinstance(value, tuple):
        typing.reveal_type(value)  # Pyright says: "tuple[str, str]"

Expected

def foo(value: typing.SupportsIndex | tuple[str, str]) -> None:
    if isinstance(value, tuple):
        typing.reveal_type(value)  # Pyright should say: "tuple[str, str] | tuple[Unknown, ...]" (in my opinion)

Why? Let's define the following class:

class MySillyTuple(tuple[int, int]):
    def __index__(self) -> int:
        return sum(self)

An instance of MySillyTuple can be passed to foo (as it satisfies typing.SupportsIndex) and will also satisfy isinstance(value, tuple) as it subclasses tuple. However, the contradiction with Pyright's current behavior lies in the fact that it is not a tuple[str, str].

The runtime obviously confirms this difference. If typing.reveal_type is replaced with print, then foo(MySillyTuple((3, 5)) will print (3, 5).

VS Code extension or command-line Tested with Pylance-vendored Pyright (1.1.373) and with the CLI (1.1.379).

:warning: MyPy is also subject to this.

erictraut commented 1 month ago

Pyright is working as intended here, so I don't consider this a bug.

The typing spec doesn't dictate how and when type checkers should narrow types. Each type checker applies various rules and heuristics that attempt to balance type safety with usability. Your code sample demonstrates a case where both pyright and mypy have chosen to opt for usability over theoretical correctness.

From a type theory standpoint, your analysis is correct. The theoretically correct type would be tuple[Unknown, ...] | tuple[str, str] which is equivalent to tuple[Unknown, ...]. However, that's not what most developers would expect in this case, and it's probably not what they intend. Optimizing for theoretical correctness tends to produce many false positives in typical python code. Deviating from theoretical correctness improves usability but can produce false negatives, as you've shown. Type checkers try to strike a good balance between the two extremes.

You might be interested in these slides, which I presented at a Typing Meetup last year. It shows some of the practical tradeoffs that type checkers make when implementing narrowing for isinstance checks.