microsoft / pyright

Static Type Checker for Python

In some cases, `"__getitem__" method not defined on type "UnionType"` #8319

Closed wch closed 1 week ago

wch commented 1 week ago

Here's some example code that is OK:

from typing import TypeVar

T = TypeVar("T")

ListOrTuple = list[T] | tuple[T]

ListOrTupleStr = ListOrTuple[str]

def f(x: ListOrTupleStr) -> None:
    print(ListOrTupleStr)

f([])

And here is an example where pyright 1.1.370 reports errors:

from typing import TypeVar

T = TypeVar("T")

ListOrTuple = list[T] | tuple[T]

ListOrTuple[str]   # Pyright error

def g(x: ListOrTuple[str]) -> None:
    print(ListOrTuple[str])   # Pyright error

g([])
❯ pyright test.py
/Users/winston/test.py
  /Users/winston/test.py:8:1 - error: "__getitem__" method not defined on type "UnionType" (reportIndexIssue)
  /Users/winston/test.py:12:11 - error: "__getitem__" method not defined on type "UnionType" (reportIndexIssue)
  /Users/winston/test.py:12:11 - error: Argument type is unknown
    Argument corresponds to parameter "values" in function "print" (reportUnknownArgumentType)
3 errors, 0 warnings, 0 informations 

Both versions of the code run in Python 3.12 without error.

I know that it's strange to evaluate `ListOrTuple[str]` at run time, but we have some existing code that did that, and pyright was OK with it before 1.1.370. Also, pyright seems to think it's OK as long as the value is saved in a variable.

erictraut commented 1 week ago

Pyright's new behavior is correct. The typing spec has recently been updated to indicate how and where type checkers should use "type expression" versus "value expression" rules. In your top example, the statement `ListOrTupleStr = ListOrTuple[str]` is a valid (old-style) type alias definition, and the RHS is evaluated as a type expression. In your second code sample, the statement `ListOrTuple[str]` is a plain old value expression, and it's treated as such. The subexpression `ListOrTuple` in this case is evaluated as its runtime type of `UnionType`.

wch commented 1 week ago

Thanks for that explanation. That part makes sense to me now.

However, the fact that the code runs means that there actually is a `__getitem__` method at run time. But on the other hand, in Typeshed's `stdlib/types.pyi`, there is not a `__getitem__` method listed there.

https://github.com/python/typeshed/blob/99c1b7102a7934fd34f7c44f545cb8fd5e6dddab/stdlib/types.pyi#L618-L624

Does this mean that when the code is evaluated as a value expression, it uses that Typeshed stub, but when it's evaluated as a type expression, it uses something else?

erictraut commented 1 week ago

Yeah, `UnionType` does have a `__getitem__` method (contrary to what the typeshed stub indicates), but `__getitem__` raises an exception at runtime if the individual subtypes are not generic types parameterized by type variables.

from typing import TypeVar
T = TypeVar("T")

(list[T] | set[T])[0] # No runtime error

(list | set)[0] # Runtime error

So it's understandable why the typeshed definition for `UnionType` doesn't declare a `__getitem__`. You could try to make the case to the typeshed maintainers that a `__getitem__` method should be added.

If PEP 747 is approved in its current form (or something similar), then this will be moot because the draft PEP proposes new rules for how such expressions should be evaluated by type checkers.