microsoft / pyright

Static Type Checker for Python
Other
13.26k stars 1.44k forks source link

Incorrect(?) type inference in list comprehension refering to an outer variable #7892

Closed tomasr8 closed 5 months ago

tomasr8 commented 5 months ago

First of all, thanks for this great tool! I just started using it recently and it's been great so far :)

Describe the bug

I have this simplified code:

from dataclasses import dataclass

@dataclass
class Container:
    keys: list[str | None]

stuff = [Container(["foo", "bar"]), Container([None, "baz"])]

for i in range(2):
    keys = [v.keys[i] for v in stuff if isinstance(v.keys[i], str)]
    reveal_type(keys)  # -> list[str | None]
    keys = [key for v in stuff if isinstance(key := v.keys[i], str)]
    reveal_type(keys)  # -> list[str]

I'm trying to use isinstance to narrow the type of keys to be list[str] by excluding None from the union in Container.keys. However, in the first list comprehension, the revealed type remains list[str | None]. The revealed type of the the second comprehension where I explicitly bind to a local variable using := is list[str] as expected.

I would expect the first comprehension to also have the type list[str]. Is this expected behaviour? Maybe it has something to do with the fact that i could theoretically change between the evaluation of isinstance() and v.keys[i]?

command-line

pyright version 1.1.362

Perhaps related issues: #4103 #6067

erictraut commented 5 months ago

I'm glad to hear you're finding pyright useful.

Pyright's behavior here is correct, so I don't consider this a bug.

The isinstance type guard narrows the type of the expression passed as the first argument. In your first list comprehension, you're narrowing the expression v.keys[i]. It doesn't narrow the expression v in this case, so the type of v remains Container, and the type of v.keys[i] is str | None.

In your second list comprehension, isinstance is narrowing the type of key from str | None to just str.

For more details about type guards, refer to this documentation.

tomasr8 commented 5 months ago

Ok, got it. Thanks for the explanation!

erictraut commented 5 months ago

One small correction to my previous explanation... I mentioned that in the first list comprehension, pyright is narrowing the expression v.keys[i]. This isn't correct because the expression involves a dynamic subscript. Pyright narrows index expressions only in the case where the subscript is a single integer literal, like v.keys[1]. It's generally unsafe to narrow expressions with dynamic (variable) subscripts. This limitation exists in other static type checkers like mypy as well.

By assigning the expression v.keys[i] to a temporary variable key, you avoid this issue because the value of key cannot change from where it is assigned to where it is consumed, so it is safe to apply type narrowing in this case.

tomasr8 commented 5 months ago

By assigning the expression v.keys[i] to a temporary variable key, you avoid this issue because the value of key cannot change from where it is assigned to where it is consumed, so it is safe to apply type narrowing in this case.

This was my original assumption as well. Thanks again for the explanation!