microsoft / pyright

Static Type Checker for Python
Other
13.04k stars 1.39k forks source link

Inconsistent type narrowing of union with `Literal` boolean discriminators #8376

Closed RobertCraigie closed 1 month ago

RobertCraigie commented 1 month ago

Describe the bug

from typing import Union
from typing_extensions import Literal

class ParsedMessage:
    invalid: Literal[False]
    content: int

class RefusalMessage:
    invalid: Literal[True]
    content: str

Message = Union[ParsedMessage, RefusalMessage]

def handle(message: Message):
    if message.invalid == True:
        reveal_type(message.content)  # str
    else:
        reveal_type(message.content)  # int

    if message.invalid:
        reveal_type(message.content)  # str | int
    else:
        reveal_type(message.content)  # str | int

I'd expect the types in the second if condition to be the same as the first as message.invalid == True is equivalent to message.invalid in this case.

Additional context

Mypy's behaviour is actually the inverse surprisingly (playground).

script.py:20: note: Revealed type is "Union[builtins.int, builtins.str]"
script.py:22: note: Revealed type is "Union[builtins.int, builtins.str]"
script.py:25: note: Revealed type is "builtins.str"
script.py:27: note: Revealed type is "builtins.int"

VS Code extension or command-line

CLI 1.1.371

erictraut commented 1 month ago

This behavior is expected. For a full list of supported type guards in pyright, refer to this documentation. Discriminated objects are supported only with the pattern x.y == LN where LN is a literal expression or None and x is a type that is distinguished by a field or property with a literal type. You can also use x.y is True, but discriminated objects won't work with a simple truthy/falsy check.

I explored supporting this with the truthy/falsy check, but it measurably slowed type analysis (and therefore completion suggestions, etc.). The list of supported type guard patterns is carefully curated to strike a balance between usability, maintainabilitiy, and performance.

I think it's a bit unusual to use a bool literal to discriminate between object types. More typical would be a str, int, or enum type that scales to more than just two variants. In those cases, a truthy/falsy check wouldn't make sense.

I'm going to decline this enhancement request for now, but I'm willing to reconsider it in the future if there's sufficient demand for this pattern. It's possible that I could mitigate the performance impact, but that would require some deeper investigation.

RobertCraigie commented 1 month ago

Thanks for looking into this!

Out of curiosity did you check what the performance impact was for just supporting bool for truthiness type narrowing?