python / mypy

Optional static typing for Python
https://www.mypy-lang.org/
Other
18.36k stars 2.81k forks source link

Discrepancy between `if cond: a / else: b` and `a if cond else b` #12036

Open bwo opened 2 years ago

bwo commented 2 years ago

mypy (v 0.931) treats if: / else: blocks differently from inline a if c else b expressions.

To Reproduce

Consider the following functions, which are incorrectly typed given the actual implementations:

from typing import Optional, Union, Text

def ensure_unicode1(t: Optional[Union[str, Text]]) -> Optional[Text]:
    if t is None:
        return None
    elif isinstance(t, bytes):
        return t.decode('utf-8')
    return t

def ensure_unicode2(t: Optional[Union[str, Text]]) -> Optional[Text]:
    if t is None:
        return None
    return t.decode('utf-8') if isinstance(t, bytes) else t   # <--- line 15

The point isn't that the types and the implementations don't really make sense (at least, not under Python 3); the point is that this happens when you run mypy (v 0.931) against them:

foo.py:15: error: "str" has no attribute "decode"; maybe "encode"?                                                                                                         
Found 1 error in 1 file (checked 1 source file)  

Expected Behavior

Either mypy would report an issue in both functions or (preferably) neither function.

Actual Behavior

It reports an error in only one function.

I'm pretty sure that this is a regression.

A5rocks commented 2 years ago

I guess ternary if does not narrow to <never>. Here's output with --warn-unreachable enabled:

main.py:7: error: Subclass of "str" and "bytes" cannot exist: would have incompatible method signatures
main.py:8: error: Statement is unreachable
main.py:15: error: "str" has no attribute "decode"; maybe "encode"?
main.py:15: error: Subclass of "str" and "bytes" cannot exist: would have incompatible method signatures
Found 4 errors in 1 file (checked 1 source file)

This is because typing.Text is either an alias to bytes or str depending on the python version.

I suppose the general fix is to allow typechecking a range of python versions (I believe this isn't possible atm with mypy's configuration?) and the specific fix is to narrow to unreachable (which then isn't checked) in ternary ifs.

bwo commented 2 years ago

This is because typing.Text is either an alias to bytes or str depending on the python version.

This is not the issue. It is easy to see that in your run, mypy is treating the file as being python 3: for one thing, if you run mypy -2 against it, it immediately bails because the file is syntactically invalid Python 2; for another, under Python 2, line 8 would be reachable. And you get the same output if you pass an explicit --python-version argument specifying some version of python 3.

A5rocks commented 2 years ago

Sorry if I came off a bit aggressive, wrote this up too quickly :P

I believe that is the issue, in this case. This typechecks just fine:

from typing import Optional, Union

Text = Union[str, bytes]

def ensure_unicode1(t: Optional[Union[str, Text]]) -> Optional[Text]:
    if t is None:
        return None
    elif isinstance(t, bytes):
        return t.decode('utf-8')
    return t

def ensure_unicode2(t: Optional[Union[str, Text]]) -> Optional[Text]:
    if t is None:
        return None
    return t.decode('utf-8') if isinstance(t, bytes) else t

I proposed the more general solution (even though it would not help you due to syntax), as I imagine others may run into this and I know I personally want it!

bwo commented 2 years ago

The issue is not "how do I provide the right type for these functions?". The issue is that mypy should not provide divergent output in these two cases.

bwo commented 2 years ago

To make that more concrete, the same divergence is present like this (the Union is not necessary):

from typing import Optional

def ensure_unicode1(t: Optional[str]) -> Optional[str]:
    if t is None:
        return None
    elif isinstance(t, bytes):
        return t.decode('utf-8')
    return t

def ensure_unicode2(t: Optional[str]) -> Optional[str]:
    if t is None:
        return None
    return t.decode('utf-8') if isinstance(t, bytes) else t   # <--- line 15
A5rocks commented 2 years ago

Oh is that how you understood my comment. My bad, I should have written it more clearly. I suppose the most important parts of it are:

I guess ternary if does not narrow to <never>.

And

I suppose [...] the specific fix is to narrow to unreachable (which then isn't checked) in ternary ifs.

Though I somewhat misunderstood your intent too: I didn't notice you wrote "The point isn't that the types and the implementations don't really make sense," oops.

bwo commented 2 years ago

This is definitely a regression: this typechecks without issue in 0.790