python / mypy

Optional static typing for Python
https://www.mypy-lang.org/
Other
18.23k stars 2.79k forks source link

Binder should reason about complex expressions #2199

Open gvanrossum opened 7 years ago

gvanrossum commented 7 years ago

Here's a very simple example:

from typing import *
a = ['x', None]  # type: List[Optional[str]]
for i in range(len(a)):
    if a[i]:
        print(i, '<' + a[i] + '>')

We get an error on the last line:

__tmp__.py:5: error: Unsupported operand types for + ("str" and "Optional[str]")

Of course I can refactor this by using

for i in range(len(a)):
    ai = a[i]
    if ai:
        print(i, '<' + ai + '>')

but that's just busy-work and doesn't necessarily make the code more readable.

(I found this in some mypy test code where the actual expression was p[i].arg so this probably needs to become fairly powerful.)

ddfisher commented 7 years ago

Yeah, this looks like a general binder feature to me.

ddfisher commented 7 years ago

Actually, these things are pretty difficult to do correctly/safely, because you have to worry about e.g.:

from typing import *
a = ['x', None]  # type: List[Optional[str]]
for i in range(len(a)):
    if a[i]:
        a[0] = None  # how do we detect something like this?
        print(i, '<' + a[i] + '>')
gvanrossum commented 7 years ago

I agree it's complicated (and it would be even more complicated if we considered other threads). I'm just pointing out that if we have a lot of code like this (and I've seen a lot of it in Dropbox repos) is all needs to be refactored just to shut up --strict-optional, and that's tedious.

JukkaL commented 7 years ago

The binder already isn't safe, so this wouldn't fundamentally change anything. The binder currently seems to handle a[0] but not a[i]. Adding support for the latter hopefully wouldn't be hard. [These only seem to work with isinstance tests. A test such as if a[0]: doesn't seem to be supported, but I assume that it wouldn't be hard to fix.]

Example of unsafety:

from typing import *
a = ['x', 1]  # type: List[Union[int, str]]

for i in range(len(a)):
    if isinstance(a[0], int):
        a[i] = 'x'
        reveal_type(a[0])  # int, even though it's clearly str
gvanrossum commented 7 years ago

How feasible would it be to make the binder notice such a potentially conflicting assignments and undo what it has learned about a[0]? (I guess this could be made arbitrarily difficult by trying to consider all forms of aliasing, but I'm not asking for that -- just simple cases. I expect that cases like this last example are rare but do occur. I also assume that most cases where an inference about a complex expression is made and then used later cover only a very small area (often just within one test, e.g. if a[i].arg is not None and valid(a[i].arg):).

JukkaL commented 7 years ago

I have no idea. We could perhaps generate a sample of such things from a large corpus and manually analyze them. I doubt we can reason this from first principles.

ilevkivskyi commented 7 years ago

I agree it's complicated (and it would be even more complicated if we considered other threads). I'm just pointing out that if we have a lot of code like this (and I've seen a lot of it in Dropbox repos) is all needs to be refactored just to shut up --strict-optional, and that's tedious.

There is a lot of code like this in mypy itself: if x[i] is not None: ..., if x.attr is not None: ..., and None not in x (for collections). I think it makes sense to raise priority to normal here.