python / mypy

Optional static typing for Python
https://www.mypy-lang.org/
Other
18.22k stars 2.78k forks source link

binder is too optimistic about refining attribute types in methods #4168

Open carljm opened 6 years ago

carljm commented 6 years ago

This example typechecks in strict-optional mode, but it is not type safe:

from typing import Optional 

class Foo: 
    bar: Optional[int] 

    def thing(self) -> int: 
        if self.bar is None: 
            return 0 
        self.set_bar_to_none() 
        return self.bar 

    def set_bar_to_none(self) -> None: 
        self.bar = None 
elazarg commented 6 years ago

Looks like a deep problem. I assume there's a similar problem with any "literal" that is not a local variable.

gvanrossum commented 6 years ago

The binder should probably assume that any method call on an object might modify any attributes of that object (within their declared types).

carljm commented 6 years ago

I don't know what the desired balance is between paranoid type-safety vs probably-correct-in-most-cases, but only watching for direct method calls on the object is not fully safe either, set_bar_to_none could just as well be a standalone function that takes a Foo instance.

Next step would be to consider any method call, or other call taking self, as unsafe. But even this probably leaves loopholes where there might be other references available to the current instance. Ultimately probably any intervening call is not safe.

JukkaL commented 6 years ago

This is basically the binder working as designed. As @carljm explained well, doing this safely would have to be overly restrictive.

The realistic alternatives we have is to either only use the binder for local variables, or use the current, a-little-too-optimistic approach. I think that here convenience wins over absolute safety. I suspect that if we'd restrict the binder to local variables only, many users would use (unsafe) casts instead of refactoring their code to safely conform to this restriction, actually resulting in more unsafety than we have now.

carljm commented 6 years ago

I suppose in theory (given someone who cares enough to implement it) this is the kind of behavior that could be selected by a strictness flag? On a brand-new project I could imagine preferring to opt in to the strict behavior. Typically the workaround is just bar = self.bar and use the local variable, which isn't too onerous.

JukkaL commented 6 years ago

Yeah, we could use a strictness flag for this -- but this doesn't feel like a high priority feature. Let's keep the issue open to see if there seems to be additional interest in this.

ilevkivskyi commented 6 years ago

Just a minor stylistic comment. What happens here (the side effect) is typically absent in languages with strict typing discipline (i.e. they typically only allow pure functions). This is not the case for Python, but I think it will not be a surprise to see such behaviour in relation with static types.

Also related https://github.com/python/mypy/issues/4019.

msullivan commented 6 years ago

This can happen with bindings for local variables as well, using nested functions:

def bait(x: Optional[int]) -> int:
    def switch() -> None:
        nonlocal x
        x = None

    if x is None:
        return -1
    else:
        switch()
        return x
msullivan commented 6 years ago

What happens here (the side effect) is typically absent in languages with strict typing discipline (i.e. they typically only allow pure functions). This is not the case for Python, but I think it will not be a surprise to see such behavior in relation with static types.

There are plenty of languages with strict static typing discipline that have side effects (OCaml, Rust, Scala are ones with "interesting" type systems, but also Java and C# and the like), but they typically don't support refining the types of variables based on runtime checks. Since this sort of refinement is hard to typecheck well and because in many cases a statically typed language wouldn't have runtime type information anyways, if they provide anything for this sort of use-case, it is often tagged union datatypes (as in OCaml, Rust, Scala), which fit more neatly into a statically typed world.

This sort of control-flow-based typed refinement is I think mostly seen when retrofitting types onto dynamically typed languages, so as to model existing patterns of use.

ilevkivskyi commented 5 years ago

Another example (that appeared with literal types) is:

class Result(Enum):
    good = 1
    bad = 2

def check(r: Literal[Result.good]) -> None:
    ...
def fail(t: Test) -> None:
    t.result = Result.bad

class Test:
    result: Result

t: Test
assert t.result is Result.good
fail(t)  # This invalidates the type restriction
check(t.result)  # This currently passes but should fail

Note that this can also (rarely) cause false positives with --strict-equality. In view of plans to refactor the binder I raise priority to normal.

cc @Michael0x2a

ilevkivskyi commented 4 years ago

Note, the same considerations apply to index expressions, see https://github.com/python/mypy/issues/8027