python / mypy

Optional static typing for Python
https://www.mypy-lang.org/
Other
18.35k stars 2.81k forks source link

Type narrowing is lost when a variable is in a function closure. #12719

Closed mrolle45 closed 2 years ago

mrolle45 commented 2 years ago

A variable v with an optional type has its type narrowed by if v is not None: This variable appears in the definition of a nested function. However, it how has the original optional type.

To Reproduce

X.py:

from typing import NamedTuple, Callable, Optional, TypeVar, Tuple, TYPE_CHECKING
from operator import methodcaller

class CapInfo: pass
class Body: pass

class TypesTable:
    class ConvertBase(tuple):
        def __new__(cls, get_cap : Callable[[Body], CapInfo], func : Callable[[Body], str]):
            return tuple.__new__(cls, (get_cap, func))

    class Convert(ConvertBase):
        def __new__(cls, cap_convert_fun: Callable[[str], str],
                body_convert_fun: Optional[Callable[[Body, str], str]] = None):
            get_cap = methodcaller('get_cap', 'name')
            if body_convert_fun is not None:
                if TYPE_CHECKING: reveal_type(body_convert_fun)
                body_convert_fun(Body(), '')
                def func(body: Body) -> str:
                    cap_str = get_cap(body)
                    if TYPE_CHECKING: reveal_type(body_convert_fun)
                    cap: str = body_convert_fun(body, cap_str)
                    return cap
                return super().__new__(cls, get_cap, func)
            else:
                def func2(body: Body) -> str:
                    cap: str = get_cap(body)
                    return cap
                return super().__new__(cls, get_cap, func2)

mypy X.py

Expected Behavior

X.py:17: note: Revealed type is "def (X.Body, builtins.str) -> builtins.str" X.py:21: note: Revealed type is "def (X.Body, builtins.str) -> builtins.str" Success: no issues found in 1 source file

Actual Behavior

X.py:17: note: Revealed type is "def (X.Body, builtins.str) -> builtins.str" X.py:21: note: Revealed type is "Union[def (X.Body, builtins.str) -> builtins.str, None]" X.py:22: error: "None" not callable Found 1 error in 1 file (checked 1 source file)Your Environment

Your Environment

mrolle45 commented 2 years ago

Source of the problem

When the inner FuncDef (for func) is analyzed by the type checker, it ignores information from the outer FuncDef (for __new__). It creates a brand new ConditionalTypeBinder. Thus, the information from the if body_convert_fun is not None: is lost. The Python scoping rules specify that the scope of a function extends to the body of nested functions (except for names also defined in those functions). The scope of a class, does not do so. To mimic this behavior,

I'm not sure about a case where the inner function defines a name which shadows the name in the outer function. I expect this should be OK, since the inner name will be represented by a different node than the outer name, and so it won't be seen in the outer scope's frames. I'll try it out and see.

I will make these changes to my copy of mypy and post the results soon. TTFN

erictraut commented 2 years ago

I think this is a duplicate of https://github.com/python/mypy/issues/2608.

If you're going to try to fix this, it's important to note that captured variables can be modified after they are captured. This is why it's generally unsafe for a type checker to assume that a narrowed type is applicable within the inner scope. It is safe only in the case that the type checker can prove that captured variable is not modified after it is captured.

In the sample provided at the top of this thread, the narrowed type of body_convert_fun is safe to use in the inner scope because body_convert_fun is not reassigned a value along any code path after it is captured. However, if you added the statement body_convert_fun = None immediately before the return super().__new___ ... statement, it would no longer be safe to use the narrowed type in the inner scope.

I implemented this functionality in pyright, and it works well, but it was a bit tricky to get right. The steps you've outlined above are insufficient to implement this in a type-safe manner.

mrolle45 commented 2 years ago

@erictraut I thought that an outer variable is captured by the inner function if and only if it is not bound in the inner function. This is regardless of whether the binding statement is executed or not. If the name is referenced before it is bound, you get an UnboundNameError. Is this correct? I expect that the semantic analyzer will correctly put the inner name in the FuncDef's names dict with the first node (in lexical order) that binds it, otherwise it will put it in the dict with the defining node in the outer scope. Of course, global and nonlocal declarations will refer to the matching definition, even if the name is bound within the function. Is this correct?

erictraut commented 2 years ago

You are correct that an outer variable is captured by the inner function only if there is no symbol of the same name bound to the inner function. But your sample doesn't demonstrate that case. There is no local variable body_convert_fun bound to the scope of the inner function called func, so func captures the outer variable of that name.

Here's a simplified example.

def outer():
    x = 1
    y = 1

    def inner1():
        # x is captured from the outer scope because there is no
        # local variable x bound to the `inner1` scope.
        print(x)

    def inner2():
        # y is not captured from the outer scope because there
        # is a variable of the same name bound to the `inner2` scope.
        print(y)  # Exception: Unbound local variable
        if False: y = ""

    inner1()
    inner2()

outer()

Here's a demonstration of the issue I described in my previous post.

from typing import Callable
def outer(x: int | str) -> Callable[[], int]:
    if isinstance(x, int):
        def inner1() -> int:
            return x

        # The following assignment means that the narrowed
        # type of x cannot be used within `inner1`.
        x = "gotcha!"
        return inner1
    return lambda: 0

v1 = outer(0)()
print(v1)  # gotcha!
assert isinstance(v1, int)  # AssertionError
JelleZijlstra commented 2 years ago

Agree this is #2608, thanks @erictraut and @mrolle45 for the useful discussion.