Variable becoming "| None" inside a nested loop

bellini666 commented 2 years ago

After updating to 1.1.275, one of my codes started producing an error. Don't know exactly how to describe this, but here is a sample code that reproduces it.

from typing_extensions import Self

def get_str_or_none() -> str | None:
    ...

def test_with_str(x: str | None):
    x = x or ""
    y = get_str_or_none()

    reveal_type(x)
    reveal_type(y)

    if y is not None:
        x += y  # this works correctly

    reveal_type(x)
    reveal_type(y)
    for i in range(10):
        for j in range(10):
            y = get_str_or_none()
            if y is not None:
                reveal_type(x)  # pyright thinks x can be None here, but it can't
                reveal_type(y)
                x = x + y

    reveal_type(x)

class SomeType:
    def __or__(self, other: "SomeType") -> Self:
        ...

def get_some_type_or_none() -> SomeType | None:
    ...

def test_with_some_type(x: SomeType | None):
    x = x or SomeType()
    y = get_some_type_or_none()

    reveal_type(x)
    reveal_type(y)

    if y is not None:
        x |= y  # this works correctly

    reveal_type(x)
    reveal_type(y)
    for i in range(10):
        for j in range(10):
            y = get_some_type_or_none()
            if y is not None:
                reveal_type(x)
                reveal_type(y)
                x |= y  # pyright thinks x can be None here, but it can't

    reveal_type(x)

The pyright output:

❯ pyright test.py
No configuration file found.
No pyproject.toml file found.
stubPath /tmp/typings is not a valid directory.
Assuming Python platform Linux
Searching for source files
Found 1 source file
pyright 1.1.275
/tmp/test.py
  /tmp/test.py:12:17 - information: Type of "x" is "str"
  /tmp/test.py:13:17 - information: Type of "y" is "str | None"
  /tmp/test.py:18:17 - information: Type of "x" is "str"
  /tmp/test.py:19:17 - information: Type of "y" is "str | None"
  /tmp/test.py:24:29 - information: Type of "x" is "str | None"
  /tmp/test.py:25:29 - information: Type of "y" is "str"
  /tmp/test.py:26:17 - error: Operator "+=" not supported for types "str | None" and "str"
    Operator "+" not supported for types "None" and "str" (reportGeneralTypeIssues)
  /tmp/test.py:28:17 - information: Type of "x" is "str | None"
  /tmp/test.py:47:17 - information: Type of "x" is "SomeType"
  /tmp/test.py:48:17 - information: Type of "y" is "SomeType | None"
  /tmp/test.py:53:17 - information: Type of "x" is "SomeType"
  /tmp/test.py:54:17 - information: Type of "y" is "SomeType | None"
  /tmp/test.py:59:29 - information: Type of "x" is "SomeType | None"
  /tmp/test.py:60:29 - information: Type of "y" is "SomeType"
  /tmp/test.py:61:17 - error: Operator "|=" not supported for types "SomeType | None" and "SomeType"
    Operator "|" not supported for types "None" and "SomeType" (reportGeneralTypeIssues)
  /tmp/test.py:63:17 - information: Type of "x" is "SomeType | None"
2 errors, 0 warnings, 14 informations 
Completed in 0.599sec

Some notes: 1) If I remove the | None from the x argument, the error doesn't happen, even though it would not make a difference because of the first line in each func

2) The error starts happening after 2 levels of for loops. A single loop doesn't trigger the issue

The test_with_some_type is a simplified version of what I found in my code. The test_with_str was to test if the issue was specifically with the usage of __or__ with custom classes or if it affected simpler types as well, which it does.

erictraut commented 1 year ago

Here's a more minimal repro:

def func1(x: str | None):
    assert x is not None

    for i in range(10):
        for j in range(10):
                x = x + ""

debonte commented 1 year ago

This is a regression introduced by the fix for #4023. I walk through the differences in behavior before and after that fix below in gory detail.

TL/DR: I believe the root cause of this issue is that when the TypeEvaluator is determining the type of x + "" and calls back into the CodeFlowEngine, getTypeFromLoopFlowNode now returns undefined as the type of x rather than str. This is due to a change in when we evaluate the code flow node for x = x + "" -- which is called Assign@8416[6:17] in the graph below. Returning undefined from getFlowTypeOfReference causes getTypeOfName to fall back to using the annotated type Union[str, None]. Everything goes downhill from there.

At the moment I'm uncertain how to fix this in a way that does not break #4023. But I wanted to document what I know so far in case others have suggestions.

getTypeFromCodeFlow@8415: x[6:21]
Assign@8415[5:13] ─ Loop@8412 ┬ Call@8411[5:18] ─── Assign@8410[4:9] ──────────── Loop@8407 ┬ Call@8406[4:14] ───── True@8404[2:12] ─ Assign@8400[1:11] ─ Start@8399  
                              │                                                             ╰ Circular(Loop@8412)                                           
                              ╰ Assign@8416[6:17] ─ Circular(Assign@8415[5:13])

Here's what I'm seeing: The CodeFlowEngine first follows the "top" path in the graph (i.e. the first antecedents of both loops) and determines that the type of x, after being narrowed by the assert, is str. So far so good.

Then we move on to analyzing the second antecedent of Loop@8407, which is Circular(Loop@8412). At this point the cache entry for Loop@8412 looks like the following because we're still in the process of evaluating its first antecedent (Call@8411[5:18]) and we haven't looked at its second antecedent (Assign@8416[6:17]) at all yet:

type = undefined, isIncomplete = true, generationCount = 2, incompleteSubtypes =
    [ type = undefined, isIncomplete = true, isPending = true, evaluationCount = 0 ] # incompleteSubtype for Call@8411[5:18]

What happens after this point differs based on whether you have the fix for #4023 or not.

Before the fix for #4023, the following events occurred:

getTypeFromLoopFlowNode for Circular(Loop@8412) exits early because (as shown in the cache entry above) at least one of the incompleteSubtypes has isPending = true. Note that after the fix for 4023, we would not exit early here because we don't have incompleteSubtypes for all of the antecedents yet. getTypeFromLoopFlowNode returns type = undefined, isIncomplete = true.

This results in the following cache entry for Loop@8407 and therefore getTypeFromLoopFlowNode for Loop@8407 returns an effectiveType of str.

type = str, isIncomplete = true, incompleteSubtypes = 
[type = str, isIncomplete = false]      # incompleteSubtype for Call@8406[4:14]
[type = undefined, isIncomplete = true] # incompleteSubtype for Circular(Loop@8412)

str is then passed back down the "top" path of the graph and is therefore also the cached type of the first antecedent of Loop@8412, which is Call@8411[5:18] .
We then call getTypeForFlowNode on the second antecedent of Loop@8412 which is Assign@8416[6:17]. Evaluating this assignment's type (via evaluateAssignmentFlowNode) causes us to enter the TypeEvaluator via evaluateTypesForStatement and eventually call back into getTypeForFlowNode for Assign@8415[5:13].

When, within that re-entrant call, we call getTypeFromLoopFlowNode for Loop@8412, we use the following cache entry:

type = str, isIncomplete = true, generationCount = 2, incompleteSubtypes = 
[type = str, isIncomplete = false, isPending = false, evaluationCount = 1]     # incompleteSubtype for Call@8411[5:18]
[type = undefined, isIncomplete = true, isPending = true, evaluationCount = 0] # incompleteSubtype for Assign@8416[6:17]

Since one of the incompleteSubtypes has isPending = true and cacheEntry.type is str, getTypeFromLoopFlowNode returns type = str, isIncomplete = true
str is then passed back through the TypeEvaluator to the FlowFlags.Assignment logic in getTypeFromFlowNode where it is saved as the cache entry on Assign@8416[6:17]

On the 2nd and 3rd times around the getTypeFromLoopFlowNode antecedent loop for Loop@8412, the cache entry now looks like this:

type = str, isIncomplete = true, incompleteSubtypes = 
[type = str, isIncomplete = false] # incompleteSubtype for Call@8411[5:18]
[type = str, isIncomplete = true]  # incompleteSubtype for Assign@8416[6:17]

We exit after hitting maxAttemptCount and the final result is the effectiveType of str.

After the fix for #4023, the events are slightly different:

Recall that when we go to evaluate the second antecedent of Loop@8407 after narrowing x via the assert, the cache entry for Loop@8412 looks like the following because we're still in the process of evaluating its first antecedent (Call@8411[5:18]) and we haven't looked at its second antecedent (Assign@8416[6:17]) at all yet:
```
type = undefined, isIncomplete = true, generationCount = 2, incompleteSubtypes =
[ type = undefined, isIncomplete = true, isPending = true, evaluationCount = 0 ] # incompleteSubtype for Call@8411[5:18]
```
Since we have only one incompleteSubtype but two antecedents, getTypeFromLoopFlowNode for Loop@8412 does not exit early. Instead, it evaluates the antecedents of Loop@8412 which causes us to call getTypeForFlowNode on Assign@8416[6:17] earlier than before.
Again we call into the TypeEvaluator to evaluate the type of the assignment expression, and we eventually call back into the CodeFlowEngine, calling getTypeForFlowNode for Assign@8415[5:13].

We move to its antecedents and call getTypeFromLoopFlowNode for Loop@8412, using the following cache entry. Note that although earlier we narrowed the declared type to str (via the assert), that information has not made it into the incompleteSubtypes for Loop@8412 yet:

type = undefined, isIncomplete = true, generationCount = 2, incompleteSubtypes =
[ type = undefined, isIncomplete = true, isPending = true, evaluationCount = 0 ] # incompleteSubtype for Call@8411[5:18]
[ type = undefined, isIncomplete = true, isPending = true, evaluationCount = 0 ] # incompleteSubtype for Assign@8416[6:17]

Since we have two incompleteSubtypes (one for each antecedent) and at least one has isPending = true, we return early. getTypeFromLoopFlowNode returns type = undefined, isIncomplete = true.
Since we returned type = undefined, getTypeOfName falls back to using the annotated type of x which is Union[str, None]. I'd love to link to that line of code, but TypeEvaluator.ts is too large for GitHub to handle nicely.
Union[str, None], gets passed back through the TypeEvaluator to the FlowFlags.Assignment logic in getTypeFromFlowNode where it is saved as the cache entry on Assign@8416[6:17].

We continue looping over the antecedents of Loop@8412 until hitting maxAttemptCount at which point the effectiveType is Union[str, None] because the cache entry is:

type = Union[str, None], isIncomplete = true, incompleteSubtypes =
[ type = undefined, isIncomplete = true ]        # incompleteSubtype for Call@8411[5:18]
[ type = Union[str, None], isIncomplete = true ] # incompleteSubtype for Assign@8416[6:17]

Recall that we were looking at Loop@8412 because it is the second antecedent of Loop@8407. The cache entry for Loop@8407 now looks like this, resulting in an effective type of Union[str, None] (str combined with Union[str, None]):

type = Union[str, None], isIncomplete = true, incompleteSubtypes =
[ type = str, isIncomplete = false ]              # incompleteSubtype for Call@8406[4:14]
[ type = Union[str, None], isIncomplete = false ] # incompleteSubtype for Circular(Loop@8412)

Union[str, None] is now passed back to Loop@8407 down the "top" path of the code flow graph resulting in the following cache entry for Loop@8412:

type = Union[str, None], isIncomplete = true, incompleteSubtypes =
[ type = Union[str, None], isIncomplete = false ] # incompleteSubtype for Call@8411[5:18]
[ type = Union[str, None], isIncomplete = true ]  # incompleteSubtype for Assign@8416[6:17]

Eventually we hit maxAttemptCount and grab the effectiveType of this data which is, of course, Union[str, None].

rchiodo commented 1 year ago

Could you try both algorithms and return the type that is more constrained between the two? Or does that still break the bug that was fixed in #4023?

debonte commented 1 year ago

Could you try both algorithms...

I'm concerned that that would make the CodeFlowEngine even harder to understand. It's already very complicated.

I haven't dug into #4023, so I'm not sure which of the behavior changes I mentioned above it is relying on. Maybe that's what I should do next.

There are a couple things that confuse me about #4023:

Given that on the return line we know that closest is Calculation | None, it's not clear to me how getTypeFromLoopFlowNode is connected with us evaluating the type of (closest.numerator, closest.denominator). I would think that we'd use the CodeFlowEngine at that point to tell us that closest is not None in the else block. But it wouldn't involve getTypeFromLoopFlowNode.
How is Eric's simplified repro case related to the user's repro case?

erictraut commented 1 year ago

No, we wouldn't use both algorithms. That would require not only duplicate code but also duplicate caches. We should find one algorithm that works.

I'll take a look at the problem when I have time. Thanks Erik for the above analysis and notes.

mehdigmira commented 1 year ago

I have a weird behaviour with nested loops as well:

class A:
    def foo(self):
        ...

def f(fn: Callable | None):
    d: dict[int, A] = {}
    for _ in range(1, 10):
        for _ in range(1, 10):
            if fn is not None:
                key = fn(
                    1
                )  # Object of type "None" cannot be called # but if i comment the `item.foo()` line this passes
                item = d[key]
                item.foo()

Is it the same issue ?

debonte commented 1 year ago

Is it the same issue ?

Yes, this regression was caused by the same change.

When evaluating the type of fn at the call, we see the is not None and attempt to use it for narrowing. However, getTypeFromFlowNode(conditionalFlowNode.antecedent) returns undefined so no narrowing occurs. Before #4023, we got Union[Callable, None] here and narrowed it to Callable.

I added reveal_type(fn) above the fn(1) call, and that gives the expected/narrowed type, but I'm not sure why.

erictraut commented 1 year ago

This will be addressed in the next release.

erictraut commented 1 year ago

This is included in pyright 1.1.286, which I just published. It will also be included in a future version of pylance.

microsoft / pyright

Variable becoming "| None" inside a nested loop #4040

Before the fix for #4023, the following events occurred:

After the fix for #4023, the events are slightly different: