python / mypy

Optional static typing for Python
https://www.mypy-lang.org/
Other
17.87k stars 2.74k forks source link

segfault on recursive generic protocols of certain form #17326

Open randolf-scholz opened 1 month ago

randolf-scholz commented 1 month ago

Crash Report

Not sure how to title this, the following example produces a segfault, I simplified it from more complicated code. It seems to be multicausal, removing the Literal makes it disappear, so does removing the __or__ operator on Transform.

To Reproduce

https://mypy-play.net/?mypy=latest&python=3.12&gist=2409c60fc1d5734569f6ab55344a85f0

from typing import Any, Literal, Protocol, TypeVar, overload

X = TypeVar("X")
Y = TypeVar("Y")
Z = TypeVar("Z")
X2 = TypeVar("X2")
Y2 = TypeVar("Y2")

class Transform(Protocol[X, Y]):
    def __or__(
        self, other: "Transform[X2, Y2]", /
    ) -> "Transform[tuple[X, X2], tuple[Y, Y2]]": ...

@overload  # 1 arg
def chain_encoders(
    e: Transform[X, Y], /, *, simplify: Literal[True] = ...
) -> Transform[X, Y]: ...
@overload  # ≥2 args
def chain_encoders(
    *es: *tuple[Transform[Any, Y], *tuple[Transform, ...], Transform[X, Any]],
    simplify: Literal[True] = ...,
) -> Transform[X, Y]: ...
def chain_encoders(*encoders: Transform, simplify: bool = True) -> Transform:
    r"""Chain encoders."""
randolf-scholz commented 1 month ago

Maybe related to #17319, since I also do tuple unpacking.

AlexWaygood commented 1 month ago

Here's a slightly more minimal repro that doesn't involve Literal or Any:

from __future__ import annotations
from typing import Protocol, TypeVar, overload

X = TypeVar("X")
Y = TypeVar("Y")
X2 = TypeVar("X2")
Y2 = TypeVar("Y2")

class Transform(Protocol[X, Y]):
    def __or__(self, other: Transform[X2, Y2]) -> Transform[tuple[X, X2], tuple[Y, Y2]]: ...

@overload
def chain_encoders(e: Transform[X, Y], simplify: bool) -> Transform[X, Y]: ...
@overload
def chain_encoders(*es: *tuple[*tuple[Transform, ...], Transform[X, object]]) -> Transform[X, Y]: ...

The cause is a recursion error; recursion errors cause segfaults if the code has been compiled with mypyc. Here's the last part of the traceback if you use an uncompiled version of mypy from the master branch:

  File "/Users/alexw/dev/mypy/mypy/subtypes.py", line 180, in is_subtype
    return _is_subtype(left, right, subtype_context, proper_subtype=False)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/alexw/dev/mypy/mypy/subtypes.py", line 353, in _is_subtype
    return left.accept(SubtypeVisitor(orig_right, subtype_context, proper_subtype))
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/alexw/dev/mypy/mypy/types.py", line 1966, in accept
    return visitor.visit_callable_type(self)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/alexw/dev/mypy/mypy/subtypes.py", line 711, in visit_callable_type
    return is_callable_compatible(
           ^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/alexw/dev/mypy/mypy/subtypes.py", line 1550, in is_callable_compatible
    if not ignore_return and not is_compat_return(left.ret_type, right.ret_type):
                                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/alexw/dev/mypy/mypy/subtypes.py", line 415, in _is_subtype
    return is_subtype(left, right, subtype_context=self.subtype_context)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/alexw/dev/mypy/mypy/subtypes.py", line 180, in is_subtype
    return _is_subtype(left, right, subtype_context, proper_subtype=False)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/alexw/dev/mypy/mypy/subtypes.py", line 353, in _is_subtype
    return left.accept(SubtypeVisitor(orig_right, subtype_context, proper_subtype))
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/alexw/dev/mypy/mypy/types.py", line 1422, in accept
    return visitor.visit_instance(self)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/alexw/dev/mypy/mypy/subtypes.py", line 607, in visit_instance
    if right.type.is_protocol and is_protocol_implementation(
                                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/alexw/dev/mypy/mypy/subtypes.py", line 1147, in is_protocol_implementation
    if l == left and r == right:
                     ^^^^^^^^^^
  File "/Users/alexw/dev/mypy/mypy/types.py", line 1434, in __eq__
    and self.args == other.args
        ^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/alexw/dev/mypy/mypy/types.py", line 2406, in __eq__
    return self.items == other.items and self.partial_fallback == other.partial_fallback
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/alexw/dev/mypy/mypy/types.py", line 2406, in __eq__
    return self.items == other.items and self.partial_fallback == other.partial_fallback
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/alexw/dev/mypy/mypy/types.py", line 2406, in __eq__
    return self.items == other.items and self.partial_fallback == other.partial_fallback
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  [Previous line repeated 370 more times]
RecursionError: maximum recursion depth exceeded
AlexWaygood commented 1 month ago

An appropriate issue title might be something like "segfault on recursive protocol used in overloaded function that also uses PEP-646"

ilevkivskyi commented 1 month ago

This has actually nothing to do with overloads or tuples. The real problem is diverging recursive protocols. These are very similar to diverging recursive aliases (that are plain prohibited at semanal level), even though they (theoretically) may have real use cases (e.g. an exercise: represent a non-mixed union like list[T] | list[list[T]] | list[list[list[T]]] | ...).

Sorry for a detour, the real minimal repro is:

[case testDivergingProtocol]
from typing import Protocol, TypeVar, List

T = TypeVar("T")
class P(Protocol[T]):
    def meth(self) -> P[List[T]]: ...

class C:
    def meth(self) -> C: ...

x: P = C()
[builtins fixtures/tuple.pyi]

FWIW generic types like type A[T] = <something with A[Other[T]]>, where Other is an arbitrary non-identity (technically maybe non-linear, like not representable as a union with T) type constructor, are highly problematic. And not just in mypy, to the best of my knowledge, the only existing "algorithm" to handle them is finite nesting expansion.

TBH I am not sure what to do. We may just prohibit such protocols, and require them to be nominal classes.

randolf-scholz commented 4 weeks ago

For context, the __or__ method here is simply the binary Cartesian product of functions. All I am saying here is if one has a Transform $f$ from $X$ to $Y$ and a Transform $g$ from $X_2 → Y_2$, then f | g (as notational substitute for $f×g$) is a Transform from $(X×X_2)$ to $(Y×Y_2)$.

randolf-scholz commented 4 weeks ago
class P(Protocol[T]):
    def meth(self) -> P[List[T]]: ...

class C:
    def meth(self) -> C: ...

Hm, I think I see the issue, when trying to test C <: P[T], the condition on .meth leads to C <: P[T] ⟺ C <: P[list[T]] ⟺ C <: P[list[list[T]]] ⟺ ..., an infinite regress. However, there are no additional conditions here, so shouldn't T=Unknown be a trivial solution? That's what pyright reports: Code sample in pyright playground

from typing import Protocol

class P[T](Protocol):
    def meth(self) -> "P[list[T]]": ...

class C:
    def meth(self) -> "C":
        return self

def as_p[T](x: P[T]) -> P[T]:
    return x

reveal_type(as_p(C()))  # P[Unknown]
ilevkivskyi commented 4 weeks ago

That's what pyright reports

Well, what do you think

export const maxTypeRecursionCount = 20;

mean?