python / mypy

Optional static typing for Python
https://www.mypy-lang.org/
Other
18.57k stars 2.85k forks source link

Overloads in generic classes are extremely slow when using generic self #18022

Open alexdrydew opened 1 month ago

alexdrydew commented 1 month ago

Bug Report

Performance of mypy degrades drastically with the number of type variable constraints when checking overloaded method of generic class with overloaded self.

To Reproduce

from typing import TypeVar, Generic, TypeAlias, overload, Any

T1 = TypeVar('T1', int, str, bool)
T2 = TypeVar('T2', int, str, bool)
T3 = TypeVar('T3', int, str, bool)
T4 = TypeVar('T4', int, str, bool)
T5 = TypeVar('T5', int, str, bool)
T6 = TypeVar('T6', int, str, bool)
T7 = TypeVar('T7', int, str, bool)

class A(Generic[T1, T2, T3, T4, T5, T6, T7]):
    @overload
    def foo(self: A[int, T2, T3, T4, T5, T6, T7]) -> int:
        ...

    @overload
    def foo(self: A[str, T2, T3, T4, T5, T6, T7]) -> str:
        ...

    def foo(self) -> int | str:
        raise NotImplementedError

https://mypy-play.net/?mypy=latest&python=3.12&gist=fd939322936206cbb9b1f45a71fbd9b7

Expected Behavior

Code typechecks in realistic time. For reference pyrights checks such code without significant performance issues

Actual Behavior

mypy-play.net times out. In my experience it takes at least more than 10 minutes to check such code. Interestingly, when converting foo method to an independent function taking A as a parameter typechecking time drops significantly.

Your Environment

python3.11 and python3.12, both mypy 1.12.1 and 1.13

brianschubert commented 1 month ago

Looks related to #14978 (Numpy uses generic self parameters for its operator overloads)

JukkaL commented 1 month ago

It looks like there is some exponential behavior. If I reduce the number of type variables by two, type checking completes in about 6 seconds (using interpreted mypy). Would anybody interested in investigating this? The next step would be to collect a CPU profile and debug the hottest functions to understand what is going on. Once we understand the exponential behavior, a fix would hopefully be relatively easy.

Here is an example that completes type checking in a reasonable time but is still unexpectedly slow:

from typing import TypeVar, Generic, TypeAlias, overload, Any

T1 = TypeVar('T1', int, str, bool)
T2 = TypeVar('T2', int, str, bool)
T3 = TypeVar('T3', int, str, bool)
T4 = TypeVar('T4', int, str, bool)
T5 = TypeVar('T5', int, str, bool)

class A(Generic[T1, T2, T3, T4, T5]):
    @overload
    def foo(self: A[int, T2, T3, T4, T5]) -> int:
        ...

    @overload
    def foo(self: A[str, T2, T3, T4, T5]) -> str:
        ...

    def foo(self) -> int | str:
        raise NotImplementedError
alexdrydew commented 1 month ago

py-spy profiling results with 6 type arguments looks like this:

profile

JukkaL commented 1 month ago

Thanks! Clearly check_overlapping_overloads is a bottleneck. Maybe it has some exponential behavior that we can fix.

JukkaL commented 1 month ago

Based on the profile, my guess is that #14978 is a different issue. This seems related to processing the overload definition, whereas #14978 is related to type checking calls to an overloaded function.