python / mypy

Optional static typing for Python
https://www.mypy-lang.org/

mypy is slow when type checking torch #17919

Open hauntsaninja opened 1 month ago

hauntsaninja commented 1 month ago
λ mypy --version          
mypy 1.11.2 (compiled: yes)

λ uv pip show torch       
Using Python 3.11.8 environment at /Users/shantanu/.virtualenvs/openai-wfht
Name: torch
Version: 2.1.0
Location: /Users/shantanu/.virtualenvs/openai-wfht/lib/python3.11/site-packages
Requires: filelock, fsspec, jinja2, networkx, sympy, typing-extensions
Required-by: ...

λ time mypy -c 'import torch' --no-incremental
Success: no issues found in 1 source file
mypy -c 'import torch' --no-incremental  33.09s user 2.73s system 98% cpu 36.391 total

λ time mypy -c 'import torch'
Success: no issues found in 1 source file
mypy -c 'import torch'  6.24s user 0.88s system 95% cpu 7.454 total

We use a lot of torch at work; performance is probably the biggest reason folks switch to a different type checker.

hauntsaninja commented 1 month ago

If this is accurate, maybe the fscache exception handling is really slowing us down in the mypyc build.

(Attached profiles: one for the mypyc/native build, one for the interpreted build.)
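
To make that hypothesis testable, here is a minimal micro-benchmark sketch. It is illustrative only: the paths and counts are arbitrary and this is not mypy's fscache code. It isolates the cost of a stat() probe that raises and catches OSError versus one that succeeds; running the same pattern interpreted and mypyc-compiled would show whether compiled exception handling is the bottleneck.

import os
import timeit

MISSING = "/tmp/no-such-file-404"   # arbitrary path assumed not to exist
PRESENT = __file__                  # a path known to exist

def probe(path: str) -> bool:
    try:
        os.stat(path)
        return True
    except OSError:  # the raising path is what the fscache hypothesis is about
        return False

n = 200_000
print("missing (raises):", timeit.timeit(lambda: probe(MISSING), number=n))
print("present (no exc):", timeit.timeit(lambda: probe(PRESENT), number=n))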

JukkaL commented 1 month ago

mypy -v produces details about processed files, and this seems important:

LOG:  Processing SCC of size 945 (torch.onnx._globals torch._inductor.exc torch._inductor.runtime.hints torch.utils._traceback torch.utils._sympy.functions ... <long output snipped>

Mypy detects an import cycle with 945 modules.

Overall 1380 files were parsed, so 68% of processed files are in this one SCC. I've seen this pattern in other third-party packages as well -- the majority of the implementation is a single SCC.
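
For context, an SCC (strongly connected component) is a set of modules that are all reachable from each other via imports, so they have to be analyzed together as one unit. Here is a minimal sketch of how such components are found with Tarjan's algorithm; the toy module names and edges are made up, and this is not mypy's build code:

def tarjan_sccs(graph: dict[str, list[str]]) -> list[list[str]]:
    index: dict[str, int] = {}
    low: dict[str, int] = {}
    stack: list[str] = []
    on_stack: set[str] = set()
    sccs: list[list[str]] = []
    counter = 0

    def visit(v: str) -> None:
        nonlocal counter
        index[v] = low[v] = counter
        counter += 1
        stack.append(v)
        on_stack.add(v)
        for w in graph.get(v, []):
            if w not in index:
                visit(w)
                low[v] = min(low[v], low[w])
            elif w in on_stack:
                low[v] = min(low[v], index[w])
        if low[v] == index[v]:  # v roots an SCC: pop its members off the stack
            scc: list[str] = []
            while True:
                w = stack.pop()
                on_stack.discard(w)
                scc.append(w)
                if w == v:
                    break
            sccs.append(scc)

    for v in graph:
        if v not in index:
            visit(v)
    return sccs

# Toy import graph: a single back-edge from torch.nn to torch fuses both
# modules into one SCC, which is the effect seen above at torch scale.
graph = {
    "torch": ["torch.nn", "torch.utils"],
    "torch.nn": ["torch"],
    "torch.utils": [],
}
print(tarjan_sccs(graph))  # [['torch.utils'], ['torch.nn', 'torch']]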

A potential way to make the SCC smaller would be to process imports lazily in third-party modules (this is possible there because errors aren't reported for them). It may be tricky to implement, though; I'll think about it more. A toy sketch of the idea follows.
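
The sketch below illustrates the lazy idea only; every name is hypothetical and nothing here reflects mypy's actual architecture. The point is the shape: register an analysis thunk per module and run it at most once, on the first lookup of a name from that module, rather than eagerly for the whole SCC.

from typing import Any, Callable

class LazyModuleTable:
    def __init__(self) -> None:
        self._thunks: dict[str, Callable[[], dict[str, Any]]] = {}
        self._loaded: dict[str, dict[str, Any]] = {}

    def register(self, name: str, analyze: Callable[[], dict[str, Any]]) -> None:
        self._thunks[name] = analyze

    def lookup(self, module: str, attr: str) -> Any:
        if module not in self._loaded:
            # First reference: pay the analysis cost now, never again.
            self._loaded[module] = self._thunks.pop(module)()
        return self._loaded[module][attr]

table = LazyModuleTable()
table.register("torch._inductor.exc", lambda: {"InductorError": Exception})
# No analysis has happened yet; the thunk runs on the first lookup:
print(table.lookup("torch._inductor.exc", "InductorError"))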

hauntsaninja commented 1 month ago

Yeah, lazy import resolution could be a massive perf win

hauntsaninja commented 1 month ago

https://github.com/python/mypy/issues/17924 is the issue for tracking lazy resolution

Jukka's times in https://github.com/python/mypy/pull/17920#issuecomment-2406966926 are much better than mine. https://github.com/python/mypy/issues/17948 is the issue for tracking performance improvements in my work environment.

JukkaL commented 2 weeks ago

Performance is now a lot better, but I bet there are still some good opportunities to make this faster. Fresh CPU profiles would be interesting to see.

hauntsaninja commented 2 weeks ago

Here's a new profile for 53134979c!

Install torch, along with a few extra dependencies:

rm -rf torchenv
python -m venv torchenv
uv pip install --python torchenv/bin/python torch matplotlib onnx optree types-redis --exclude-newer 2024-10-29

Then I get the following on Python 3.11:

λ hyperfine -w 1 -M 3 '/tmp/mypy_primer/timer_mypy_53134979c/venv/bin/mypy -c "import torch" --python-executable=torchenv/bin/python --no-incremental'
Benchmark 1: /tmp/mypy_primer/timer_mypy_53134979c/venv/bin/mypy -c "import torch" --python-executable=torchenv/bin/python --no-incremental
  Time (mean ± σ):     27.210 s ±  0.194 s    [User: 25.506 s, System: 1.684 s]
  Range (min … max):   27.052 s … 27.426 s    3 runs

Here's the output of:

py-spy record --native -- /tmp/mypy_primer/timer_mypy_53134979c/venv/bin/python -m mypy -c "import torch" --no-incremental --python-executable torchenv/bin/python

(Flamegraph attached.)

(I realised py-spy also supports --format speedscope, which can be nicer but is harder to just link on GitHub.)
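
For reference, recording straight to speedscope format would look something like this (the output filename is arbitrary):

py-spy record --native --format speedscope -o torch.speedscope.json -- /tmp/mypy_primer/timer_mypy_53134979c/venv/bin/python -m mypy -c "import torch" --no-incremental --python-executable torchenv/bin/python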