hauntsaninja opened this issue 1 month ago
If this is accurate, maybe the fscache exception handling is really slowing us down in the mypyc build.
[labels from a native (mypyc) vs. interpreted build comparison; details snipped]
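For context, here is a minimal sketch of the fscache-style pattern being discussed. This is an assumed shape, not code copied from mypy: stat() results and stat() failures are cached, and a miss is communicated by raising, so in a mypyc-compiled build every miss pays the compiled exception-handling overhead.

import os

# Sketch (assumed, not mypy's actual code) of an fscache-style stat wrapper:
# hits are served from dicts, and misses re-raise a cached OSError. Under
# mypyc, each raise/except on the miss path goes through the exception
# machinery, which is comparatively expensive in compiled code.
class FileSystemCacheSketch:
    def __init__(self) -> None:
        self.stat_cache: dict[str, os.stat_result] = {}
        self.stat_error_cache: dict[str, OSError] = {}

    def stat(self, path: str) -> os.stat_result:
        if path in self.stat_cache:
            return self.stat_cache[path]
        if path in self.stat_error_cache:
            raise self.stat_error_cache[path]  # cached miss: raise again
        try:
            st = os.stat(path)
        except OSError as err:
            self.stat_error_cache[path] = err  # cache the failure, then re-raise
            raise
        self.stat_cache[path] = st
        return st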
mypy -v produces details about processed files, and this seems important:
LOG: Processing SCC of size 945 (torch.onnx._globals torch._inductor.exc torch._inductor.runtime.hints torch.utils._traceback torch.utils._sympy.functions ... <long output snipped>
Mypy detects an import cycle with 945 modules.
Overall 1380 files were parsed, so 68% of processed files are in this one SCC. I've seen this pattern in other third-party packages as well -- the majority of the implementation is a single SCC.
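To make the SCC framing concrete, here is a toy sketch of how a couple of back-edges pull many modules into a single strongly connected component. The module names and edges below are made up, and this is a plain-Python Tarjan's algorithm, not mypy's implementation:

# Toy sketch: compute SCCs of a hypothetical import graph with Tarjan's
# algorithm. "a imports b" becomes an edge a -> b.
def strongly_connected_components(graph):
    counter = [0]
    index, lowlink = {}, {}
    stack, on_stack = [], set()
    result = []

    def visit(node):
        index[node] = lowlink[node] = counter[0]
        counter[0] += 1
        stack.append(node)
        on_stack.add(node)
        for succ in graph.get(node, ()):
            if succ not in index:
                visit(succ)
                lowlink[node] = min(lowlink[node], lowlink[succ])
            elif succ in on_stack:
                lowlink[node] = min(lowlink[node], index[succ])
        if lowlink[node] == index[node]:
            # node is the root of an SCC: pop everything above it off the stack
            scc = set()
            while True:
                w = stack.pop()
                on_stack.discard(w)
                scc.add(w)
                if w == node:
                    break
            result.append(scc)

    for node in list(graph):
        if node not in index:
            visit(node)
    return result

# Hypothetical import edges (not torch's real import graph):
imports = {
    "torch": ["torch.nn", "torch.onnx"],
    "torch.nn": ["torch.utils"],
    "torch.utils": ["torch"],  # back-edge: closes the cycle
    "torch.onnx": ["torch"],   # another back-edge
    "helpers": ["torch"],      # acyclic hanger-on stays in its own SCC
}
for scc in strongly_connected_components(imports):
    print(sorted(scc))
# The four modules on the cycle land in one SCC, which mypy must type-check
# together; "helpers" forms its own singleton SCC.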
A potential way to make the SCC smaller would be to process imports lazily in third-party modules (where this is possible, since errors aren't reported there). It may be tricky to implement, but I'll think about it more.
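As a two-file illustration of the idea (the file names are hypothetical, and this describes the concept rather than any existing mypy behavior):

# a.py
import b

def f() -> int:
    return b.g() + 1

# b.py
import a

def g() -> int:
    return 2

# Both module-level imports are followed eagerly, so a and b end up in one
# 2-module SCC that must be type-checked together. If mypy instead deferred
# following the b -> a edge until a name from `a` were actually needed, the
# eagerly-built import graph would be acyclic and a / b would be separate SCCs.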
Yeah, lazy import resolution could be a massive perf win
https://github.com/python/mypy/issues/17924 is the issue for tracking lazy resolution
Jukka's times in https://github.com/python/mypy/pull/17920#issuecomment-2406966926 are much better than mine. https://github.com/python/mypy/issues/17948 is the issue for tracking performance improvements in my work environment.
Performance is now a lot better, but I bet there are still some good opportunities to make this faster. Fresh CPU profiles would be interesting to see.
Here's a new profile for 53134979c!
Install torch, along with a few extra dependencies:
rm -rf torchenv
python -m venv torchenv
uv pip install --python torchenv/bin/python torch matplotlib onnx optree types-redis --exclude-newer 2024-10-29
Then I get the following on Python 3.11:
λ hyperfine -w 1 -M 3 '/tmp/mypy_primer/timer_mypy_53134979c/venv/bin/mypy -c "import torch" --python-executable=torchenv/bin/python --no-incremental'
Benchmark 1: /tmp/mypy_primer/timer_mypy_53134979c/venv/bin/mypy -c "import torch" --python-executable=torchenv/bin/python --no-incremental
Time (mean ± σ): 27.210 s ± 0.194 s [User: 25.506 s, System: 1.684 s]
Range (min … max): 27.052 s … 27.426 s 3 runs
Here's the output of:
py-spy record --native -- /tmp/mypy_primer/timer_mypy_53134979c/venv/bin/python -m mypy -c "import torch" --no-incremental --python-executable torchenv/bin/python
(I realised py-spy also supports --format speedscope, which can be nicer, but is harder to just link on GitHub.)
We use a lot of torch at work; performance is probably the biggest reason folks at work switch to a different type checker.