leinardi / mypy-pycharm

A plugin providing both real-time and on-demand scanning of Python files with Mypy from within PyCharm/IDEA.
Apache License 2.0

"Check project" 10× slower than running mypy CLI on all files instead #114

Open Feuermurmel opened 12 months ago

Feuermurmel commented 12 months ago

Step 1: Are you in the right place?

Step 2: Describe your environment

Step 3: Describe the problem:

Steps to reproduce:

I have a large-ish Django project (~600 files). We're using the django-stubs mypy plugin. When I run mypy on the whole project from the command line, it takes about 13 seconds to complete:

(venv) cs ~/.../Projects/captain$ time mypy captain/**/*.py
[...]
Found 7 errors in 3 files (checked 617 source files)

real    0m13.995s
user    0m12.252s
sys     0m1.457s

But when I run the Check Project action from the Mypy tool window in IntelliJ IDEA, it takes 113 seconds to complete. This difference seems too large, so I think something fishy is going on (e.g. a separate mypy process being launched per file instead of a single call for all files).

Expected Results:

I would expect the Check Project action to take about as long as running mypy from the command line, e.g. what I get when I either run mypy or mypy **/*.py or similar.

Relevant Code:

$ cat setup.cfg
[mypy]
warn_redundant_casts = True
warn_unused_ignores = True
warn_unreachable = True
extra_checks = True
strict_equality = True
files = captain
plugins =
    mypy_django_plugin.main

[mypy.plugins.django-stubs]
django_settings_module = "captain.project.settings"
$ pip freeze | grep -E 'django|mypy'
django-stubs==4.2.4
django-stubs-ext==4.2.2
mypy==1.5.1
mypy-extensions==1.0.0
pytest-django==4.5.2
Feuermurmel commented 12 months ago

I just checked. The plugin does call mypy multiple times. In my project it's 74 times in total when running Check Project once. Probably due to the mypy plugin we're using, the startup time of mypy is quite long, so this adds up.

It seems that mypy is first called for each __init__.py in the project individually, and then once for all remaining files. Is there an explanation for this behavior?

leinardi commented 12 months ago

Is there an explanation for this behavior?

Yes: the plugin gets the list of files that belong to a project from the JetBrains API, which takes care of excluding the files that you ignore in your IDE. There is no difference in the implementation between scanning a single file, a module, or the entire project; it's always a list of files populated by the JetBrains API.

Feuermurmel commented 12 months ago

Would it be possible to gather all files to scan in a list and then invoke mypy just once with that list?

leinardi commented 12 months ago

That's the first thing I tried but, unfortunately, there is a maximum number of characters that you can pass as a parameter, and it was easily reached with non-trivial projects.

Feuermurmel commented 12 months ago

I can imagine that to become a problem quickly. There's a workaround: The list of paths to check can be passed via a file containing the paths. See Running mypy and managing imports (Reading a list of files from a file). It's supported even by very old versions of mypy: https://github.com/python/mypy/commit/2fbb7240d162802777acc3a62ce4c5682dc4419e
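
The file-based approach could look roughly like the sketch below. It assumes one path per line in the argument file, which mypy then reads via the @ prefix. The helper names write_path_list and mypy_command_for are hypothetical and not part of the plugin or of mypy:

```python
import os
import tempfile


def write_path_list(paths):
    """Write one path per line to a temporary file and return the file name.

    Hypothetical helper: the caller is responsible for deleting the file
    once mypy has finished.
    """
    # delete=False keeps the file on disk after the handle is closed,
    # so that the mypy process can still read it.
    with tempfile.NamedTemporaryFile(
        "w", prefix="mypy-files-", suffix=".txt", delete=False, encoding="utf-8"
    ) as handle:
        for path in paths:
            handle.write(path + "\n")
        return handle.name


def mypy_command_for(list_file):
    """Build the argument vector equivalent to `mypy @list_file`."""
    return ["mypy", "@" + list_file]
```

The resulting command line stays short no matter how many files are checked, because only the name of the list file is passed as an argument.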

I think I'd try to solve this by first putting all paths on the command line and, if that fails, passing the paths via a temporary file. At least on Linux and macOS, creating a process fails with E2BIG when the argument list is too long (execve() (ERRORS)). Alternatively, you could use a static limit of a few kB and switch approaches based on that.
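
A minimal sketch of the try-then-fall-back idea, written in Python for illustration only (the plugin itself is not written in Python). The function name run_mypy and its run parameter are hypothetical. It relies on the E2BIG behavior described above for Linux and macOS; other platforms report an overlong command line differently:

```python
import errno
import os
import subprocess
import tempfile


def run_mypy(paths, run=subprocess.run):
    """Run mypy once for all paths; fall back to an argument file on E2BIG.

    `run` is injectable so the logic can be exercised without mypy installed.
    """
    try:
        # First attempt: pass every path directly on the command line.
        return run(["mypy", *paths])
    except OSError as error:
        # Only handle "argument list too long"; re-raise anything else.
        if error.errno != errno.E2BIG:
            raise

    # Fallback: write one path per line and pass the file via the @ prefix.
    with tempfile.NamedTemporaryFile(
        "w", prefix="mypy-files-", suffix=".txt", delete=False, encoding="utf-8"
    ) as handle:
        handle.write("\n".join(paths) + "\n")
    try:
        return run(["mypy", "@" + handle.name])
    finally:
        # Clean up the temporary list file whether or not mypy succeeded.
        os.remove(handle.name)
```

Either way mypy is started once per check, so a slow startup (for example from a mypy plugin) is paid only once.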

leinardi commented 12 months ago

Yeah it's an interesting solution, but unfortunately I do not have time to implement new features for this plugin anymore. Pull requests are welcome.