Open brandon-leapyear opened 2 years ago
:sparkles: This is an old work account. Please reference @brandonchinn178 for all future communication :sparkles:
~Update: the minimal repro might be fast with the version of pylint on master, so this might not be an issue anymore.~
Never mind, forgot to install pandas. When pandas is installed, the minimal repro is still slow using the version of pylint on main.
Related, will there be a pylint release anytime soon?
Hi @brandon-leapyear, thank you for opening the issue. The next milestone for pylint is https://github.com/PyCQA/pylint/milestone/49, which is 89% done right now. We need a release of astroid in order to close it; that milestone is at https://github.com/PyCQA/astroid/milestone/25 and is 70% done right now.
I'm able to reproduce this, though not at 8s
time pylint --disable=all foo.py
real 0m5.106s
user 0m5.353s
sys 0m0.341s
5s is still quite shocking so I agree this issue is worth having. Though I wonder how to even tackle an issue like this?
Out of curiosity, I ran time pylint test.py (so not disabling anything, just running whatever checks are enabled via config) and I'm getting something pretty similar
time pylint test.py
real 0m5.377s
user 0m5.536s
sys 0m0.388s
Then I removed the pandas usage from the file, and the time came down significantly, both with the default checks enabled and with all checks disabled
real 0m0.886s
user 0m0.653s
sys 0m0.113s
So to me this is an issue related to pylint and pandas, not related to disabling checks.
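To make comparisons like the ones above repeatable, here is a minimal sketch of a timing harness. It assumes pylint is installed for the current interpreter and that the repro file is named foo.py; the helper names (time_command, summarize, compare_runs) are made up for illustration and are not part of pylint.

```python
import statistics
import subprocess
import sys
import time


def time_command(cmd, repeats=3):
    """Run cmd `repeats` times and return the wall-clock durations in seconds."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        # check=False: pylint exits non-zero whenever it emits messages,
        # which is not a failure for timing purposes.
        subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            check=False,
        )
        durations.append(time.perf_counter() - start)
    return durations


def summarize(durations):
    """Reduce a list of durations to min / median / max."""
    return {
        "min": min(durations),
        "median": statistics.median(durations),
        "max": max(durations),
    }


def compare_runs(target="foo.py", repeats=3):
    """Time pylint on `target` with all checks disabled and with default checks.

    Assumption: `python -m pylint` works in this environment.
    """
    commands = {
        "all checks disabled": [sys.executable, "-m", "pylint", "--disable=all", target],
        "default checks": [sys.executable, "-m", "pylint", target],
    }
    return {label: summarize(time_command(cmd, repeats)) for label, cmd in commands.items()}
```

Calling compare_runs("foo.py") once with the pandas usage in place and once without it should show whether the gap tracks the pandas usage rather than the set of enabled checks. Taking the median over a few runs smooths out one-off noise such as a cold disk cache.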
> Though I wonder how to even tackle an issue like this?
There's documentation about performance profiling for contributors here: https://pylint.pycqa.org/en/latest/development_guide/contributor_guide/profiling.html
Cool. An investigation with a profiler is definitely needed here to get some data on what's going on!
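As a starting point, here is a rough sketch using only the standard library's cProfile and pstats. It is not the procedure from the linked guide. The helpers profile_call and lint_repro are hypothetical names, and lint_repro assumes pylint is importable and that the repro file is foo.py.

```python
import cProfile
import io
import pstats


def profile_call(func, *args, top=15, **kwargs):
    """Profile func(*args, **kwargs).

    Returns (result, report), where report lists the `top` entries
    sorted by cumulative time.
    """
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args, **kwargs)
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(top)
    return result, buffer.getvalue()


def lint_repro(path="foo.py"):
    """Run pylint in-process on `path` with all checks disabled.

    Assumption: pylint is installed. exit=False keeps Run from calling
    sys.exit() so the profiler can finish and report.
    """
    from pylint.lint import Run  # imported lazily so the helper above works without pylint

    return Run(["--disable=all", path], exit=False)
```

Something like `_, report = profile_call(lint_repro, "foo.py")` followed by `print(report)` should then show which calls dominate the cumulative time, for example whether most of it is spent in astroid inference over the pandas modules.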
@clavedeluna There's another profiler called Yappi that I've used to great effect in Pylint profiling. I wrote up some instructions for it here: https://nickdrozd.github.io/2022/04/12/performance-hot-spots.html
I also came across https://github.com/bloomberg/pytest-memray recently and wanted to check what it can do for pylint. I have not tried anything yet.
Bug description
Repro:
1. mkdir test && cd test
2. python3 -m venv venv
3. venv/bin/pip install pylint pandas
4. Write foo.py
5. time pylint --disable=all foo.py
And this consistently takes 8s to run on my machine. Doing any of the following brings the runtime down to <2s:
- Removing the dataclasses import
- Removing the pandas import
- Removing the @dataclass decorator
- Using @foo instead of @dataclass
- Using @dataclasses.dataclass instead of @dataclass
- Using int instead of pandas.Series
- Using Optional[DataFrame] instead of DataFrame
- Removing pandas from the environment
Configuration
No response
Command used
Pylint output
Expected behavior
Disabling all checks should not take 8 seconds for this small file.
Pylint version
OS / Environment
OSX 10.15 (Catalina)
Additional dependencies
astroid==2.9.3
isort==5.10.1
lazy-object-proxy==1.7.1
mccabe==0.6.1
numpy==1.22.2
pandas==1.4.1
platformdirs==2.5.1
pylint==2.12.2
python-dateutil==2.8.2
pytz==2021.3
six==1.16.0
toml==0.10.2
typing_extensions==4.1.1
wrapt==1.13.3