pylint-dev / pylint

It's not just a linter that annoys you!
https://pylint.readthedocs.io/en/latest/
GNU General Public License v2.0
5.29k stars 1.13k forks

``duplicate-code`` takes over an hour for a large project even when disabled, if ``reports=yes`` #3443

Open Alphadelta14 opened 4 years ago

Alphadelta14 commented 4 years ago

In a directory with about 500,000 lines of Python code, pylint appeared to hang for the entire run. After investigating, we found it was not hanging but taking a very long time (60+ minutes) to compute similarities. We have similarities disabled for our project, yet the computation ran regardless. After removing pylint/checkers/similarities.py, all of the checks we wanted took about 3 minutes.

Steps to reproduce

  1. Find some project with 500000 lines
  2. Run pylint
  3. Wait an hour

Current behavior

It takes an hour to run similarities on a project, despite similarities being disabled.

Expected behavior

Similarities are not checked and do not consume this much time, or at least they take a negligible amount of time.

pylint --version output

$ pylint --version
Starting at 18:07:52
pylint3 2.3.1
astroid 2.2.5
Python 3.7.3 (default, Aug  8 2019, 00:00:00) 
[GCC 7.3.1 20180303 (Red Hat 7.3.1-5)]
(datarobot-6.0) Completed at 18:07:53

PCManticore commented 4 years ago

Can you post your pylintrc file? Testing with --disable=duplicate-code or --disable=similarities seems to work: with either option, the similarities check is no longer enabled.

Alphadelta14 commented 4 years ago

disable=similarities,
    spelling,
    design,
    fixme,
    singleton-comparison,
    invalid-name,
    superfluous-parens,
    abstract-method,
    redefined-outer-name,
    no-init,
    old-style-class,
    redefined-builtin,
    useless-object-inheritance,
    useless-super-delegation,
    no-else-return,
    bad-continuation

My original command was: $ python3 -m pylint --disable=all --enable=F MM/

PCManticore commented 4 years ago

Thanks for the example, but I still cannot reproduce this.

Here's what I did:

# Neither `process_module` nor `close()` was called here
$ pylint --disable=all --enable=F M/

# Again, those methods were not called, as `similarities` has been disabled in the config file
$ pylint --rcfile=pylintrc M/

# Removed the config and ran the following again. The `similarities` check wasn't triggered either, because `duplicate-code` was disabled via `--disable=all`
$ pylint --disable=all --enable=F M/

So unless there is something super obvious that I'm missing, I don't know how exactly SimilarChecker got enabled so that it started computing the duplication of code. I think you should do a similar investigation and check why similarities is enabled. Maybe pylint does not find the proper configuration file? Maybe it gets enabled via a different method. But on my side I can't reproduce this at this moment.
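One way to do that kind of investigation is to dump what the configuration file actually sets. The sketch below is only an illustration, not part of pylint: it assumes the pylintrc is INI-style and readable by Python's `configparser`, and the helper name `summarize_pylintrc`, the sample text, and its section names are hypothetical.

```python
import configparser


def summarize_pylintrc(text):
    """Return (disabled_items, reports_flag) found in INI-style pylintrc text.

    Hypothetical helper for illustration only; it does not reproduce how
    pylint itself loads its configuration.
    """
    # interpolation=None so that '%' characters in other options are left alone.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)

    disabled = []
    reports = None
    for section in parser.sections():
        if parser.has_option(section, "disable"):
            raw = parser.get(section, "disable")
            # The value may span several indented lines; split on commas and newlines.
            for item in raw.replace("\n", ",").split(","):
                if item.strip():
                    disabled.append(item.strip())
        if parser.has_option(section, "reports"):
            reports = parser.getboolean(section, "reports")
    return disabled, reports


# Sample configuration, assumed for the example (section names are a guess).
SAMPLE = """
[MESSAGES CONTROL]
disable=similarities,
    spelling,
    design

[REPORTS]
reports=yes
"""

disabled_items, reports_flag = summarize_pylintrc(SAMPLE)
print("disabled:", disabled_items)
print("reports:", reports_flag)
```

Seeing the disabled list and the `reports` setting side by side makes it easier to spot an option that could keep a checker active even though its messages are disabled.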

Alphadelta14 commented 4 years ago

Thanks for taking the time to look into this one. I found out it was triggered by having reports=yes in my pylintrc.

Running with python3 -m pylint --disable=all,RP0801 --enable=F MM/ no longer triggers similarities. (Adding RP0801 to the disable= section of my pylintrc also works as expected.)
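The behavior described here can be illustrated with a toy model. It rests on an assumption this thread does not confirm: that a checker is scheduled whenever any of its messages or any of its reports is still enabled. `ToyChecker` and `needed_checkers` are hypothetical names, not pylint internals, and pairing R0801 with RP0801 is likewise assumed for illustration.

```python
from dataclasses import dataclass


@dataclass
class ToyChecker:
    """Hypothetical stand-in for a checker: a name plus the ids it owns."""
    name: str
    messages: set  # message ids, e.g. {"R0801"}
    reports: set   # report ids, e.g. {"RP0801"}


def needed_checkers(checkers, disabled_messages, disabled_reports, reports_on):
    """Return names of checkers that would still run under the assumed rule."""
    needed = []
    for checker in checkers:
        # A checker stays active if at least one of its messages is enabled...
        has_message = bool(checker.messages - disabled_messages)
        # ...or, when reports are on, if at least one of its reports is enabled.
        has_report = reports_on and bool(checker.reports - disabled_reports)
        if has_message or has_report:
            needed.append(checker.name)
    return needed


similar = ToyChecker("similarities", {"R0801"}, {"RP0801"})

# Messages disabled, reports on: the checker is still scheduled via its report.
print(needed_checkers([similar], {"R0801"}, set(), reports_on=True))

# Disabling the report as well removes the checker.
print(needed_checkers([similar], {"R0801"}, {"RP0801"}, reports_on=True))
```

Under this model, disabling every message of a checker is not enough while reports are on; the report id has to be disabled too, which matches the RP0801 workaround above.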

PCManticore commented 4 years ago

Wow, that's a stupid bug then. Thank you for the additional details!