pylint-dev / pylint

It's not just a linter that annoys you!
https://pylint.readthedocs.io/en/latest/
GNU General Public License v2.0
5.29k stars 1.13k forks source link

Inconsistent behavior of `--recursive=y` depending on folder structure name #9188

Open KristianTashkov opened 12 months ago

KristianTashkov commented 12 months ago

Bug description

You get different results from pylint when the folders are called dataset vs train.

On a clean environment, if you run:

mkdir pylint_test
cd pylint_test
mkdir dataset
touch dataset/__init__.py
mkdir dataset_123
echo "# pylint: disable=missing-docstring\nfrom collections import Counter" > dataset_123/a.py
PYTHONPATH=. pylint --recursive=y .

mv dataset train
mv dataset_123 train_123
PYTHONPATH=. pylint --recursive=y . 

Configuration

No response

Command used

pylint --recursive=y .

Pylint output

First execution has empty output.

Second execution has the expected:

************* Module a
train_123/a.py:2:0: W0611: Unused Counter imported from collections (unused-import)

------------------------------------------------------------------
Your code has been rated at 0.00/10 (previous run: 0.00/10, +0.00)

Expected behavior

Regardless of the folder name, expect both versions to return the same output.

Pylint version

pylint 3.0.2
astroid 3.0.1
Python 3.11.5 (main, Sep 11 2023, 08:31:25) [Clang 14.0.6 ]

OS / Environment

Apple Mac M1

Additional dependencies

No response

istrandjev commented 12 months ago

I have asked the question on stackoverflow and a user speculates that a particular line in dicover files is causing the issue https://stackoverflow.com/q/77357606/812912

istrandjev commented 11 months ago

I did a local test with fixing the order of files returned by os.walk (sorted(..., reverse=True)) this guarantees that if we have two directories, one a prefix of the other (e.g. dataset and dataset_utils), then the latter will come first (i.e. dataset_utils comes before dataset). Doing that it seems pylint now behaves as expected. I am not suggesting this should be the actual fix, just providing it as information for easier diagnosis

Pierre-Sassoulas commented 11 months ago

If it sounds like the actual fix, swim like the actual fix and quack like the actual fix (and modify a single line of code), it's probably the actual fix 😄 Do you want to open a PR to be credited for it @istrandjev ?

istrandjev commented 11 months ago

I agree! Unfortunately it looks like a workaround, quacks like an ugly hack and it is extremely obscure why this order matters and how it fixes problems. For sure there is a "right" way to do this but it will require a bit more investigation. I would try to do but not sure about timeline at this point