Open zeloras opened 2 years ago
@Pierre-Sassoulas This is actually a tricky bug to resolve. https://github.com/PyCQA/pylint/pull/5201 normalises \
to become OS-agnostic. However, \
is also the character used to escape characters in regex.
So, .*/migrations/.*.py,.*/urls.py
should work but I think users rightly expect \
to be the escape character. Do you know any regex trickery to allow both swapping \
to /
for Windows but also allowing escaping characters?
What if we "normalize" the regex only if the current OS is actually windows ? I don't like making OS specific code but this feel justified here.
Hmm, then users on windows can't use the \
to escape. They might copy the pylintrc
from somebody who wrote it on Linux and then get a different result from the same pylintrc
I don't think that works.
Could we add a line to the description of the setting and say that escaping is not allowed since \
is considered a directory break?
Could we use glob
instead of regex ? I'm not sure it's really equivalent. glob could handle the hard part and we could point users to the glob documentation.
The problem is actually in the validator: https://github.com/PyCQA/pylint/blob/e33bc6377bbbf6135888cdadecdfc3baa8497a24/pylint/config/option.py#L30-L40
pathlib.PureWindowsPath('.*\\ignore\\.*').as_posix()
'.*/ignore/.*'
pathlib.PureWindowsPath('.*\ignore\.*').as_posix()
'.*/ignore/.*'
For some reason there is no difference here. Even \\\\
returns the same. I don't think changing to glob
is the right move as then we would invalidate all current configs using ignore-paths
. That seems like a large breaking change.
What do you think of updating the description and saying that the escape character can't be used as it represent the dir-delimiter on Windows? It is also not really a nice solution, but I don't know a better way forward..
I think we should revert the windows specific part until we find a non breaking change way of handling this properly. Most of our user are on Linux according to pypi stats
Wouldn't it be better to disallow the use of the escape character? How often are you actually using the escape character in a regex for paths anyway? In the regex used in the OP the escape character is unnecessary as .*/migrations/.*.py,.*/urls.py
would work as well. The only thing I can think of is using $
in a directory or file name, but I don't think that is that uncommon. Personally I think supporting windows serves a large user base than the user base that actually needs the escape character.
supporting windows serves a large user base than the user base that actually needs the escape character
You're right, and on top of that we can now raise a proper configuration parsing error if the regex contain a forbidden character so it's going to be relatively easy to warn users. The real fix still eludes me though. At least if we don't want to do any breaking changes.
You're right, and on top of that we can now raise a proper configuration parsing error if the regex contain a forbidden character so it's going to be relatively easy to warn users.
Not sure that is so easy. Warning on \
doesn't make sense, so we could only warn on $
and ^
I guess? Perhaps we should make the description of the setting very clear and hope people don't run into this too much 😅
The real fix still eludes me though. At least if we don't want to do any breaking changes.
I don't have a good solution as well sadly..
@zeloras With regards to your initial question: we have reached the conclusion that we are going to disallow \
as escape character for the ignore-paths
setting as the \
is needed as delimiter in standard windows paths. I'm going to update the description of the setting accordingly. I know this might not fully satisfy your needs, but we need to make a compromise here (you can read the preceding comments to see what the problem is for us).
In your case updating the settings file to include ignore-paths=.*/migrations/.*.py,.*/urls.py
should make the setting work like you expect it to.
For future reference:
The setting description has been updated with #5415. This issue is kept to think of a better and permanent solution to allow both \
as escaping character and normalise paths to POSIX and Windows systems as introduced in #5201.
a workaround is to use pyproject.toml
instead of .pylintrc
, and use raw strings ('
instead of "
):
[tool.pylint.MASTER]
ignore-patterns = '.*\.pyi'
Could we add a line to the description of the setting and say that escaping is not allowed since
\
is considered a directory break?
I think any option that modifies or limits regex patterns (like replacing all \
s) is a bad idea, as that will then confuse or frustrate people who are familiar with regular expressions, and used to being able to use all regex features. \
is useful in this case both for escaping chars that are allowed in filenames bug have special meaning in regex (like .
, (
, +
, and [
), but also for character classes (like \d
(digit), \w
(alphanumeric plus underscore), \S
any char except whitespace) and unicode chars (like \N{EM DASH}
). There are more features using \
, and plenty of examples in https://docs.python.org/3/library/re.html
From just looking at this issue, before reading the documentation, I though that --ignore-paths
was intended to be paths (not regex patterns), and --ignore-patterns
was the same but using patterns. Even looking at the documentation, it is difficult to find anything about setting the same configuration using command line vs pyproject.toml
and .pylintrc
. I would have expected that to be under configuration, similar to how it is in the mypy documentation.
I also think these names would be much clearer:
--ignore-paths
= paths only, not regex patterns
--ignore-path-regex
= regex patterns matching paths
--ignore-filenames
= file names only, not paths or patterns
--ignore-filename-regex
= regex patterns matching filenames
When the string is not a regex pattern, it should be fine to normalize \
to /
, but it is not ok to do that for regular expressions.
To work with windows paths, --ignore-path-regex
should use the regex exactly as supplied, and test the same regex on both the windows style and the posix style paths. So the file C:\Projects\foo\bar\baz-1.py
would be tested against the regex as python strings "C:\\Projects\\foo\\bar\\baz-1.py"
(would match pattern like \\[\w-]+\.py
) and "C:/Projects/foo/bar/baz-1.py"
(would match pattern like /[\w-]+\.py
).
Pattern: foo\d{4}-bar.py
would match foo0000-bar.py
and foo1234-barxpy
but not foo\dddd-bar.py
. To best match something like that, use foo\\d{4}-bar\.py
It would also be useful to have more examples in the documentation, to avoid ambiguities such as whether basename includes the extension (https://docs.python.org/3/library/os.path.html#os.path.basename) or not (https://commons.apache.org/proper/commons-io/apidocs/org/apache/commons/io/FilenameUtils.html#getBaseName-java.lang.String- ), and whether ,
is part of the filename or separating 2 filenames.
In the regex used in the OP the escape character is unnecessary as
.*/migrations/.*.py,.*/urls.py
would work as well.
But that would also match apps/notifications/migrations/foopy
(even though it contains no .
)
If we are going to rename configuration options we might as well also consider using glob
instead of regex..
The code that tests this is fairly straightforward and I'm very open to reviewing a PR that "fixes" this or at least improves the current situation. I won't have much time to work on this myself sadly.
I agree with @Zeckie that limiting regex's escaping functionality is not a good idea.
In your case updating the settings file to include
ignore-paths=.*/migrations/.*.py,.*/urls.py
should make the setting work like you expect it to.
.*/urls.py
is not the same as .*/urls\.py
- .*/urls.py
matches foo/urls-py
too.
Bug description
Good day/morning/evening!
Today i've update pylint to version 2.12.1 and for now that version ignoring ignore-paths=./migrations/..py, in my .pylintrc config. But in version 2.11.1 everything is working correct
Configuration
Command used
Pylint output
Expected behavior
Your code has been rated at 10.00/10
Pylint version
OS / Environment
Linux d507ea5a5499 5.14.17-301.fc35.x86_64 #1 SMP Mon Nov 8 13:57:43 UTC 2021 x86_64 GNU/Linux
Additional dependencies
No response