astral-sh / ruff

An extremely fast Python linter and code formatter, written in Rust.
https://docs.astral.sh/ruff
MIT License
32.19k stars 1.07k forks source link

Audit how we determine whether a file is a "Python source file" #13691

Open AlexWaygood opened 2 weeks ago

AlexWaygood commented 2 weeks ago

I noticed in #13682 that there's some inconsistency regarding how we determine whether a file is a "Python source file" currently. In the code for ruff server (and the red-knot port of the server), we take care to do case-insensitive matching when figuring out whether something is a notebook file or not:

https://github.com/astral-sh/ruff/blob/5b4afd30caebd6e2c5c27bbf5debc1663cbb28a2/crates/ruff_server/src/session/index.rs#L124-L126

https://github.com/astral-sh/ruff/blob/5b4afd30caebd6e2c5c27bbf5debc1663cbb28a2/crates/red_knot_server/src/session/index.rs#L75-L79

Elsewhere, however, we mostly use case-sensitive matching:

https://github.com/astral-sh/ruff/blob/5b4afd30caebd6e2c5c27bbf5debc1663cbb28a2/crates/ruff_python_ast/src/lib.rs#L91-L101

https://github.com/astral-sh/ruff/blob/5b4afd30caebd6e2c5c27bbf5debc1663cbb28a2/crates/red_knot_python_semantic/src/module_resolver/path.rs#L41-L60

For places like the red-knot module resolver, it's likely correct to do case-sensitive matching (Python recognises foo.py as an importable module, but not foo.PY), but in other places it may not be. We should audit the code to make sure we're using case-sensitive matching and case-insensitive matching for file extensions in the correct places. We should also add comments to the places where there might be a subtle reason why case-(in)sensitive matching is required, rather than vice versa.

MichaReiser commented 2 weeks ago

For places like the red-knot module resolver, it's likely correct to do case-sensitive matching (Python recognises foo.py as an importable module, but not foo.PY

Note: I think that depends on the file system's case sensitivity.

Other than that. I think we should ignore casing when matching files because that's what most desktop environments to. They use the same application to open a test.jpg or test.JPG.

AlexWaygood commented 2 weeks ago

Here, for example, we might want to continue to do case-sensitive matching, since the rule only applies to the names of Python modules that are intended to be importable by other Python modules: https://github.com/astral-sh/ruff/blob/5b4afd30caebd6e2c5c27bbf5debc1663cbb28a2/crates/ruff_linter/src/rules/pep8_naming/rules/invalid_module_name.rs#L51-L59

If a FOO.PY module can't be imported due to the fact that it uses a .PY extension on a case-sensitive file system, then the name of the file isn't really relevant to any PEP-8 concerns, as it's not a module (even if it might be a Python file). The rule is invalid-module-name, not invalid-python-file-name.

dhruvmanila commented 2 weeks ago

There's also this issue https://github.com/astral-sh/ruff-vscode/issues/574 related to the server although not sure if this affects that.

hauntsaninja commented 5 days ago

Fwiw mypy has some special handling in a few places to be case sensitive even on case insensitive file systems