Human-friendly rule names

not-my-profile commented 1 year ago

Pylint, ESLint and Clippy all let you enable or disable lints via human-friendly names e.g. Pylint lets you configure:

disable= wildcard-import,
 method-hidden,
 too-many-lines

The idea is to also support such human-friendly names for ruff in pyproject.toml, while still somehow supporting the numeric codes for backwards compatibility. The reasoning being that symbolic names are much more human-friendly than cryptic numeric codes. Case in point projects that care about readable configs end up adding comments to every numeric code in their config: zulip, fastapi, sphinx

Currently Ruff requires every rule to have a numeric code, which I don't think makes much sense given that Ruff also has Ruff-specific rules ... and I don't think we should have to introduce new cryptic numeric codes when we introduce a new Ruff-specific rule. This obviously requires quite some changes to the codebase/UI but I think doing this is worth it.

In particular I would like us to follow the Rust naming convention for lints:

the lint name should make sense when read as "allow lint-name" or "allow lint-name items"

I think this is a very nice convention because it results in descriptive names. And currently many of our lint names are not descriptive, e.g. TrueFalseComparison should probably rather be called EqualityComperatorForTrueOrFalse, UnnecessaryMap should probably rather be called MapWithLambdaInsteadOfComprehension.

(Previous discussion: #967)

[x] #2902

charliermarsh commented 1 year ago

I agree with this and I want to do it! Will reply in a bit with some more thoughts.

charliermarsh commented 1 year ago

Okay, so like I said -- I agree! And I think there's a lot of appetite for this (I know I've heard it from @andersk and @smackesey to name a few). We should definitely do it.

A few comments:

I do feel strongly about retaining the numeric codes, even just for the sake of retaining Flake8 compatibility (I wrote a bit on that point here -- it's been critical for adoption), but IMO that in no way conflicts with adding and preferring human-friendly lint names. Of course, in some ways, we already have these names defined in the code as the Violation structs, but they're not exposed to the user at all, and the names aren't very good or consistent.
I do think we should give some thought to the constraints we'd like to introduce around names. For example, I don't know if Clippy has a formal convention here, but from a very informal skim, case_sensitive_file_extension_comparisons looks like the longest name. We probably want to have some reasonable cap on length.
We need to think on the mechanics of this from a configuration perspective (I'm referring here to the pyproject.toml, CLI arguments, and in-code directives (# noqa: ...)). The most basic proposal would be to accept rule codes and rule names interchangeably -- so you could do --select F841 or --select whatever-that-rule-is-called.

Separately:

the lint name should make sense when read as "allow lint-name" or "allow lint-name items"

This I appreciate too. I've thought about it in the past (in particular, whether the rules should describe the positive form or the negative behavior), but we've never been consistent on it. (Some of the names were also drawn from the originating plugins, which explains some of the discrepancy.)

charliermarsh commented 1 year ago

Oh, it's worth mentioning: the one thing we lose by moving away from the F841-style system, is that we'll no longer have the ability to turn off sets of rules (apart from entire categories) with (e.g.) --select F8. I'm totally fine with this. I think it's quite rare to use that behavior -- anecdotally, most configurations tend to use precise codes (F841) and category codes (F) anyway.

spaceone commented 1 year ago

I use the prefix categories heavily, as in pycodestyle's E they are logically related. E.g. I am doing a tab-to-spaces migration by just selecting E1 (in autopep8).

Also I can remember a lot of codes but could not remember a lot of long names. And also wouldn't want to name them in a --select argument.

scop commented 1 year ago

I'd actually like to use the human friendly names as --select arguments rather than codes, provided that there's appropriate shell completion available for the args (currently there isn't). No need to remember that much, then. Some kind of prefix grouping would be highly useful in that scenario, too.

ngnpope commented 1 year ago

+1 for switching to human friendly names everywhere eventually.

I think we should pivot away from the current flake8-plugin grouping of rules entirely to rule categories (as per #1774). That includes the way that the rules are grouped in the repository.

And then only maintain a mapping of flake8-plugins/codes to ruff rules to show what has been implemented to aid migration. Ideally I'd like to see this documentation piece also refer to the original code prefixes (especially where ruff has changed them) and also state reasons for not implementing rules that the original plugin did, or why the implementation differs, e.g. F401+F403.

This would also overcome some of the confusion highlighted in #2517 as well as the fact that many plugins have overlap with each other so they're implemented in some areas in the code, but not others.

A couple of other thoughts:

When we have categories we could replace --select S1 with, say, --select @security?
We could also implement a rule, e.g. upgrade-noqa-comments, to rewrite the old codes to their human-friendly versions?

Avasam commented 1 year ago

Oh please yes! Exactly for the reason of adding comments so I can better understand what a rule does, and why it may be disabled, at a glance and not have to remember hundreds of codes.

I've just started trying out migrating to Ruff, and here's an example of a bunch of extra comments for that: https://github.com/Avasam/speedrun.com_global_scoreboard_webapp/blob/6678056ac2d42ee66b97fc074515855646f2bfc4/pyproject.toml

sscherfke commented 1 year ago

This would make # noqa comments more readable, too:

def complex_func():  # noqa: C901
    ...

vs.

def compoex_func(): # noqa: complex-structure
    ...

9999years commented 9 months ago

I would also like this! I found it confusing that the URL for G004 is https://docs.astral.sh/ruff/rules/logging-f-string/ but that --ignore logging-f-string didn't work.

Previous comments state that the names aren't exposed to users, but that isn't true any more:

$ ruff rule G004
# logging-f-string (G004)

Derived from the **flake8-logging-format** linter.

## What it does
Checks for uses of f-strings to format logging messages.

kaddkaka commented 8 months ago

for pylint rules (PL): it might be wise to just reuse the pylint names without any modification.

kaddkaka commented 7 months ago

What needs to be done to get this implemented? What's left and can a beginner in this repo help out?

gpshead commented 5 months ago

I pretty much agree with what was already summarized above. One lesson from pylint: rules will need multiple names. Only one should be considered the primary (presumably used in the error message and suggested for addition to noqa comments or use in a modern suppression config elsewhere). That way when you don't really like an existing name or come up with a more meaningful alternative name, or need to align with another tools name, you can. But all existing configs and directives in code still remain valid.

Pierre-Sassoulas commented 5 months ago

One lesson from pylint: rules will need multiple names.

Yes, the refactor to be able to have multiple aliases after the fact was very painful.

Only one message should be considered the primary

In pylint we call this, "old_names" and we have a primary name the only one that is active (it has an impact in the documentation in particular), but in fact aliases would be more accurate / more powerful. A possible use case is to permits to have one message be a placeholder for multiple one, so you can deactivate multiple similar messages using a single englobing one. For example, missing-docstring as a placeholder for missing-function-docstring / missing-module-docstring / missing-class-docstring. It's currently possible to do that in pylint but the semantic in the code / doc is not accurate. The documentation present missing-docstring as a deprecated message and not as a possible elegant disable shortcut and if we ever add a tool to upgrade the deprecated messages to the primary one, it would make the code worst / more verbose for nothing.

astral-sh / ruff

Human-friendly rule names #1773