rust-lang / rust-clippy

A bunch of lints to catch common mistakes and improve your Rust code. Book: https://doc.rust-lang.org/clippy/
https://rust-lang.github.io/rust-clippy/

[`similar_names`]: Configuration option to whitelist specific names #10926

Open Centri3 opened 1 year ago

Centri3 commented 1 year ago

Description

Currently, similar_names will lint on code like

fn awa(tcx: TyCtxt<'_>) {
    let ocx = { ... };
}

This is something I ran into on #10891. IMO, tcx and ocx are short enough that you can tell them apart. Still, simply setting a minimum length isn't desirable: what about lli and lll? Therefore, I think this lint should have a configuration option, empty by default, that allows specific names.

@rustbot claim

Version

No response

Additional Labels

@rustbot label +C-enhancement

eloc3147 commented 1 year ago

Just ran into this as well with msb and lsb.

There is a hardcoded list of "names that are allowed to be similar". This list could be added to, but I can imagine many scenarios where using some similar domain-specific acronyms would make the most sense. I agree this should be configurable.
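
To illustrate the idea of a configurable list, here is a minimal sketch of a group-based allowlist check. The function name `allowed_to_be_similar` and the grouping scheme are assumptions for illustration only; this is not Clippy's actual code or configuration format.

```rust
use std::collections::HashSet;

/// Hypothetical allowlist check: each group holds names that are allowed to be
/// similar to one another. A user-supplied configuration could add more groups.
fn allowed_to_be_similar(groups: &[HashSet<&str>], a: &str, b: &str) -> bool {
    // The pair is exempt only if both names appear in the same group.
    groups.iter().any(|g| g.contains(a) && g.contains(b))
}
```

Requiring both names to sit in the same group keeps an exemption for msb/lsb from silently exempting unrelated pairs.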

tgross35 commented 9 months ago

Maybe an easy refinement rule is to skip this check if the first letters are visually different. That would cover the tcx/ocx, msb/lsb, slen/dlen cases, and I think probably a good number of other cases that exist. A list would also be nice, but I think that clippy could probably be a bit smarter here.
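
A minimal sketch of the first-letter rule, assuming a plain character comparison with no folding of look-alike characters. The helper name `first_chars_differ` is hypothetical.

```rust
/// Returns true when two identifiers start with different characters, in which
/// case the similarity check could be skipped. Exact comparison only: no
/// look-alike folding (e.g. `l` vs `1`) is attempted in this sketch.
fn first_chars_differ(a: &str, b: &str) -> bool {
    match (a.chars().next(), b.chars().next()) {
        (Some(x), Some(y)) => x != y,
        // An empty identifier gives nothing to compare, so do not skip.
        _ => false,
    }
}
```

With this rule tcx/ocx, msb/lsb and slen/dlen would all be skipped, while lli/lll would still be checked.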

A slightly more complex but more general algorithm that could make sense is something like:

fn raise_similar_lint(a: &str, b: &str) -> bool {
    // comparing identifiers a and b

    // commonize similar looking characters, e.g. l->i, 1->i, 5->s
    // from here on out, we are comparing visual difference
    // (`replace_similar` is assumed to return an owned `String`)
    let a = replace_similar(a);
    let b = replace_similar(b);

    // If the first characters are visually different, call it good
    if a.chars().next() != b.chars().next() {
        return false;
    }

    let lev = levenshtein(&a, &b);

    // If the difference is large relative to the string's length, call it OK.
    // This means that `abb` and `abc` are ok, but `abbbb` and `abbbc` are not
    lev * 4 < a.len()
}
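
The helpers `replace_similar` and `levenshtein` are left undefined above. A self-contained sketch follows, with hypothetical implementations of both. The folding table covers only the three examples from the comment, and the edit distance is the textbook dynamic-programming version; neither is taken from Clippy.

```rust
/// Hypothetical normalization: fold look-alike characters onto one representative.
/// Only the examples from the comment (l->i, 1->i, 5->s) are covered.
fn replace_similar(s: &str) -> String {
    s.chars()
        .map(|c| match c {
            'l' | '1' => 'i',
            '5' => 's',
            other => other,
        })
        .collect()
}

/// Levenshtein distance over chars, keeping only the previous row in memory.
fn levenshtein(a: &str, b: &str) -> usize {
    let b: Vec<char> = b.chars().collect();
    let mut prev: Vec<usize> = (0..=b.len()).collect();
    for (i, ca) in a.chars().enumerate() {
        let mut cur = vec![i + 1];
        for (j, &cb) in b.iter().enumerate() {
            let substitute = prev[j] + usize::from(ca != cb);
            let delete = prev[j + 1] + 1;
            let insert = cur[j] + 1;
            cur.push(substitute.min(delete).min(insert));
        }
        prev = cur;
    }
    prev[b.len()]
}

/// Returns true if the pair should be reported as confusingly similar.
fn raise_similar_lint(a: &str, b: &str) -> bool {
    let a = replace_similar(a);
    let b = replace_similar(b);
    // Visually different first characters: do not lint.
    if a.chars().next() != b.chars().next() {
        return false;
    }
    // Lint only when the edit distance is small relative to the length.
    levenshtein(&a, &b) * 4 < a.chars().count()
}
```

Under these assumptions, lli and lll normalize to the same string and are reported, while tcx/ocx and abb/abc are not.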

tgross35 commented 5 months ago

I did part of my above suggestion in https://github.com/rust-lang/rust-clippy/pull/12258. Not raising this lint if the first letter is different removes almost all of the cases I run into.

Replacing look-alike characters and then applying Levenshtein distance would probably be more accurate still, but this is a good start.

demurgos commented 1 month ago

req/res are another example of a common identifier pair when remote calls are involved. They don't differ in the first letter. Having some way to configure the list of exceptions would be helpful.