rust-lang / rust

Empowering everyone to build reliable and efficient software.
https://www.rust-lang.org
Other
96.47k stars 12.45k forks source link

Policy for lint expansions #122759

Open tmandry opened 5 months ago

tmandry commented 5 months ago

Background

When a lint is expanded to include many new cases, it adds significant complexity to the rollout of a toolchain in the ecosystem. In particular, certain challenges occur in large codebases. Maintainers are stuck with the choice of

  1. Disabling the existing lint while the toolchain is updated and new cases are fixed.
  2. Fixing cases manually, and updating the toolchain immediately so that other developers see the warnings and don't add more such code.

Both of these come with the problem of racing with other developers in a codebase who may land new code which triggers the expanded lint in a new compiler, but does not trigger the lint in an old compiler.

While it would be nice to solve this "raciness" once and for all, there are other considerations at play. Instead, we propose a set of guidelines to help decide whether an existing lint should be expanded, as well as mitigations for the above problems when expansions occur.

These guidelines are based on the lang team discussion of #121708.

cc @estebank @petrochenkov

Guidelines

Deciding on new lint vs expansion

We don't have hard-and-fast rules for when to use a new lint versus expanding an existing one, but we want to take several factors into account:

We'd like to avoid having a disproportionately painful impact on the ecosystem compared to the value provided. We'd also like to avoid having a painful impact on large codebases that need to migrate their code to take the new lint into account.

When proposing or reviewing an expansion of an existing lint, please take the above factors into account. If in doubt about whether to use a new or existing lint, please consult with lang, and seek a decision via FCP.

Mitigations for lint expansions

When an existing lint is expanded to include many new cases, please consider providing at least one of:

  1. A new lint name under the existing group, so that users may opt out of the expansion at least temporarily. This is particularly desirable if there's a clear description of the new cases being added.
  2. A MachineApplicable fix for the lint. This fix must produce code that compiles without new lints or errors under the most recent stable compiler. This helps maintainers, particularly of large codebases, quickly fix all existing instances so they can upgrade/deny the lint, without racing with new code additions that trigger the lint. (Note, though, that a MachineApplicable fix doesn't suffice if applying the fix would make code fail to compile on the immediately prior version of Rust.)

To evaluate whether a lint produces "many new cases", try doing a crater run on the top 1000 crates. If the lint addition impacts more than 1% of the top-1000 crates on crates.io, or is producing a large number of warnings within a small number of crates in that set, treat it as producing many new cases. See below for details on how to run crater.

A crater run is not required before landing for every lint expansion. Reviewers should use their best judgment to decide if one is required.

Crater command

To measure the impact of a lint on the top 1000 crates, you can use the following crater command:

@craterbot run name=<name> start=master#<hash1>+rustflags=-D<lint_name> end=master#<hash2>+rustflags=-D<lint_name> crates=top-1000 mode=check-only p=1

See the crater docs for more information.

Lint revert policy

In general, if a lint expansion does not meet the above guidelines, and in particular especially if it has neither a new lint name nor a machine applicable fix, it should be flagged for review and likely reverted before it hits stable Rust.

tmandry commented 5 months ago

As discussed in last week's lang team meeting, I've drafted a policy for the team to review.

@rustbot label I-lang-nominated

estebank commented 5 months ago

Here, we define "many new cases" as impacting more than 5% of the top-1000 crates on crates.io. This can be measured by counting the number of regressions from a crater run like the one below.

I would likely be more stringent than that. Affecting just .5% would already make me consider mitigations.


There's an additional approach we could take: annotating all lints with a metadata field for "last significantly updated". This would allow tooling to account for Rust versions when triggering any lints. Then we could make #![deny(warnings)] be coupled with an additional attribute (or Cargo.toml field) to state "I want to mark warn-by-default lints as errors up until the version I support, and later warns remain as warnings at most". (I'm aware this is a separate issue, but believe the same set of users are affected by both simultaneously.)


For the original issue, I endorse splitting the lint into a new family.

tmandry commented 5 months ago

I would likely be more stringent than that. Affecting just .5% would already make me consider mitigations.

Yeah, I pulled a number out out of thin air. Let me continue doing that here. How about: If .5% would make you consider mitigations, maybe the policy line should be 1%?

There's an additional approach we could take: annotating all lints with a metadata field for "last significantly updated". This would allow tooling to account for Rust versions when triggering any lints. Then we could make #![deny(warnings)] be coupled with an additional attribute (or Cargo.toml field) to state "I want to mark warn-by-default lints as errors up until the version I support, and later warns remain as warnings at most". (I'm aware this is a separate issue, but believe the same set of users are affected by both simultaneously.)

I like this idea too. It would work well for those releasing to their codebase from discrete Rust versions. Those releasing from nightly might not have such a good time unless we manage to include the merge date somehow.

joshtriplett commented 3 months ago

We still need to figure out where this should live (e.g. rustc-dev-guide, somewhere on forge), but :+1: for documenting guidelines for how we evaluate both new lints and expansions of existing lints.

Let's see if we have consensus on the proposal in the top post here:

@rfcbot merge

rfcbot commented 3 months ago

Team member @joshtriplett has proposed to merge this. The next step is review by the rest of the tagged team members:

No concerns currently listed.

Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

cc @rust-lang/lang-advisors: FCP proposed for lang, please feel free to register concerns. See this document for info about what commands tagged team members can give me.

RalfJung commented 3 months ago

I don't quite understand how expanding an existing lint is any worse than adding a new (warn-by-default) lint. Reading the motivation here

Maintainers are stuck with the choice of

  1. Disabling the existing lint while the toolchain is updated and new cases are fixed.
  2. Fixing cases manually, and updating the toolchain immediately so that other developers see the warnings and don't add more such code.

This applies equally to the case of landing a new lint, doesn't it? If you have developers using outdated compilers, they may submit code that would trigger a lint on newer compilers. If you allow the lint, you may miss out on genuine bugs.

We have a technology that can solve issues like "automatically ensure developers submit code that adheres to a certain standard", it's called CI. Why should Rust compiler developers have to bear the cost for people not updating their compilers?

ChrisDenton commented 3 months ago

If you have a new lint you can ignore specifically that lint without ignoring the old one.

Note that also an "outdated compiler" can mean stable. Testing in CI on nightly is common and can avoid having to deal with multiple lint changes all at once.

RalfJung commented 3 months ago

Testing in CI on nightly is common and can avoid having to deal with multiple lint changes all at once.

Sure, but there's no problem with that -- unless you set -D warnings on your nightly build, but that's a Bad Idea, not something we should accommodate.

tmandry commented 3 months ago

If you have developers using outdated compilers, they may submit code that would trigger a lint on newer compilers.

Repos that are large or complex enough typically control the version their developers are using, e.g. with rust_toolchain.toml or some other mechanism, to maintain consistency between the local workflow and CI.

If you allow the lint, you may miss out on genuine bugs.

True, but that is one consideration among many to be made when managing the rollout of an updated compiler with new lints. Often it is easier to update a compiler with a new lint turned off while concurrent changes are made, then enable the new lint in a follow-up change. Consider that in large projects, there are many other potential blockers to upgrading a toolchain: Compiler bugs, latent bugs in the build exposed by changes in the compiler, behavior changes in related tools like clippy and rustfmt, and so on.

In the case of the lint discussed in https://github.com/rust-lang/rust/issues/121708, an existing lint was massively expanded while catching probably zero new bugs – this was purely an aesthetic choice about redundant imports. That is an example of something that should definitely be a separate lint group under this policy. Expansion of a lint that catches UB should be given more leeway under this policy: "We'd like to avoid having a disproportionately painful impact on the ecosystem compared to the value provided."

We have a technology that can solve issues like "automatically ensure developers submit code that adheres to a certain standard", it's called CI. Why should Rust compiler developers have to bear the cost for people not updating their compilers?

The difficulty that motivated this policy in the first place was the pain of very large monorepos not being able to upgrade their compiler when they wanted to. Intuitions that apply to smaller repos on Github don't apply to these projects. Updating a compiler is a change in the repo itself as I said above, and there are so many other changes happening concurrently that it becomes unmanageable to do so if a high-prevalence lint is turned on as part of that change.

RalfJung commented 3 months ago

In the case of the lint discussed in https://github.com/rust-lang/rust/issues/121708, an existing lint was massively expanded while catching probably zero new bugs

If you don't consider unnecessary imports bugs, then -- true. It did find a bunch of unnecessary imports in my own fairly small codebases though, so I expect it found a lot of those out there as well.

If you don't consider unnecessary imports bugs, then there should be no problem with allowing that lint in the changeset that bumps the compiler, and then later getting the lint to warn can follow whatever process you usually use for that (and it doesn't have to happen for the entire repo at once).

So I don't think I understand the problem. Since you're fine with extending the scope of lints that detect more serious problems, the policy seems to be about "soft" problems only where the issues they find are not considered bugs, and then a temporary allow seems entirely fine.

We'd like to avoid having a disproportionately painful impact on the ecosystem compared to the value provided."

You're not speaking about the entire ecosystem here, but about large monorepos specifically. For the ecosystem at large I don't think this was painful -- it was a nice cleanup. That's based on my extremely limited view, so please correct me if I am wrong -- but the arguments given so far did not apply to the ecosystem at large, but just for the specific case of big companies' internal monorepos.

traviscross commented 3 months ago

One reason that #121708 is an unfortunate[^1] example is that, regardless of the noise created by the lint expansion, the lint was broader than we actually wanted it to be for other important reasons, as we later decided here in #123813.

[^1]: It's unfortunate in the sense of "hard cases make bad law", though, by that citation, I mean only that the hardness of the example makes this discussion more challenging, not that the resulting policy is bad.

RalfJung commented 3 months ago

Right, so for this particular lint you actually want (some of?) the new cases to be allow by default. That is unfortunate but I understand why.

However that makes it an even less suited precedent/argument for this policy, so this just adds to the arguments I gave above.

nikomatsakis commented 3 months ago

I had a hard time understanding the proposed policy. I appreciate that making hard-and-fast rules are hard but I think we should give a few more examples at least of times when a new vs expanded lint would be preferred.

I also think we should be annotating lints with the compiler version in which they were introduced (or massively expanded) and letting toolchain authors say "error for all cases that are older than the last few revisions", but that's a separate feature request.

tmandry commented 2 months ago

If you don't consider unnecessary imports bugs, then there should be no problem with allowing that lint in the changeset that bumps the compiler, and then later getting the lint to warn can follow whatever process you usually use for that (and it doesn't have to happen for the entire repo at once).

This would have meant allowing an existing very-high-precedence lint that people actually use while updating the compiler. Even if the existing lint didn't catch bugs, it was still useful because it's a standard part of people's workflow. Disabling it would mean degrading the existing user experience, and then having a lot of manual cleanup to do afterward – because there is no MachineApplicable fix.

I have also heard the argument made that unused imports are different from redundant imports in that the first can plausibly catch bugs, while the latter is always just a cleanup. But I'm not sure how much weight I would put on it.

You're not speaking about the entire ecosystem here, but about large monorepos specifically. For the ecosystem at large I don't think this was painful -- it was a nice cleanup. That's based on my extremely limited view, so please correct me if I am wrong -- but the arguments given so far did not apply to the ecosystem at large, but just for the specific case of big companies' internal monorepos.

Two things can be true at the same time: It can be both disruptive and a nice cleanup! But something that touches a high percentage of crates should be rolled out carefully.

I see what you're saying about the "ecosystem" phrasing – the phrasing was not actually my suggestion but @joshtriplett's, and it seems a bit disingenuous given that all my examples are from large monorepos – but I did feel it was warranted given the number of backlinks on #117772. That may be partly due to the issue mentioned by @traviscross.

tmandry commented 2 months ago

I had a hard time understanding the proposed policy. I appreciate that making hard-and-fast rules are hard but I think we should give a few more examples at least of times when a new vs expanded lint would be preferred.

You know, I actually had such a hard-and-fast rule as part of the initial proposal. Reading back through the revised guidelines, I think it works as motivation/rationale for a broader policy, but in the interest of making something clear and easy to follow I think I would like to go back to the simple rule that was here before (or something close to it).

When an existing lint is expanded to include many new cases, we must provide either:

  1. A new lint name under the existing group, so that users may opt out of the expansion at least temporarily, or
  2. A MachineApplicable fix for the lint. This fix must produce code that compiles without new lints or errors under the most recent stable compiler.

Exceptions to this policy may be made via Language Team FCP.

RalfJung commented 2 months ago

This would have meant allowing an existing very-high-precedence lint that people actually use while updating the compiler. Even if the existing lint didn't catch bugs, it was still useful because it's a standard part of people's workflow. Disabling it would mean degrading the existing user experience, and then having a lot of manual cleanup to do afterward – because there is no MachineApplicable fix.

To me this all sounds somewhat contradicting, you basically want your cake and eat it, too (based on your repo's particular preferences for which lints should be enabled when), while rustc should then bear the complexity required to achieve this.

I have also heard the argument made that unused imports are different from redundant imports in that the first can plausibly catch bugs, while the latter is always just a cleanup. But I'm not sure how much weight I would put on it.

The lint in question seems to be a very bad example for motivating such a policy, given that it was later determined that many of the new cases should be allow anyway. I think this example should be discarded from the proposal for that reason. Is there any other lint where you have similar concerns, or is this now entirely hypothetical?

nikomatsakis commented 2 months ago

I disagree Ralf that this problem is specific to Google's requirements. I can definitely say that within Amazon existing lint policy is a serious pain point and it is something that the Rust org has been hearing from users for a long time.

I still want us to keep aggressively improving and updating lints but giving users tools to manage those updates is part of how we are able to do that without unduly disrupting users' workflows.

That said, I am not entirely convinced that the proposed policy is the way to do it. In my ideal world we would make "lint management " more of a first class feature and have an RFC, if for no other reason then I think this will be important to a number of users and merits a clearer workflow.

But I don't want to let perfect be enemy of the good either, and I know that's a much higher bar.

I am mildly worried about a proliferation of lints -- lint names and lint groups are permanent.

On Wed, Jun 5, 2024, at 2:05 AM, Ralf Jung wrote:

This would have meant allowing an existing very-high-precedence lint that people actually use while updating the compiler. Even if the existing lint didn't catch bugs, it was still useful because it's a standard part of people's workflow. Disabling it would mean degrading the existing user experience, and then having a lot of manual cleanup to do afterward – because there is no MachineApplicable fix.

To me this all sounds somewhat contradicting, you basically want your cake and eat it, too (based on your repo's particular preferences for which lints should be enabled when), while rustc should then bear the complexity required to achieve this.

I have also heard the argument made that unused imports are different from redundant imports in that the first can plausibly catch bugs, while the latter is always just a cleanup. But I'm not sure how much weight I would put on it.

The lint in question seems to be a very bad example for motivating such a policy, given that it was later determined that many of the new cases should be allow anyway. I think this example should be discarded from the proposal for that reason. Is there any other lint where you have similar concerns, or is this now entirely hypothetical?

— Reply to this email directly, view it on GitHub https://github.com/rust-lang/rust/issues/122759#issuecomment-2148945086, or unsubscribe https://github.com/notifications/unsubscribe-auth/AABF4ZVLS3YTYDHRO5L6UNDZF2TDXAVCNFSM6AAAAABE6UQXROVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDCNBYHE2DKMBYGY. You are receiving this because you were mentioned.Message ID: @.***>