obi1kenobi / cargo-semver-checks

Scan your Rust crate for semver violations.
Apache License 2.0
1.19k stars 75 forks source link

Tracking issue: Additional checks, both semver and non-semver #5

Open obi1kenobi opened 2 years ago

obi1kenobi commented 2 years ago

This is a list of all not-yet-implemented checks that would be useful to have. Some of these require new schema and adapter implementations as well, tracked in #241.

In addition to checking for semver violations, there are certain changes that are not breaking and don't even require a minor version, but can still be frustrating in downstream crates without a minor or major version bump. Crates should be able to opt into such warnings on an individual basis.

For example, based on this poll (with small sample size: ~40 respondents), ~40% of users expect that upgrading to a new patch version of a crate should not generate new lints or compiler warnings. The split between expecting a new minor version and a new major version was approximately 3-to-1.

Major version required

Minor version recommended

Project-defined whether major / minor / patch version required

For example, because they are technically breaking but projects may often treat them as non-major.

General opt-in warnings

Opt-in warnings for difficult-to-reverse changes

More checks to triage here

CAD97 commented 2 years ago

Own public types should be Debug:

That's already available in the compiler as #[warn(missing_debug_implementations)], isn't it?

obi1kenobi commented 2 years ago

That's already available in the compiler as #[warn(missing_debug_implementations)], isn't it?

Oh, neat, TIL. It appears to be allowed by default and has to be enforced by manually enabling the check. In that case, perhaps the wish-listed query should be checking that #![deny(missing_debug_implementations)] is set instead.

epage commented 2 years ago

This also gets into a conversation that I think we only had over zulip so good to summarize here.

Especially if we want this in cargo some day, I think we should clearly define the scope.

cargo clippy is meant for linting an API as it exists

cargo semver-checks would be meant for linting changes in an API

Misc notes

obi1kenobi commented 2 years ago

One possible way forward would be something like:

That way, we could easily experiment with querying for more things without bloating the scope of cargo-semver-checks and without making the integration into cargo messy.

I think extracting the data model into a library crate is pretty straightforward and I would be happy to do it if that's what we decide is the best path forward.

aDotInTheVoid commented 2 years ago
obi1kenobi commented 2 years ago

Thanks, added to the list! If you'd like to try your hand at it, this lint is probably easier than pub fn changed return type since the actual check is less complex, and I'd be happy to mentor.

oskgo commented 2 years ago

I think "trait added method" might be a bit more complicated.

The way I see it adding default methods is fine, and even adding non-default methods is fine, so long as the trait cannot be implemented by an external user. This can be the case for example when using a private super trait or blanket impls. Especially sealed traits are a common pattern in Rust.

epage commented 2 years ago

even adding non-default methods is fine,

I believe trait added methods are a minor compatibility break. The standard library team is running into this problem with moving functions from the extension trait in itertools to Iterator which is causing them to write a new feature to support this due to the pervasiveness of itertools.

obi1kenobi commented 2 years ago

Non-defaulted items of any kind in a trait that is implementable outside its crate are semver-major, because any implementers must add the new items: https://doc.rust-lang.org/cargo/reference/semver.html#trait-new-item-no-default

Defaulted items in a trait are trickier. They are definitely at least minor, but could be major as well; some such circumstances are described in the semver reference which shows this as a "possibly-breaking" change: https://doc.rust-lang.org/cargo/reference/semver.html#trait-new-default-item

I believe trait added methods are a minor compatibility break. The standard library team is running into this problem with moving functions from the extension trait in itertools to Iterator which is causing them to write a new feature to support this due to the pervasiveness of itertools.

I believe this might be due to the introduced ambiguity between the built-in Iterator trait and its itertools analogue, which is captured in the breaking example of the possibly-breaking entry I linked above.

jplatte commented 2 years ago

it's possible to go from e.g. taking &str to taking S: Into<String> without breaking

This is not true, changing an argument type from a concrete type to a generic will break calls like the_function(foo.into()), which only works for non-generic functions because the parameter type guides type inference. There are cases where changing a parameter types as well as a return type is non-breaking though:

CAD97 commented 2 years ago

I want to note that while Iterator/Itertools is a good example of the issue, it's a symptom of the wider semver-minor upgrade hazard of adding any new items.

This happens because of how name lookup works in Rust, since Rust allows arbitrary namespace mixins.

In other words, in a pedantic mode, semver-checks'd be justified on requiring minor for any new public item. Even weakening generic requirements might cause inference issues, so e.g. --strict-pedantic should probably require a minor bump for any change to the public API's types; IIUC this matches the intent of semver-minor's "new feature" trigger as well, since the API by construction has new API surface.

In practice, API evolution in this manner is necessarily considered perfectly acceptable, and is imho very rarely worth warning for. It's a subjective evaluation of how likely both that a name conflict is possible and that some downstream would have both names in scope simultaneously; in most cases this is reasonably rare because of the convention to avoid glob imports^[If you want a version of the lint which can fire without firing on every API change, consider linting only for new trait associated items reachable through a module called prelude, since that's likely designed for glob importing.], and there's not really a good analytical way to determine the risk of a non-globbed name conflict to provide a lint cutoff better than yes/no.

Iterator is an especially interesting case because it's a language item trait in the prelude. User types don't have this exacerbating factor (being implicitly available everywhere) for this concern.

CAD97 commented 2 years ago

Adding trait bounds to an existential return type, e.g. -> impl Foo to -> impl Foo + Send

Note that RPIT already "leaks" autotraits (Send/Sync/Unpin), so that isn't actually a return type refinement.

Actually refining the return type is not-inference-breaking, though you still run the risk of being name-resolution-breaking (e.g. refining to a concrete type or even adding a new guaranteed trait could cause a name conflict with newly applicable extension traits).

epage commented 2 years ago

In practice, API evolution in this manner is necessarily considered perfectly acceptable, and is imho very rarely worth warning for

The fact that there is a lot of nuance to semver and some parts that are contextual is why I feel like https://github.com/obi1kenobi/cargo-semver-check/issues/58 is going to be important.

jhpratt commented 1 year ago

Here's one not listed: adding a generic (type or const) to a function is a breaking change.

obi1kenobi commented 1 year ago

That one is very interesting, and I'm not exactly sure what to make of it.

It's definitely breaking, no doubts there. But as I've written before, some Rust breaking changes don't require a major version ā€” the API evolution RFC says so: https://rust-lang.github.io/rfcs/1105-api-evolution.html

Is adding a generic to an already-generic function covered under that exception?

Is adding a generic to a function that previously wasn't generic covered?

Would love to get your thoughts @jhpratt! These kinds of existential questions are things we run into a lot here šŸ˜…

epage commented 1 year ago

Yes, adding a new generic to a function is a major breaking change. It will break in any case where the type parameters are explicitly provided, like if inference didn't work out. This is true for other API items as well with one exception: if a generic is added to a type or trait but is defaulted, then its a minor breaking change. The explicit type parameters are unchanged but there are cases where inference won't work and it will fail to compile and the API Evolution RFC decided to brush that under the rug and ignore it (I've been bitten by it...)

Also, just because the RFC says something is a minor semver breakage, that doesn't mean the user shouldn't be told as that RFC was written specifically for the stdlib and not as guidance for the ecosystem and even the stdlib sometimes goes to extra lengths to avoid minor breaking changes if the impact is large enough. From what I've heard, they are designing a whole new language feature to allow migrating trait methods from itertools to Iterator without breaking people. Granted, #58 will be important for controlling this.

thomaseizinger commented 1 year ago

type no longer implements pub trait

How hard is this one to implement? I assume this covers things like removing the From impl from an error type? I'd like to try myself on that one.

obi1kenobi commented 1 year ago

type no longer implements pub trait

How hard is this one to implement? I assume this covers things like removing the From impl from an error type? I'd like to try myself on that one.

It does cover removing From from a type. Unfortunately, From is generic, and queries over generics are blocked on #241 as the hardest-to-design bit of schema in that issue. I wouldn't recommend it as a starting point, since it's likely to turn into yak shaving.

A better first issue would be something like #368 where the design is reasonably clear already.

In the meantime, I'm going to migrate the adapter implementation from Trustfall v0.2 to Trustfall v0.3 and take advantage of the massively improved ergonomics therein. If you'd like, I can loop you into that as well!

thomaseizinger commented 1 year ago

Leaking or re-exporting another crate's type in one's own API

* for example, having a function that returns a value of another crate's API

* this can cause coupling to the other crate's version, and can be a pain

* there are legitimate reasons to do this sometimes, but it should be an intentional decision and probably worth flagging in review

Two thoughts regarding this:

  1. It would be great if we could somehow specify the intention that a certain item is meant to be hidden from the public API. For example, it is very easy and common to leak a dependency via a From impl for that dependencies Error type.
  2. A reasonable default for the above could be to lint against all dependencies that are < 1.0 and appear in the public API. For a crate that is itself < 1.0, this could be allowed by default but as soon as you bump to 1.0, it should be a warn. If a crate wants to stabilise their public API, they can then opt-in to that lint ahead of time.

See https://rust-lang.github.io/api-guidelines/necessities.html#public-dependencies-of-a-stable-crate-are-stable-c-stable.

epage commented 1 year ago

It would be great if we could somehow specify the intention that a certain item is meant to be hidden from the public API. For example, it is very easy and common to leak a dependency via a From impl for that dependencies Error type.

Except there is no way to convey this intention to your users. If you implement a public trait on a public type, then that is a compatibility boundary. I avoid From for error types for this very reason.

thomaseizinger commented 1 year ago

It would be great if we could somehow specify the intention that a certain item is meant to be hidden from the public API. For example, it is very easy and common to leak a dependency via a From impl for that dependencies Error type.

Except there is no way to convey this intention to your users. If you implement a public trait on a public type, then that is a compatibility boundary. I avoid From for error types for this very reason.

What I meant was, I want to specify to cargo semver-checks that I want crate XYZ not in my public API. If I make a mistake and still include it, it should generate a warning.

epage commented 1 year ago

For that, long term we want https://rust-lang.github.io/rfcs/1977-public-private-dependencies.html

Nemo157 commented 11 months ago

Would it be useful to have a list of known undetected breakages to test against too? https://github.com/RustCrypto/elliptic-curves/issues/984 isn't detected currently and doesn't appear to match any of the checks in the list, it's something like "trait associated type added new required bound".

obi1kenobi commented 11 months ago

Thanks! I updated the list to add that check together with the analogous one for removing bounds from an associated type.

Would it be useful to have a list of known undetected breakages to test against too?

Could you say more about this? I'm curious what form this list would take, and how it would be related to / different from the list in this issue.

Nemo157 commented 11 months ago

Rather than being a list of checks, just a list of version pairs that have seen known ecosystem breakage, but pass all current checks. Maybe even something that can run in CI automatically to see if they start being detected.

obi1kenobi commented 11 months ago

Sorry, I'm still having a bit of trouble understanding the exact suggestion, and who the target audience is / how they benefit.

Would this list of version pairs be something posted in this issue, or part of cargo-semver-checks itself in some way?

When you say "run in CI automatically," is that referring to cargo-semver-checks' own CI, or in the CI of users of cargo-semver-checks?

Sorry I'm having a hard time following here. If it's easier to "show, not tell" I'd be happy to look at a PR too.

Skgland commented 11 months ago

Based on https://github.com/rust-lang/rfcs/pull/3535#discussion_r1429134787 which I reproduced in https://github.com/Skgland/rust-semver-break.

It is currently possible in some cases to match non-exhaustive structs exhaustively, resulting in a breaking change if such a struct is change to have more states (i.e. by adding a field with more than one value).

This is the case if the struct is StructuralPartialEq (constants of the type can be used as a pattern in match) and all possible values of the struct have an accessible constant.

This appears to be missing from this list, though I dought that it is feasible to detect.

obi1kenobi commented 11 months ago

Wow, that's quite the semver hazard! Thanks for pointing it out.

My preference, as I mentioned in the linked issue, would be to either error or lint on this inside rustc or clippy, since a #[non_exhaustive] type having exhaustive semantics seems to me like an accidental language or compiler bug.

If that doesn't pan out, we can look into our options here and see if we can check for StructuralPartialEq in some way.

parasyte commented 5 months ago

"Auto trait impls for impl Trait in return type" came up recently on URLO: Implicit Unpin on impl Any, breaking change possible

obi1kenobi commented 3 months ago

I have figured out how to properly check if a trait is sealed or not, and I've opened #870 with a list of related lints that are now ready to be implemented!

This took 9 months to get right, and I'm excited it's finally there! šŸ™Œ