obi1kenobi / cargo-semver-checks

Scan your Rust crate for semver violations.
Apache License 2.0
1.19k stars 75 forks source link

New lints: Adding new fields to `union` types #950

Open obi1kenobi opened 1 month ago

obi1kenobi commented 1 month ago

Removing a pub field from a union type is clearly a major breaking change. But how about adding a new field? I chatted with @joshlf about this, and the current plan is like this:

Any public API [^1] field of a public API union may be read or written by downstream callers. There are two risks here:

Two lints, then:

Notes to folks looking to contribute these lints:

[^1]: "Public API" in cargo-semver-checks is defined as: pub and (not #[doc(hidden)] or #[deprecated]). In other words, deprecated pub items are public API regardless of #[doc(hidden)], and #[doc(hidden)] pub items are not public API unless deprecated. More info here: https://predr.ag/blog/checking-semver-for-doc-hidden-items/

djkoloski commented 1 month ago

Wouldn't this not be the case for repr(Rust) unions since the compiler makes no guarantees about how fields overlap? I think the inference can only apply to repr(C) unions.

obi1kenobi commented 1 month ago

I've been thinking about this too. I think it might be possible to trigger this in repr(Rust) if there's no wiggle room for the fields' placement, for example:

  pub union Example {
      pub x: i64,
      pub y: u64,
+     pub z: [bool; 8],
  }

I believe that this addition is not sensitive to repr(Rust) layout because the overlap between all fields is total. I believe this is still breaking. What do you think?

I don't know what the semantics or guarantees (if any) might be for a union of unevenly-sized types, like:

pub union Uneven {
    pub first: i64,
    pub second: bool,
}

If anyone has any idea where the bool is supposed to end up in the union under repr(Rust), I'd love to know. My current impression is that it might not be well-defined without repr(C) here, but I'm not confident about it.

bjorn3 commented 1 month ago

With -Zrandomize-layout I did expect Example to have x or y start at a non-zero offset even when z doesn't exist.

obi1kenobi commented 1 month ago

Interesting! Is encountering a repr(Rust) union an immediate code smell then? Should there be a clippy lint for that?

I'm happy to make our lints also require that the union be repr(C) before they fire, but I'm not sure if that truly solves the problem you both have highlighted here.

djkoloski commented 1 month ago

repr(Rust) unions are still useful, but have a more restricted set of sound operations. For example, it's still perfectly sound to use external data to determine which field is set. So a union might check some globally-configured value to determine which of its fields should be accessed or modified. Maybe you want to choose between storing a value inline versus boxed.

The main thing that you can't do with repr(Rust) unions is to write one field and read another. That can only be sound if your union has a well-defined layout so you know the exact semantics of the transmute.

obi1kenobi commented 1 month ago

Nice, that addresses my questions. I updated the lint descriptions above to only fire on repr(C) unions.

Thank you both for walking me through this, I appreciate it!