obi1kenobi / cargo-semver-checks

Scan your Rust crate for semver violations.
Apache License 2.0
1.14k stars 74 forks source link

New lints: Adding new fields to `union` types #950

Open obi1kenobi opened 2 days ago

obi1kenobi commented 2 days ago

Removing a pub field from a union type is clearly a major breaking change. But how about adding a new field? I chatted with @joshlf about this, and the current plan is like this:

Any public API [^1] field of a public API union may be read or written by downstream callers. There are two risks here:

Two lints, then:

Notes to folks looking to contribute these lints:

[^1]: "Public API" in cargo-semver-checks is defined as: pub and (not #[doc(hidden)] or #[deprecated]). In other words, deprecated pub items are public API regardless of #[doc(hidden)], and #[doc(hidden)] pub items are not public API unless deprecated. More info here: https://predr.ag/blog/checking-semver-for-doc-hidden-items/

djkoloski commented 2 days ago

Wouldn't this not be the case for repr(Rust) unions since the compiler makes no guarantees about how fields overlap? I think the inference can only apply to repr(C) unions.

obi1kenobi commented 2 days ago

I've been thinking about this too. I think it might be possible to trigger this in repr(Rust) if there's no wiggle room for the fields' placement, for example:

  pub union Example {
      pub x: i64,
      pub y: u64,
+     pub z: [bool; 8],
  }

I believe that this addition is not sensitive to repr(Rust) layout because the overlap between all fields is total. I believe this is still breaking. What do you think?

I don't know what the semantics or guarantees (if any) might be for a union of unevenly-sized types, like:

pub union Uneven {
    pub first: i64,
    pub second: bool,
}

If anyone has any idea where the bool is supposed to end up in the union under repr(Rust), I'd love to know. My current impression is that it might not be well-defined without repr(C) here, but I'm not confident about it.

bjorn3 commented 2 days ago

With -Zrandomize-layout I did expect Example to have x or y start at a non-zero offset even when z doesn't exist.

obi1kenobi commented 1 day ago

Interesting! Is encountering a repr(Rust) union an immediate code smell then? Should there be a clippy lint for that?

I'm happy to make our lints also require that the union be repr(C) before they fire, but I'm not sure if that truly solves the problem you both have highlighted here.

djkoloski commented 1 day ago

repr(Rust) unions are still useful, but have a more restricted set of sound operations. For example, it's still perfectly sound to use external data to determine which field is set. So a union might check some globally-configured value to determine which of its fields should be accessed or modified. Maybe you want to choose between storing a value inline versus boxed.

The main thing that you can't do with repr(Rust) unions is to write one field and read another. That can only be sound if your union has a well-defined layout so you know the exact semantics of the transmute.

obi1kenobi commented 1 day ago

Nice, that addresses my questions. I updated the lint descriptions above to only fire on repr(C) unions.

Thank you both for walking me through this, I appreciate it!