Open obi1kenobi opened 1 month ago
Wouldn't this not be the case for repr(Rust)
unions since the compiler makes no guarantees about how fields overlap? I think the inference can only apply to repr(C)
unions.
I've been thinking about this too. I think it might be possible to trigger this in repr(Rust)
if there's no wiggle room for the fields' placement, for example:
pub union Example {
pub x: i64,
pub y: u64,
+ pub z: [bool; 8],
}
I believe that this addition is not sensitive to repr(Rust)
layout because the overlap between all fields is total. I believe this is still breaking. What do you think?
I don't know what the semantics or guarantees (if any) might be for a union of unevenly-sized types, like:
pub union Uneven {
pub first: i64,
pub second: bool,
}
If anyone has any idea where the bool
is supposed to end up in the union under repr(Rust)
, I'd love to know. My current impression is that it might not be well-defined without repr(C)
here, but I'm not confident about it.
With -Zrandomize-layout
I did expect Example
to have x or y start at a non-zero offset even when z doesn't exist.
Interesting! Is encountering a repr(Rust)
union an immediate code smell then? Should there be a clippy lint for that?
I'm happy to make our lints also require that the union be repr(C)
before they fire, but I'm not sure if that truly solves the problem you both have highlighted here.
repr(Rust)
unions are still useful, but have a more restricted set of sound operations. For example, it's still perfectly sound to use external data to determine which field is set. So a union might check some globally-configured value to determine which of its fields should be accessed or modified. Maybe you want to choose between storing a value inline versus boxed.
The main thing that you can't do with repr(Rust)
unions is to write one field and read another. That can only be sound if your union has a well-defined layout so you know the exact semantics of the transmute.
Nice, that addresses my questions. I updated the lint descriptions above to only fire on repr(C)
unions.
Thank you both for walking me through this, I appreciate it!
Removing a
pub
field from aunion
type is clearly a major breaking change. But how about adding a new field? I chatted with @joshlf about this, and the current plan is like this:Any public API [^1] field of a public API union may be read or written by downstream callers. There are two risks here:
unsafe
to read a field of that type. They would justify that this is sound because no matter what field was previously written, it's bit-compatible. Adding a new field that is not bit-compatible breaks that logic, regardless of whether the field ispub
or not.cargo-semver-checks
, so we'd trigger the lint for any field additions. If/when bit-pattern data is available, we can make the lint more fine-grained.Two lints, then:
repr(C)
union with no non-public-API fields adds new field, regardless of its visibility (major, deny by default)repr(C)
union with at least one non-public-API field adds new field, regardless of its visibility (major, warn by default)Notes to folks looking to contribute these lints:
pub
field" not "public API field", a subtle distinction).[^1]: "Public API" in
cargo-semver-checks
is defined as:pub and (not #[doc(hidden)] or #[deprecated])
. In other words, deprecated pub items are public API regardless of#[doc(hidden)]
, and#[doc(hidden)]
pub items are not public API unless deprecated. More info here: https://predr.ag/blog/checking-semver-for-doc-hidden-items/