Deterministic (but undefined) layout

rust-lang / unsafe-code-guidelines

Forum for discussion about what unsafe code can and can't do

https://rust-lang.github.io/unsafe-code-guidelines

Apache License 2.0

666 stars 58 forks source link

Deterministic (but undefined) layout #35

Open nikomatsakis opened 6 years ago

nikomatsakis commented 6 years ago

From #31: Can we say that layout is some deterministic function of a certain, fixed set of inputs? This would allow you to be sure that if you do not alter those inputs, your struct layout would not change, even if it meant that you can't predict precisely what it will be. For example, we might say that struct layout is a function of the struct's generic types and its substitutions, full stop -- this would imply that any two structs with the same definition are laid out the same. This might interfere with our ability to do profile-guided layout or to analyze how a struct is used and optimize based on that. (Some would call that a feature.)

Also, this presumably applies to enums as well as other types.

nikomatsakis commented 6 years ago

Key comments from #11:

@asajeffrey writes:

Can we say something like we guarantee that the layout for a struct is only dependent on certain properties of its fields? Off the top of my head those are size, alignment and non-zero-ness, but there may be others. This would avoid having to propose an RFC with a language extension.

@RalfJung writes:

So, after all this talk about layout guarantees for structs... could we do a little practical exercise and see if rust-lang/rust#54922 if within bounds of the guarantees, or exceeding them? This does some manual layout computation that is supposed to recompute the layout of a repr(Rust) struct with an unsized tail.

@Gankro writes:

Quickly chiming in with a +1 for restricting repr(rust) to only reordering and still mandating natural/greedy c-style padding. This would make (align=1,size=0) types not affect layout, newtypes not affect layout, alignment predictable, and the size of heterogeneous types predictable. At the same time it would still allow us to do the only really relevant optimization of ordering the fields optimally. The fact that a theoretical field fuzzer wouldn't be allowed to be as powerful as it could be is not especially concerning to me.

@gnzlbg writes (in response to @asajeffrey):

Can we say something like we guarantee that the layout for a struct is only dependent on certain properties of its fields? Off the top of my head those are size, alignment and non-zero-ness,

I don't know. For example, consider the repr(Rust) struct F { x: f32, y: f32, z: f32, w: f32 }, under the assumption that f32 has an alignment of 4 in the platform being targeted - the maximum alignment of a field of F is then 4. Would it be ok for the compiler to increase the alignment of F to say, 16, to facilitate SIMD operations?

We have been talking about guaranteeing the layout of homogenous tuples / structs, but I also imagine that this could happen somewhere else, e.g., struct G { a: u64, x: f32, y: f32, z: f32, w: f32, b: u64 } where {xyzw} get re-ordered at the front and the alignment increased to 16 to facilitate SIMD operations.

nikomatsakis commented 6 years ago

Sorry if I missed any.

the8472 commented 6 years ago

deterministic layout would disallow intentional randomization for fuzzing or defense purposes
as written it would have to remain stable between compiler versions, e.g. to allow serialization or punning between libs compiled with different versions, which would prevent future changes. In essence the current implementation would become the implicit but ostensibly undefined standard
this could have spooky action at a distance where upgrading a dependency which is a member of a struct suddenly causes the struct itself to change layout

So I would like to pose the counter-question: Why should this be guaranteed?

ChrisJefferson commented 6 years ago

I am +1 with @the8472 . My experience of C is that these kinds of limitations get annoying as time goes by, because newer ideas and compilers are limited by the old restrictions.

I'd actually be in favour of the compiler gaining randomisation, because (in my experience) even if something is technically allowed to vary, if it doesn't tend to change then people start to expect it being fixed.

Also randomisation seems (to me) a possibly cheap and effective anti-hacking technique -- if I can compile a program with a randomised order for every struct, it will be much harder for someone to make a repeatable hack, even if they find a suitable bug, as they won't know the layout of anything. Of course it isn't perfect by any means, but it's another layer of defense and I would want to see a good reason to throw it away.

If someone wants a well-defined fixed ABI, they can always use repr(C).

rodrimati1992 commented 6 years ago

Deterministic layout seems appealing,especially if it is determined by the fields,not the generic parameters,

Even if layout is not deterministic,there should be ways to recover back some determinism through attributes.

One example attribute would be '#[phantomtype(C)]' which would guarantee that the generic parameters it lists cannot affect the layout of the type,only allowing PhantomData/types with the same attribute to mention those generic parameters.

#[phantomtype(C)]
struct WithMetadata<T,C>{
    value0:T,
    value1:Vec<u8>,
    _metadata:PhantomWrapper<C>,
}

#[phantomtype(C)]
struct PhantomWrapper<C>(PhantomData<fn()->C>)

The phantomtype attribute would allow these transmutes to be safe (always keeping the same T) :

WithMetadata<u32,()>> to WithMetadata<u32,Initialized>
WithMetadata<Rectangle<u64>,Uninitialized>> to WithMetadata<Rectangle<u64>,Initialized>
Arc<WithMetadata<String,NoneMetadata>> to Arc<WithMetadata<String,SomeMetadata>>
Box<WithMetadata<Arc<LargeStruct>,U10>> to Box<WithMetadata<Arc<LargeStruct>,U20>>

The phantomtype attribute would not guarantee these having the same memory layout:

WithMetadata<u32,()>> compared to WithMetadata<i32,()
WithMetadata<(),()>> compared to WithMetadata<PhantomData<fn()->Vec<u32>>,()>
WithMetadata<Vec<usize>,()>> compared to WithMetadata<Vec<isize>,()>

the8472 commented 6 years ago

That's quite a special case that does not generalize well to the myriad of heterogeneous structs with multiple non-ZST fields. It would fall under #34 already.

rodrimati1992 commented 6 years ago

One example where it doesn't apply? This is the very specific problem of wanting to guarantee stable layout for types with phantom generic parameters(when the phantom type parameter changes).

the8472 commented 6 years ago

Well, it would still be determined by some of its generic parameters, T in that case, obviously each concrete T can result in different layouts even though it's always the same generic fields being present.

Gankra commented 6 years ago

I stand by my exact request; I'm fine with field re-ordering being the exact thing allowed.

the8472 commented 6 years ago

@Gankro couldn't that prevent some optimizations, e.g. reducing alignment (and thus size) when there are no references are taken on a private member?

strega-nil commented 6 years ago

@the8472 that's legal under as-if

RalfJung commented 6 years ago

@Gankro We are discussing here whether we can say that layout is deterministic (with some list of inputs that may affect layout, and nothing else may). You are talking about constraining what layout may do, irrespective of its inputs. That's an entirely orthogonal discussion.

@ubsan "as-if"?

@rkruppe also writes in the PR:

I am not sure this wording reserves the freedoms it want to reserve, and even if someone argued it does I would like it to be clarified. Since we did not get consensus to rule out profile-guided layout, the program source code is not all that informs layout. Not even if that includes build scripts, input data files, etc. that are used during the profiling run, since the program may not be deterministic w.r.t. these (e.g. it might have race conditions or depend on the system time or ...). So I don't think we can guarantee anything involving the words "deterministic function" and have to stick to something like "every time you invoke the compiler you may get a completely different layout".

Even with PGO, we should be able to guarantee that given the same PGO profile data, we compute the same layout. I do not know how PGO works, but I assume there's a "collection" phase spitting out some data into a file and then that file is used by the next compilation? So that file would be part of the input to the deterministic function.

the8472 commented 6 years ago

@RalfJung even with the same input couldn't slight changes to the program plus optimization fuel change which structs get optimized and which don't?

alercah commented 6 years ago

I'm a big fan of randomized layout personally. I'd much more happily assert that layouts can be randomized rather than try to offer any guarantees about determinism besides those that we view are required for interoperability. On Sat, 13 Oct 2018 at 08:04 the8472 notifications@github.com wrote:

@RalfJung https://github.com/RalfJung even with the same input couldn't slight changes to the program plus optimization fuel change which structs get optimized and which don't?

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/rust-rfcs/unsafe-code-guidelines/issues/35#issuecomment-429536463, or mute the thread https://github.com/notifications/unsubscribe-auth/AT4HVShyH7idIV5u8KaR7A2yKuVTT1rfks5ukdbHgaJpZM4XYCOx .

gnzlbg commented 6 years ago

@rkruppe

Not even if that includes build scripts, input data files, etc. that are used during the profiling run, since the program may not be deterministic w.r.t. these (e.g. it might have race conditions or depend on the system time or ...).

We might not want to think of the source code "written by the user" as part of the input to this deterministic function, but only think about the Rust that's left after macro expansion for this. If build scripts, proc macros, etc. have non-determinism, then that just produces a different "program".

@alercah

I'm a big fan of randomized layout personally. I'd much more happily assert that layouts can be randomized rather than try to offer any guarantees about determinism besides those that we view are required for interoperability.

It doesn't have to be either or. I sometimes, during debugging, want fully deterministic builds to reliably reproduce an issue, and other times, also during debugging, want fully randomized layouts to see if one bug is caused by some layout assumption. This screams "implementation-defined", but that does not mean that the spec cannot say anything more about it. The spec could say that: "layout is implementation defined but must be defined in one of the following ways: ..." leaving it to the implementation to pick one (e.g. depending on build flags), but not allowing implementations to invent new ones.

In any case, it might be interesting to come up with "definitions" for the relevant cases that we might want to have, and then reconsider what to do here. If we decide to make it defined to one of these, or implementation defined, then we just pick one or some of the definitions, and if we leave it as unspecified, then we drop them.

the8472 commented 6 years ago

I guess @RalfJung is aiming for reproducible builds, correct? Maybe the solution is to make struct layout conditionally deterministic.

Preconditions: no optimization fuel limit, no randomization, ??? Inputs: unordered sizes and alignments of struct members, layout-relevant attributes, PGO profile, compiler version, target specification, optimization flags, source (whole program? compilation unit?)

Some Inputs can also be shifted to preconditions. I.e. we can simply say it's only deterministic if certain optimization flags are disabled.

RalfJung commented 6 years ago

Builds should be reproducible, yes (not sure if that is a sufficient condition though). I got nothing against a flag to randomize stuff for debugging, but we shouldn't do that in normal release builds.

even with the same input couldn't slight changes to the program plus optimization fuel change which structs get optimized and which don't?

So make fuel part of the input.

Fuel changes the game in annoying ways anyway, similar to PGO actually, in taht adding new structs can change the layout of the existing ones. So if our definition is ready for PGO then fuel shouldn't cause extra problems, and if we decide we don't want to support PGO then our rule should say it only holds when fuel is not used. Both seem like reasonable options to me.

the8472 commented 6 years ago

Another option for determinism is hashing the fully qualified name of the struct and using that as input for layout. That way a struct could retain a its layout between builds but two different structs with the same members would end up with different layouts, thus discouraging punning between repr(Rust) type without additional annotations.

Gankra commented 6 years ago

@RalfJung no I'm saying I am fine with it being non-deterministic as long as the non-determinism is constrained to field-reordering, with an otherwise simple and deterministic (same as repr(C)) alignment/padding algorithm.

ChrisJefferson commented 6 years ago

Something which is worth separating is the behaviour of the compiler under default / "sensible" settings vs what is EVER allowed.

I can understand someone not wanting field randomisation in their build, so they can tune ordering and achieve greater performance.

However, do we want to ban field randomisation from ever being valid in a Rust implementation? Banning it from ever being valid feels like it would need an extremely good reason.

gnzlbg commented 6 years ago

Something which is worth separating is the behaviour of the compiler under default / "sensible" settings vs what is EVER allowed.

The goal is establishing what guarantees can unsafe code rely on - these guarantees must always hold; compiler settings cannot break these guarantee.

There does not seem to be a consensus about whether we want to allow Rust implementations to randomize the field order of repr(Rust) structs or not. If we want to support this use case, then do we want the compiler to be able to fully randomize things? (e.g. increasing struct size, alignment, etc?) Or do we want to add some constraints to the amount of randomization that we want to allow (e.g. a variation of @Gankro 's comment could be that "alignment and size shall not be larger than if the struct were repr(C)").

ChrisJefferson commented 6 years ago

With regards alignment, when using SSE instructions I often use the macro _MM_ALIGN16, which forces a struct to be 16-byte aligned, so it's contents can be efficiently loaded for use with SSE instructions. It would be nice if the compiler decided it wanted to use SSE on a struct it could add this by itself.

Restricting the size to be no bigger than repr(C) seems like a really tight restriction, as if a struct is already efficiently packed that strongly limits the options for re-ordering the members.

the8472 commented 6 years ago

@gnzlbg we have said a lot about why the compiler might want to change layout and why a dev might want stable layout for debuggability. But very little has been said why unsafe code would need some specific guarantee.

@rodrimati1992 brought one example, punning between the same generic type with different phantom generic arguments. Without an annotation signalling that intent this would be ok. Without such an annotation it would prevent the compiler from making different optimization choices for different uses of monomorphized types.

What other uses do we have? @Gankro never stated how unsafe code would benefit from his requests for example.

nikomatsakis commented 6 years ago

An observation. If we did the following:

For example, we might say that struct layout is a function of the struct's generic types and its substitutions, full stop

but also included compiler settings in that equation, then we could still randomize layout, but we would be restricted that two structs with the same definition must still wind up with the same layout (which might be in any order etc).

In any case, I am getting more attracted to the idea of pulling out certain special cases (e.g., single-field structs in #34) and leaving the broader question for later.

the8472 commented 6 years ago

but we would be restricted that two structs with the same definition must still wind up with the same layout (which might be in any order etc).

Which would preclude optimizations that take use of the structs into account.

nikomatsakis commented 6 years ago

@the8472 yes. I am ok with that, personally, but that is indeed one of the questions =)

the8472 commented 6 years ago

Imo no such guarantee should be given because it would make layout-compatibility part of the API. You could write your own struct that matches the fields of a struct in a different crate (maybe by looking under the hood for private ones) and then rely on compatibility. Which is inherently brittle.

If no such guarantee is given and the compiler is allowed to be actively hostile towards such uses (e.g. via randomization) that would require the use of (currently non-existent) annotations and compile-time checking to allow such things which is a lot safer.

If some good optimization potential is found later and precluded due to guarantees then everyone would have to opt in with every struct where they would want that to apply. It's much nicer to have things apply to all safe uses of repr(Rust) by default and unsafe users should opt into any special treatment that they need, not the other way around.

So yeah, I'm leaning on the side of no-guarantees maximalism here. Unless concrete, important unsafe use-cases that cannot be reduced to special cases are spelled out here.

arielb1 commented 5 years ago

Determinism between 2 structs with the same "definition", C-header-style, feels like an undesirable property to me. Both because of PGO and the likes, but also we'll have to include all the interesting properties of the definition somewhere (inhabitedness? that might rely on complicated factors).

Determinism of the "same" struct between 2 runs of the compiler also seems a bit hard to specify - we'll have to decide what it means for 2 runs of the compiler to be the same.

However, as a "quality-of-implementation" requirements:

The compiler should have a "reproducible" mode where "irrelevant" changes to the input don't change its output, as is required for reproducible builds.
The compiler should have a "stable" mode where "small" changes to the input don't make large changes to the output, for debugging.

But I don't think that the unsafe code guidelines should rely on such compiler flags - it just seems like a layering violation.

In any case, we do at least need some sort of layout determinism: maintain the same layout for datatypes that are subtypes of each-other (because otherwise subtyping won't work). We can add a similar sort of determinism between Foo<X> and Foo<Y> if X and Y are only used in "phantom" positions, which might be useful in unsafe code.

the8472 commented 5 years ago

@arielb1 I still think the phantom case would benefit from annotations that require identical layout. That way we could still benefit from different types -> different use sites -> different optimizations may apply.

Otherwise I agree that the reproducible build and debugging concerns are orthogonal to guarantees for unsafe code, especially when we want non-reproducable (randomized) builds for shaking out bugs. E.g. the OpenJDK Hotspot compiler has some debugging flags that apply allowed code reordering more aggressively even when they're not beneficial to make it easier to find races and optimizer bugs.

CAD97 commented 4 years ago

One subset I haven't seen mentioned yet is two instantiations of the same generic struct with layout-compatible types. E.g. if I have Container<T> (which does not use associated const data to T), then setting T to two layout-compatible types should produce two layout-compatible types.

This is a more specific subset of layout-compatibility for any #[repr(Rust)] with layout-compatible fields in the same order, as it only applies to instantiations of the exact same generic definition.

Of course, user code utilizing layout-compatibility for library types in a semver-compatible way requires both the language support and the library to guarantee in documentation the exact guaranteed layout properties.

(Because, to be explicitly clear, my above proposed guarantee does not give C<T> compatible with C<T'> in general, because the container could use associated types and/or consts to give differing layouts.)

RalfJung commented 4 years ago

The question is, what exactly is "layout-compatible"? For example, the niches would have to be equal. This means that Cell<T> and T are not "layout-compatible", and neither are MaybeUninit<T> and T.

CAD97 commented 4 years ago

Ah, I thought we had defined "layout-compatible" already 😅

For the minimal guarantee I suggest, #[repr(transparent)]-equivalent would still be enough to enable some useful patterns that aren't possible yet.

hanna-kruppe commented 4 years ago

repr(transparent) does not actually guarantee equal niches (Cell is transparent).

bjorn3 commented 4 years ago

Cell is a wrapper around UnsafeCell, which uses a special attribute to hide niches.

rodrimati1992 commented 4 years ago

For examples of types which are #[repr(transparent)] and add niches,there's the NonZero types and NonNull in the standard library. Example NonZero type: https://doc.rust-lang.org/std/num/struct.NonZeroU8.html

CAD97 commented 4 years ago

NonZero* and NonNull use #[rustc_layout_scalar_valid_range_start(1)] as well, so they could be distinguished in that manner. UnsafeCell is ~~"only" #[lang = "unsafe_cell"]~~ #[repr(no_niche)]. We could (separately from should) say the guarantee only applies when no niche-impacting attributes are used.

EDIT: I was looking straight at the attribute, how did I miss it.

RalfJung commented 4 years ago

UnsafeCell is also #[repr(no_niche)].

jonhoo commented 3 years ago

I recently came across a case where I specifically wanted to cast a SomeType<T> into a SomeType<ManuallyDrop<T>>, and in trying to figure out whether that was safe I found this discussion. I think that case works correctly in the compiler today, but I suspect that it's still an open discussion whether it should/will be guaranteed by the language to be okay (subject to the discussion here). I figured I'd just leave it as an additional use-case, and an indication that supporting that kind of cast would be nice :)

scottmcm commented 3 years ago

@jonhoo The most common form of that I see is things like expecting Vec<i32> and Vec<u32> to have the same layout, and I agree that there's no rule allowing that today. And in the general case it can't hold -- imagine if SomeType used associated types (potentially even via specialization) to hold a different field type depending on the type parameter.

digama0 commented 3 years ago

General case meaning what exactly? Ideally, we can get some promises about A<T> being layout compatible with A<U> for particular choices of T,U and all A (let's call this T and U being contextually compatible). If T and U don't define any associated types, is it possible for some unknown A to distinguish between them? I'm not sure exactly how specialization plays into this, but I can certainly imagine that C++ would let you do some trick using specialization to make a function is_T<X> which "evaluates" to true (or u8, say) if X is T and false (or ()) if X is anything else. Is that what you mean?

CAD97 commented 3 years ago

If T and U don't define any associated types, is it possible for some unknown A to distinguish between them?

Yes, because the crate defining A could also define a new trait and implement it for T and U, giving them associated types.

We could potentially say that for layout-compatible T and U, A<T> and A<U> are guaranteed layout-compatible if the layout of A does not depend on any generic information other than the layout of its generic parameters, but this is a very complicated property to actually define, and as I understand it, not even that simple to maintain in the face of PGO that's allowed to reorder fields.

thomcc commented 3 years ago

There are meaningful code-size reasons to want Vec<i32> to have the same layout as Vec<u32> too, although this depends on some deduplication occurring.

RalfJung commented 3 years ago

I recently came across a case where I specifically wanted to cast a SomeType into a SomeType<ManuallyDrop>, and in trying to figure out whether that was safe I found this discussion.

ManuallyDrop is repr(transparent), so in terms of compatibility of T and ManuallyDrop<T> it doesn't get much better. (However I am not sure if it being repr(transparent) is a stable guarantee of the standard library; that'd be a t-libs questron.)

As noted above, however, it is hard to say what this means under arbitrary wrapper types.

We could potentially say that for layout-compatible T and U, A and A are guaranteed layout-compatible if the layout of A does not depend on any generic information other than the layout of its generic parameters, but this is a very complicated property to actually define, and as I understand it, not even that simple to maintain in the face of PGO that's allowed to reorder fields.

What about making such a guarantee if A's type parameter has no trait bound? I guess with specialization that could still use associated types? So we'd have to basically syntactically exclude associated types then.

But indeed this would rule out PGO-induced field reordering. There aren't many layout guarantees that are compatible with such strong forms of PGO.

the8472 commented 3 years ago

ManuallyDrop is repr(transparent), so in terms of compatibility of T and ManuallyDrop it doesn't get much better.

repr(transparent) is not helpful though if you don't know its inner structure as that is non-public. I.e. you don't see what type it is transparent to (should be called translucent 😀). See https://github.com/rust-lang/unsafe-code-guidelines/issues/35#issuecomment-601224419

For a Wrapper we would at least need a repr(transparent(T)) annotation or better something like repr(transmutable_from(T)) to make it explicit. ~Plus the annotation is not shown in the documentation.~ Edit: it is, you have to click through twice.

Also, MaybeUninit is repr(transparent) too, and yet it is very much not transmutable within some wrapper types.

I still think making promises about transmutability of repr(Rust) in the general case is a bad idea. We already have a few special carveouts such as single-field or homogenous structs and maybe a few more can be added; but for the general case explicit annotations should be required.

RalfJung commented 3 years ago

Also, MaybeUninit is repr(transparent) too, and yet it is very much not transmutable within some wrapper types.

repr(transparent) on union is... different, indeed. That's why it is not stable yet.

We already have a few special carveouts such as single-field or homogenous structs and maybe a few more can be added; but for the general case explicit annotations should be required.

To be clear, those caveouts are proposals, not guarantees, at this stage.

Diggsey commented 3 years ago

IMO, layout compatibility between Generic<T> and Generic<U> should be a property that the author of Generic chooses, and the language should only facilitate that.

For example, we could have an attribute #[repr(parametric)] that the author could apply to generic parameters to indicate that representation is parametric with respect to that parameter.

elichai commented 3 years ago

IMO, layout compatibility between Generic<T> and Generic<U> should be a property that the author of Generic chooses, and the language should only facilitate that.

For example, we could have an attribute #[repr(parametric)] that the author could apply to generic parameters to indicate that representation is parametric with respect to that parameter.

Agreed, and about the repr, isn't that what safe transmute for? that way the author can make those kinds of guarantees

jonhoo commented 3 years ago

Yeah, I was thinking specifically of the case of casting in/away ManuallyDrop, which is a type that "we" completely control. The more general case is (as has been demonstrated above) more complicated.

RalfJung commented 3 years ago

"We" do not control SomeType though, so it might be doing associated type shenanigans.

jonhoo commented 3 years ago

Mmm, good point. It would be really nice to have some rules around whether this is ever safe though, as it is a very handy cast to be able to do in particularly "weird" unsafe code. In my case, I use it because I know that I'm operating on aliased values, and I essentially want to "turn" any drop of those values into a forget in some code that I do not control (e.g., HashMap::clear). If the cast is not guaranteed to be safe (perhaps subject to some preconditions), then I don't know if there is an efficient alternative for what I'm trying to do.