Open mahkoh opened 3 years ago
@mahkoh can you reproduce this without no_mangle
? no_mangle should really require unsafe, it can cause unsoundness if it overlaps with another linker symbol.
@jyn514 Yes
repr(C) has served two purposes:
The first purpose is advertised both in the reference and in stdlib code e.g. in Layout
. It is probably used in many other places.
The second purpose is also advertised in the reference.
However, these purposes are not compatible as shown above.
The layout algorithm is not the same across all targets, it is always supposed to be whatever the C ABI mandates on that particular target
Does this reproduce with clang?
The layout algorithm is not the same across all targets, it is always supposed to be whatever the C ABI mandates on that particular target
The layout algorithms used by the C compilers are not the same. But repr(C) is advertised with a specific layout algorithm that is the same across all targets. Namely in these places
Does this reproduce with clang?
Clang contains many bugs in their MSVC-compatible layout algorithm. It should not be used as a reference:
The output of clang is incompatible with both MSVC and rustc: https://github.com/llvm/llvm-project/blob/661f9e2a92302b1c7140528423fdbfc133a68b41/clang/lib/AST/RecordLayoutBuilder.cpp#L3076-L3087
Let's make sure the windows notification group is aware of this:
@rustbot ping windows
Hey Windows Group! This bug has been identified as a good "Windows candidate". In case it's useful, here are some instructions for tackling these sorts of bugs. Maybe take a look? Thanks! <3
cc @arlosi @danielframpton @gdr-at-ms @kennykerr @luqmana @lzybkr @nico-abram @retep998 @rylev @sivadeilra
Assigning P-high as discussed as part of the Prioritization Working Group procedure and removing I-prioritize.
So to summarize:
One more thing
#[repr(C, packed(1))]
struct X {
c: u8,
m: std::arch::x86_64::__m128,
}
#[no_mangle]
pub fn f() -> usize {
// Expected 32
// Actual 17
std::mem::size_of::<X>()
}
That is, if we assume repr(packed(1))
to have the same intended effect as #pragma pack(1)
.
Yikes, I think a lot of folks do assume that, yeah.
Basically all types that have a __declspec(align)
annotation (such as __m128) in the MSVC stdlib but no such annotation in Rust are broken because MSVC implements the concept of required alignments which are unaffected by #pragma pack annotations.
This particular problem could be fixed by adding such an annotation to m128 on MSVC targets. The definition of m128 is broken anyway because it has to be 16 byte aligned on MSVC targets but is defined as 4 byte aligned in the stdlib.
So in the end this is not an inherent problem of the layout algorithm because you're not supposed to be able to write this anyway.
C/C++ do not permit zero-sized structs or arrays. Aggregates (structures and classes) that have no members still have a non-zero size. MSVC, Clang, and GCC all have different extensions to control the behavior of zero-sized arrays.
It is unfortunate that #[repr(C)]
means two things: C ABI compatible, and sequential layout. Maybe a new #[repr(stable)]
could be added, which would request sequential layout but would not require interop with standard C ABI.
In the near term, perhaps the best thing is to add a new diagnostic, which is "you asked for #[repr(C)]
, but you have ZSTs in here, and that might be a problem."
C/C++ do not permit zero-sized structs or arrays. Aggregates (structures and classes) that have no members still have a non-zero size.
GCC and Clang do in fact accept even completely empty structs and unions and such types have size 0 when compiled with these compilers. (Except that Clang tries to emulate MSVC when compiling for *-msvc
targets.) The current implementation of repr(C) seems to be correct in all cases accepted by rustc except on msvc targets.
Maybe a new
#[repr(stable)]
could be added, which would request sequential layout but would not require interop with standard C ABI.
That seems to be the most pragmatic solution. Alternatively one could deprecate repr(C) completely and replace it by repr(NativeC) and repr(stable).
In the near term, perhaps the best thing is to add a new diagnostic, which is "you asked for
#[repr(C)]
, but you have ZSTs in here, and that might be a problem."
Maybe only warn on msvc targets.
GCC and Clang do in fact accept even completely empty structs and unions and such types have size 0 when compiled with these compilers.
In C++ they have size 1:
$ clang -std=c++20 -xc++ - -o /dev/null -c <<<'extern "C" { struct EmptyStruct {}; union EmptyUnion {}; }'
<stdin>:1:14: warning: empty struct has size 0 in C, size 1 in C++ [-Wextern-c-compat]
extern "C" { struct EmptyStruct {}; union EmptyUnion {}; }
^
<stdin>:1:37: warning: empty union has size 0 in C, size 1 in C++ [-Wextern-c-compat]
extern "C" { struct EmptyStruct {}; union EmptyUnion {}; }
^
2 warnings generated.
Actually the nomicon says:
ZSTs are still zero-sized, even though this is not a standard behavior in C, and is explicitly contrary to the behavior of an empty type in C++, which says they should still consume a byte of space.
So it seems that this (unsound) behavior is even documented.
In C++ they have size 1:
repr(C) is about C compatibility not about C++ compatibility.
So it seems that this (unsound) behavior is even documented.
Otherwise empty structs in msvc have size at least 4 bytes not 1 byte in C mode.
GCC and Clang do in fact accept even completely empty structs and unions and such types have size 0 when compiled with these compilers.
These are non-standard extensions, and they deviate from the C/C++ specification.
If #[repr(C)]
means "has a representation that is equivalent to that generated by a conformant C compiler", then Rust's current behavior is fine. If #[repr(C)]
means #[repr(C_with_msvc_and_clang_extensions)]
, then that is something different.
I understand the desire for compatibility with the de-facto standard behavior of these compilers, from a practical point of view. At the same time, these are areas where they do deviate from the language standard. Figuring out the best solution for Rust will require some careful consideration, and the short-term solution of "make it work like Clang / MSVC / GCC" should not be chosen without due consideration.
If
#[repr(C)]
means "has a representation that is equivalent to that generated by a conformant C compiler",
That is obviously not what it means. First of all, there is no such thing as the representation of a type when compiled with a conformant C compiler because the C specification does not specify the representation of types. Two conformant C compilers on the same OS can have different type layouts. Therefore, even if we only want to talk about the representation of types that have an equivalent in standard's compliant C, we have to first decide which ABI we are targeting. And this ABI is most of the time dependent on the compiler even on the same OS.
Second of all, Rust has for a long time supported non-standard extensions such as repr(packed).
Figuring out the best solution for Rust will require some careful consideration, and the short-term solution of "make it work like Clang / MSVC / GCC" should not be chosen without due consideration.
I hope you are not implying that I suggested that repr(C) should work like any one of these compilers. I said it should work like the native compiler on the target the Rust program is compiled for. Please explain why this would be controversial. This is also the approach taken by the Clang developers.
... there is no such thing as the the representation of a type when compiled with a conformant C compiler
The gray area I was pointing out was, when you ask a C compiler to compile non-conformant code, there is no possible well-defined behavior that corresponds to the specification.
Please explain why this would be controversial.
Because that line of reasoning boils down to "whatever a specific set of C implementations do". Specification-by-implementation is always a slippery slope. When you go down that path, you often wind up in a place where different implementations have conflicting behavior, and now you have no clear way to specify your own behavior.
If we look at the specific question of "what should #[repr(C)]
mean with a ZST?", then we have several options:
#[repr(C)]
as a way to interop with C++ (and there are many), this may save someone a lot of hassle, because the C++ side will compute sizeof(T) == 1
, not zero.#[repr(C, size = 0)]
. This may seem redundant, but it would inform the compiler that you know that you're asking for behavior that is non-standard, and you're giving it the information on how to interpret the non-standard behavior.#[repr(C)]
means compatibility with a specific C implementation, e.g. #[C_repr_compatibility = "clang")]
or some such. I'm intentionally using an ugly, bespoke syntax because I don't care about the syntax. I only care about the semantics -- that where the developer is using behavior that is not standard C, that they must specify (to the Rust compiler) what they need / expect.I am not criticizing your very reasonable desire to solve this issue. All I'm doing is pointing out that solving this requires more thinking than "do whatever Clang / GCC / MSVC" does. There are many other C implementations (such as those for embedded systems), and I do hope that Rust, as a language, does not specify its own behavior in terms of "do what some other language implementation does".
I think the best solution would be:
#[repr(C)]
, it issues a warning. This is exactly like what Clang, GCC, and MSVC do already when you attempt to compile zero-sized arrays in those languages.#[allow(nonstandard_C_extension)]
or #[repr(C, compatible_with = "clang")]
or some such. Tools such as bindgen
can then add these annotations, for exact compatibility with the compilers that Clang supports.The gray area I was pointing out was, when you ask a C compiler to compile non-conformant code, there is no possible well-defined behavior that corresponds to the specification.
Like I already explained, repr(C) is not about the C specification. It's about the ABI.
Because that line of reasoning boils down to "whatever a specific set of C implementations do". Specification-by-implementation is always a slippery slope.
This has nothing to do with non-standard extensions. The behavior of bit-fields, a standard C construct, differs widely between targets and compilers and is essentially only defined by its implementation.
If we look at the specific question of "what should
#[repr(C)]
mean with a ZST?", then we have several options:
You've omitted the option I've suggested. Namely to conform to the behavior of the native OS compiler. This is the option taken by Clang with overwhelming success, allowing it to be used as the C compiler on all major platforms with a very high degree of interoperability [1] [2] [3]. Explain why this does not work for Rust and why it would not lead to equal success.
Reject it as invalid, because this is invalid C.
We do not care about valid C. We care about making Rust usable in as many places as possible. In particular place that are currently dominated by C and C++. To achieve this goal we have to be able to easily interact with code written in C and C++. Such code uses non-standard extensions.
Your other suggested options all restrict the usability of Rust in these places.
There are many other C implementations (such as those for embedded systems)
Your argument boils down to "on the same target, there can be multiple incompatible C implementations". But this has nothing at all to do with non-standard extensions such a ZSTs. What you are saying is that we cannot have repr(C) at all because there is no authorative definition of the layout of C types on any platform. This goes contrary to what Herb Sutter said in N4028 where he basically follows my argument by saying
Because the ABI is per “platform,” it is the OS platform owner, not every C++ implementation, who is responsible for defining what the C++ ABI means on its platform. For example, for the platform “Windows x86 32-bit,” Microsoft’s Windows team (possibly delegating to their native compiler team) would be ultimately responsible for specifying the Windows C++ ABI.
[1] https://docs.microsoft.com/en-us/cpp/build/clang-support-msbuild?view=msvc-160 [2] https://www.phoronix.com/scan.php?page=news_item&px=OpenMandrova-Clang-Euro-LLVM [3] https://www.kernel.org/doc/html/latest/kbuild/llvm.html
If we follow your argument and solutions, then it will not be possible to use Rust in the Linux kernel which makes significant use of GCC extensions. As does the userspace API, though to a lesser degree. The MSVC standard library also makes use of extensions:
~/opt/msvc$ rg 'declspec\(align' | wc -l
64
~/opt/msvc$ rg 'pragma pack' | wc -l
966
So no more winapi for Rust.
My "argument" is simply that we should help the developer get the behavior they were expecting. Since ZSTs are a non-standard part of C, it is reasonable to ask the developer for more information -- basically, confirming what behavior they want.
You are attributing strawman arguments (such as "let's break Rust in Linux kernel") to me that I have not proposed, which is not helping. I am not proposing anything that prevents using Rust in some particular scenario, such as the Linux kernel. I am trying to work through the problem from first principles, and I don't think we are actually in any sort of fundamental disagreement. I would ask you for a little more patience while we work through these issues; I am not attacking you or trying to propose something absurd. The only major difference seems to be that I am focusing more on the C specification than on the de-facto standards implied by the implementations.
From my experience, it's good to be cautious about designing language features and semantics. Often, you don't get a chance to correct a mistake, so it's worth it to move carefully on things like this and avoid mistakes. If that takes some conversation to converge on a solution, that seems fine to me.
You responded to almost everything that I posted except for my suggestions at the bottom. What do you think of my suggestions, of 1) generating a warning when #[repr(C)]
is applied to ZSTs, and 2) providing Rust attribute that specify the behavior that is wanted?
One situation I am worried about, is that C and C++ differ in what sizeof
means for a struct that has no fields. For the exact same code, GCC and G++ give different answers. (Clang gives the same answers.) MSVC (in C mode) rejects with a hard error structs that have no fields, while MSVC (in C++ mode) generates a struct of size 1. Since the same compilers give different answers even when targeting the same platforms, I think there is a good reason to flag this same struct definition with a warning in Rust, and require the user to specify what behavior they want.
My "argument" is simply that we should help the developer get the behavior they were expecting. Since ZSTs are a non-standard part of C, it is reasonable to ask the developer for more information -- basically, confirming what behavior they want.
Why only for non-standard C? The behavior of GCC and MSVC is significantly different even for standard C.
You are attributing strawman arguments (such as "let's break Rust in Linux kernel") to me that I have not proposed, which is not helping.
It is quite distressing that I open this issue to increase the compatibility of Rust with MSVC and you're floating the idea that we could instead solve the problem by decreasing the compatibility of Rust with all other platforms by disallowing certain constructs that are currently broken only on MSVC targets.
I am not proposing anything that prevents using Rust in some particular scenario, such as the Linux kernel.
The options and solutions you've proposed do that.
Reject it as invalid, because this is invalid C.
This obviously would prevent using Rust in the Linux kernel. Not necessarily because of empty structs but because of other constructs that fall in the same category of non-standard C.
The warning should indicate that Rust will still treat this as a ZST.
This makes Rust incompatible with C code compiled with MSVC.
Define a Rust attribute that specifies the size, e.g.
#[repr(C, size = 0)]
.
Ditto.
Define a Rust attribute that specifies, within a particular scope (such as a module), that
#[repr(C)]
means compatibility with a specific C implementation, e.g.#[C_repr_compatibility = "clang")]
Ditto. Or if we instead write #[C_repr_compatibility = "msvc")]
, then it will be unsound on non-msvc targets.
Meanwhile you've ignored the option that would create the highest degree of compatibility: Do what the C compiler that compiled the C code that we are trying to interact with does.
You responded to almost everything that I posted except for my suggestions at the bottom. What do you think of my suggestions, of 1) generating a warning when
#[repr(C)]
is applied to ZSTs, and 2) providing Rust attribute that specify the behavior that is wanted?
1) I do not believe a warning is necessary. Simply follow the behavior of the native OS compiler.
2) That would make the Rust code platform dependent. If I have C code that compiles with both MSVC and GCC, then Rust code that also compiles on both platforms must be able to soundly interface with the C code.
One situation I am worried about, is that C and C++ differ in what
sizeof
means for a struct that has no fields.
There would be no problem with adding repr(C_plus_plus) if only to handle this one case.
It is quite distressing that I open this issue to increase the compatibility of Rust with MSVC and you're floating the idea that we could instead solve the problem by decreasing the compatibility of Rust with all other platforms by disallowing certain constructs that are currently broken only on MSVC targets.
It is common (when investigating issues) to enumerate options precisely so that we can identify which ones are bad ones, and which ones are good ones. Listing options is not the same thing as endorsing them.
This [using
#[repr(C, size = 0)]
makes Rust incompatible with C code compiled with MSVC.
It doesn't even matter with MSVC, because MSVC (in C mode) rejects the program entirely, because it is not conformant C code. With this code:
#include <stdio.h>
struct EmptyStruct {
};
int EmptyArray[0];
int main() {
printf("size of empty struct = %d\n", (int)sizeof(struct EmptyStruct));
printf("size of empty array = %d\n", (int)sizeof(EmptyArray));
}
I get this:
test.c(4): error C2016: C requires that a struct or union have at least one member
test.c(6): error C2466: cannot allocate an array of constant size 0
You say we should "do what the compiler does for that platform". In this specific case, there is no single behavior from the compilers. MSVC rejects the program entirely. Should Rust also reject the program entirely? Clang and GCC do something different from MSVC, so how do we follow your own suggestion and do what the compilers do?
If you mean that Rust should do what a specific compiler does when targeting a specific platform, then, which one? And how do you select that behavior?
It doesn't even matter with MSVC, because MSVC (in C mode) rejects the program entirely, because it is not conformant C code. With this code:
For empty structs it does but recall the example from the OP: https://godbolt.org/z/jzrasb
For constructs that are rejected by the C compiler there is no problem because we will not be interfacing with such C code. In this case we can emit a warning and leave the layout unspecified.
If you mean that Rust should do what a specific compiler does when targeting a specific platform, then, which one? And how do you select that behavior?
I've compiled such a list here: https://github.com/mahkoh/rfcs/blob/bitfield/text/3064-bitfield.md#appendix-a
On non-windows platforms there is not much of a problem because the differences between Clang and GCC are mostly bugs in Clang. The only interesting case are the -windows-gnu targets. Here we might want to add an annotation that opts into MSVC layout so that people can interact with MSVC libraries.
Thank you @sivadeilra and @mahkoh so much for the discussion and for providing analysis and options for a clearer evaluation. Just chiming in to remind that unsound issues are definitively regarded with high priority and this one will probably need a bit of thought on how to be tackled.
labeling this for T-compiler
.
While I agree that soundness issues are critical, let's also keep in mind that this has been the behavior since before 1.0. I don't think there is a need for any short-term solutions.
@mahkoh can you explain why you think this is unsound? Rustc certainly differs from MSVC in the layout, but as far as I can tell that can only go wrong if you use unsafe code and assume the layout is the same for both. I don't know if rust guarentees that for ZSTs.
and assume the layout is the same for both. I don't know if rust guarentees that for ZSTs.
Can you point me to where the guarantee is that repr(C) structs have the same layout as structs in C?
Minor Note: This is actually a T-lang issue to resolve before it's a T-compiler issue.
T-lang really needs to specify how repr(C) works for all compilers of rust, and then the compiler team can update rustc. But this needs to be an official language decision.
and assume the layout is the same for both. I don't know if rust guarentees that for ZSTs.
Can you point me to where the guarantee is that repr(C) structs have the same layout as structs in C?
Well right, that's sort of my point - if rust doesn't guarantee that, how can having a different layout be unsound?
If rust makes no guarantee about the compatibility of repr(C) and C layouts, and you have not linked any such guarantees, then the C in repr(C) is rather misleading and should be renamed to repr(simple) or something like that. If this is what you are arguing then there is no unsoundness here.
If rust makes no guarantee about the compatibility of repr(C) and C layouts
I think the statement that there's no guarantee applies only to zero-sized types. And in part helps to allow things like:
#[repr(C)]
struct Foo<'a> {
some: Stuff,
etc: *const Blah,
marker: PhantomData<&'a ()>,
}
It's pretty clearly a bug otherwise, though.
A fixed layout can be selected with the #[repr] attribute … [
#[repr(C)]
example] … This will force a struct to be laid out like the equivalent definition in C. — RFC79The
C
representation is designed for dual purposes. One purpose is for creating types that are interoperable with the C Language. The second purpose is to create types that you can soundly perform operations on that rely on data layout such as reinterpreting values as a different type. — Type Layout - The Rust Reference
Both of these only refer to the C language, by my reading of the C spec most of the layout details for a struct
are then implementation defined, a compliant compiler could sort members alphabetically before layout use a members name to determine padding around it; so it really does seem like Rust needs to either:
repr(C)
is only compatible with C compilers using the same algorithm as specified in the reference (which seems to exclude MSVC when zero-length-array extensions are in use)A fixed layout can be selected with the #[repr] attribute … [
#[repr(C)]
example] … This will force a struct to be laid out like the equivalent definition in C. — RFC79
This RFC does not mention the algorithm in the reference. Where does this algorithm come from? Who decided to add this algorithm to the reference? Where is the corresponding RFC?
Apparently this was added in https://github.com/rust-lang/reference/pull/156 as part of https://github.com/rust-lang/rfcs/pull/2195. But this RFC does not at all talk about the layout of repr(C) structs or unions and the reference PR goes much further than what was discussed in the RFC. So if only the correct procedure had been followed, we would not find ourselves in this situation.
Mahkoh all the correct procedure has been followed.
The reference doesn't prescribe the operation of rustc it just reflects it. Even if the reference said nothing at all there would be some algorithm being performed within the compiler. The reference is just describing that.
Then, in your opinion, it would do no harm to add a comment to the Rust reference that
repr(Rust)
structs is the same as the layout of repr(C)
structs.&dyn Trait
is (data_ptr, v_table_ptr)
&[T]
is (data_ptr, size)
After all, that is how rustc behaves and the reference has no normative function.
Well point 1 would be a very wrong thing to say
2 and 3 is technically a possible thing to say without it being inaccurate, but I think that the Reference maintainers would reject such a PR because there's not really much that a person can do with that knowledge. It would still be UB to, for example, directly transmute a &[i32]
to &[f32]
.
The reference is not now, and likely it will not ever be, normative. It's a documentation of the language that inevitably lags behind the language, sometimes quite a lot behind.
2 and 3 is technically a possible thing to say without it being inaccurate, but I think that the Reference maintainers would reject such a PR because there's not really much that a person can do with that knowledge.
Of course it would be useful, for example, you could pass &[i32]
to a C function accepting an equivalent C type.
@DianaNites I think the point was technically &[i32]
and &[f32]
are allowed to have different layouts (i.e swap the positions of the length and the pointer) such that transmuting between them would be unsound (without going through into_raw_parts and from_raw_parts)
@DianaNites
Invoking undefined behavior via compiler intrinsics.
transmute is an intrinsic, and code can't rely on the layout of a repr rust struct being stable. in practice some layouts are stable, but you can't rely on that. eg: all code from before the field reordering update trying to guess the layout of a rust struct would just be wrong with a modern compiler.
also note, very important:
Warning: The following list is not exhaustive.
any number of things can be UB without being on that list.
@mahkoh the layout of a slice is not actually stable. I think you already know this. the most the reference could say is that it's a specific way for a given version of the compiler. Not that it will always be that way. There's a reason that slice layout is in the (far less worked on) unsafe code guidelines.
@mahkoh the layout of a slice is not actually stable. the most the reference could say is that it's a specific way for a given version of the compiler.
Since the reference does not say that repr(C) as described in the reference is only valid for the current version of the compiler, then, following your argument, this must mean that repr(C) as described in the reference is stable. Is that what you are saying?
I've been preparing an RFC to address this. It's a work in progress but feel free to comment. https://gist.github.com/mahkoh/eb95f518a8a4b25237ffd2edd2730c6b
Yes, repr C is stable. Yes, changing how it works would be a breaking change. Neither of those points are up for debate, that's just the facts. However, breaking changes are allowed without a semver bump when it's related to soundness issues.
What is left to discuss is how we can fix things with as small of a breaking change as possible.
Yes, repr C is stable.
Where was it decided that the algorithm in the reference is the stable algorithm instead of having the algorithm depend on the platform?
Again, I would really like to stress this point: the reference isn't the source of truth. The reference is just describing what the compiler does (as best it can). When the reference and the compiler disagree, unless T-compiler says that it's a bug, then the reference is wrong.
Again, I would really like to stress this point: the reference isn't the source of truth. The reference is just describing what the compiler does (as best it can).
You're contradicting yourself. Previously you said that the layout algorithm of &[T]
cannot be added to the reference because it is not stable. Yet here you are implying that it can be added to the reference because the reference merely describes what the compiler does.
Also see this summary below.
Consider
and the corresponding MSVC output: https://godbolt.org/z/csv4qc
The behavior of MSVC is described here as far as it is known to me: https://github.com/mahkoh/repr-c/blob/a04e931b67eed500aea672587492bd7335ea549d/repc/impl/src/builder/msvc.rs#L215-L236