rust-lang / unsafe-code-guidelines

Forum for discussion about what unsafe code can and can't do
https://rust-lang.github.io/unsafe-code-guidelines
Apache License 2.0

Representation of bool, integers and floating points #9

Closed nikomatsakis closed 5 years ago

nikomatsakis commented 6 years ago

This issue is to discuss the memory layout for integral and floating point types:

For the most part, these are relatively uncontroversial. However, there are some interesting things worth discussing:

joshtriplett commented 6 years ago

Don't forget u128 and i128.

Another topic: Does FFI code need to use types like libc::uint64_t, or is it safe to just use u64?

nikomatsakis commented 6 years ago

@joshtriplett

Don't forget u128 and i128.

Edited, thanks!

Another topic: Does FFI code need to use types like libc::uint64_t, or is it safe to just use u64?

Good question! I presume it does not, but I'd be curious if there is another side to the discussion.

Related note: clearly usize and C's unsigned are not equivalent, but it's worth stating this explicitly.

Gankra commented 6 years ago

Another topic: Does FFI code need to use types like c_uint64, or is it safe to just use u64?

Much of the rust ecosystem (e.g. webrender and therefore firefox) assumes uint64_t and u64 are the same ABI-wise, and I'm unaware of any reason to prevent that assumption.

How is usize intended to be defined on various platforms?

I believe pointer size is the correct definition but I haven't read the link, so grain of salt

Rust currently states that the maximum size of any single value must fit in an isize

There's no good reason, it's just because llvm has a quirky definition of in-bounds pointer calculations

Do we want to discuss signaling NaN at all? Specifically: why is it potentially of concern, and are there things that unsafe authors or other folks need to be aware of?

Signaling NaNs only merit discussion insofar as the IEEE spec defines some random operations to act differently on them. (e.g. max(sNaN, x) != max(qNaN, x), although iirc this example is regarded as a mistake and is intended to be changed)

Signaling itself is, I think, largely a failed experiment and worth ignoring (soft cc on @stephentyrone in case i'm misremembering)

hanna-kruppe commented 6 years ago

Wrt signaling NaNs, it's more of a question for the (now deferred, cc #8) discussion of valid values. There's a persistent rumor (including, at times, among LLVM contributors) that handling an sNaN or doing certain operations on it will cause a trap or is undefined behavior in LLVM. This is not the case, but I've encountered enough people thinking it's true that I think it would be best to explicitly state that signaling NaNs are perfectly fine, and thus that all bit patterns are valid floats.

gnzlbg commented 6 years ago

+1 From the point-of-view of just layout, SNaNs are not really that interesting and the easiest thing is to just allow them. AFAIK f32::from_bits(u32) is safe and stable and works for all bit patterns, so we can't really do much about this anyways without potentially breaking some code.
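As a minimal sketch of the point about f32::from_bits: the snippet below builds a float from a bit pattern that, under the common IEEE 754 convention (quiet bit is the mantissa's most significant bit), encodes a signaling NaN. The constant SNAN_BITS and the helper is_signaling_nan_bits are hypothetical names introduced for illustration only, and the quiet-bit convention is an assumption that does not hold on every target.

```rust
// Assumed convention: exponent all ones, mantissa non-zero, and the quiet bit
// (bit 22, the most significant mantissa bit) clear means "signaling NaN".
const SNAN_BITS: u32 = 0x7FA0_0000;

/// Returns true if `bits` looks like a signaling NaN under the assumed convention.
fn is_signaling_nan_bits(bits: u32) -> bool {
    let exponent_all_ones = bits & 0x7F80_0000 == 0x7F80_0000;
    let mantissa = bits & 0x007F_FFFF;
    let quiet_bit = bits & 0x0040_0000;
    exponent_all_ones && mantissa != 0 && quiet_bit == 0
}

fn main() {
    // `from_bits` is a safe function: no bit pattern is rejected.
    let x = f32::from_bits(SNAN_BITS);
    assert!(x.is_nan());
    assert!(is_signaling_nan_bits(SNAN_BITS));
    println!("constructed a NaN from {:#010x} in safe code", SNAN_BITS);
}
```

The sketch deliberately does not assert that to_bits returns the exact same payload, since some targets may quiet the NaN when the value moves through floating-point registers.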

nikomatsakis commented 6 years ago

@Gankro

I believe pointer size is the correct definition but I haven't read the link, so grain of salt

The link is to a comment from @gnzlbg and states:

C++ says that usize is an unsigned integer type that can store the maximum size (as returned by mem::size_of<T>/size_of_val/etc.) of a theoretically possible object of any type (including arrays). A type whose size cannot be represented by usize is ill-formed. On many platforms (an exception is systems with segmented addressing) usize can safely store the value of any non-member pointer. In those platforms, usize is a type capable of holding a pointer.

nikomatsakis commented 6 years ago

@Gankro

There's no good reason, it's just because llvm has a quirky definition of in-bounds pointer calculations

Do you think we should write down that this is something that is presently true but which may be changed in the future (so unsafe code should not rely on it being true)? It seems like it might affect quite a bit how one writes code.

nikomatsakis commented 6 years ago

(In particular, it seems to imply that it is safe to use isize for "pointer offset" within any one value, which is otherwise not necessarily true, right?)

Gankra commented 6 years ago

I don't think we can ever change it since it's baked into ptr::offset. If we did it would be in a way where negative offsets were a valid very-large-positive offset, so isize would still "work" but be weird.
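To illustrate why the isize limit matters for pointer arithmetic, here is a small sketch: the distance between any two pointers into the same value is reported as an isize, so it has to be representable as one. The helper names span_in_elements and span_in_bytes are hypothetical and exist only for this example.

```rust
/// Distance from the start of a slice to one past its end, in elements.
fn span_in_elements(slice: &[u32]) -> isize {
    let range = slice.as_ptr_range();
    // SAFETY: both pointers are derived from the same slice, so they point into
    // (or one past the end of) the same allocation, as `offset_from` requires.
    unsafe { range.end.offset_from(range.start) }
}

/// Size of the slice's contents in bytes.
fn span_in_bytes(slice: &[u32]) -> usize {
    std::mem::size_of_val(slice)
}

fn main() {
    let data = [10u32, 20, 30, 40];
    assert_eq!(span_in_elements(&data), 4);
    // The byte size of a single value fits in isize, which is what lets
    // in-bounds offsets be expressed with a signed type.
    assert!(span_in_bytes(&data) <= isize::MAX as usize);
    println!("elements: {}, bytes: {}", span_in_elements(&data), span_in_bytes(&data));
}
```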

Gankra commented 6 years ago

also fwiw I think gcc also gets sad with huge offsets

gnzlbg commented 6 years ago

Another topic: Does FFI code need to use types like libc::uint64_t, or is it safe to just use u64?

I think that the bare minimal guarantee here is that the Rust extern "C" function declarations need to use types that match in size and alignment with the types of the C function declaration.

That is, if C uses uint64_t, then you can use u64, libc::uint64_t, or even i64. In particular, libc types are not special. When C uses unsigned, then you need to use a 32 or 64 bit type (or something else) depending on the platform you are targeting. The libc::c_uint type does this correctly for you, but you don't have to use that.

That would be the bare minimum, and I think that would already be ok since we are just passing bags of bytes here and it is all unsafe anyways. @mw might know whether this can result in any issues due to, e.g., cross-language inlining.

If we wanted to extend this minimum, we could map the C types to the Rust types, e.g. saying that if a C's function declaration uses uint64_t then Rust extern "C" declaration must use a 64-bit wide unsigned integer type. Or if it uses unsigned that the Rust extern "C" declaration must use an unsigned integer type of the same width. But this opens many questions, e.g., is struct A(u64) a 64-bit unsigned integer type that I can use where C uses uint64_t ? What if I apply repr(transparent) to it? I'd rather avoid all this.

I think if we can get by with only the size and alignment requirements, we should. If someone then uses a &T as the return type of an extern "C" function, and the function happens to create an invalid value, then that's UB but that would be covered by a different part of the unsafe code guidelines.
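A rough sketch of what this looks like in practice, under the assumption (common in the ecosystem, but the subject of this thread) that u64 and uint64_t are interchangeable at the boundary. The function add_u64 and its C prototype are hypothetical; std::os::raw::c_uint is the standard library's alias for C's unsigned int.

```rust
use std::mem::size_of;
use std::os::raw::c_uint;

/// Hypothetical C-side prototype, for illustration only:
///     uint64_t add_u64(uint64_t a, uint64_t b);
pub extern "C" fn add_u64(a: u64, b: u64) -> u64 {
    a.wrapping_add(b)
}

fn main() {
    // Calling through an `extern "C"` function pointer uses the C calling convention.
    let f: extern "C" fn(u64, u64) -> u64 = add_u64;
    assert_eq!(f(2, 3), 5);

    // The fixed-width type has a fixed size; `c_uint` follows the platform's C ABI.
    println!("u64: {} bytes, c_uint: {} bytes", size_of::<u64>(), size_of::<c_uint>());
}
```

For C's unsigned, the width depends on the target, which is why the alias is the convenient choice there even though it is not required.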

Gankra commented 6 years ago

size and alignment aren't sufficient for ABI. The entire reason we have repr(transparent) is because the calling convention for void foo(u64) and void foo(NewTypedU64) are sometimes different (i.e. x86 CC's may pass the former as a register and the latter on the stack).

Similarly the CC for passing struct Foo(u32, u32) by-value isn't always the same as struct Foo(u32, u16, u16) (iirc some x64 CCs spec that homogeneous composites get passed in SIMD registers)

size and alignment are only sufficient if you're passing by-reference (and copying the value out manually in the callee).

I believe you need to know:

For all of these u64 and uint64_t match perfectly

Gankra commented 6 years ago

I vaguely recall intending to tell the reference folks that they should explicitly distinguish layout (size+align+field offsets) and abi (layout + primitive-ness + homogeneousness).

Compatible layouts are sufficient to make type punning tricks work with transmute/pointers, but compatible ABIs are necessary for correctly passing by-value across the FFI boundary.
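The layout versus ABI distinction can be sketched as follows. Both wrapper types below have the same size and alignment as u64, but only the repr(transparent) one is additionally meant to be passed exactly like a bare u64. The type and function names (Meters, Wrapped, double_meters) are hypothetical.

```rust
use std::mem::{align_of, size_of};

/// Same layout AND same by-value ABI as `u64`.
#[repr(transparent)]
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Meters(pub u64);

/// Same layout as `u64`, but it is a struct as far as the C ABI is concerned,
/// so a calling convention is free to pass it differently from a bare integer.
#[repr(C)]
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Wrapped {
    pub inner: u64,
}

/// Hypothetical C-side prototype: uint64_t double_meters(uint64_t m);
pub extern "C" fn double_meters(m: Meters) -> Meters {
    Meters(m.0.wrapping_mul(2))
}

fn main() {
    // Layout checks: these hold for both wrappers.
    assert_eq!(size_of::<Meters>(), size_of::<u64>());
    assert_eq!(align_of::<Meters>(), align_of::<u64>());
    assert_eq!(size_of::<Wrapped>(), size_of::<u64>());
    assert_eq!(align_of::<Wrapped>(), align_of::<u64>());

    assert_eq!(double_meters(Meters(21)), Meters(42));
}
```

The layout assertions cannot tell the two wrappers apart; the difference only shows up in how a by-value argument is passed, which is the part that layout alone does not describe.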

asajeffrey commented 6 years ago

One thing we might want to think about is whether the Rust semantics of base types needs any shadow state, e.g. provenance information. IIUC C does :/ since pointers in C have provenance, which is expected to be maintained by casts to/from usize.

Gankra commented 6 years ago

Could you say more on why we would need that? segmented architectures?

(nb I believe miri maintains provenance to avoid smuggling illegal pointer ops at compile time)

asajeffrey commented 6 years ago

@Gankro C interop mainly. IIUC in C, casting a *T to a usize and then back to a *T is a no-op, even though the *T is carrying provenance, which is why usize also carries provenance. Not sure whether we want this in Rust though, it would be nice if (e.g.) the semantics of u64 was just 64 bits, without having to track shadow state.

gnzlbg commented 6 years ago

@asajeffrey I think @RalfJung's post (https://www.ralfj.de/blog/2018/07/24/pointers-and-bytes.html) might say that yes, we need to track provenance when casting to integers and back, and that just because two pointers have the same numeric value when interpreted as a usize does not mean that they are interchangeable. Whether this implies that two usizes that have the same numeric value are interchangeable when casting them to a pointer... I don't know. I would expect that for these usizes, where they come from is important as well.

hanna-kruppe commented 6 years ago

I don't see what this has to do with the memory layout of primitives. Whatever model we choose has to allow implementing Rust with pointers being mere addresses, just as C can be implemented that way. Further state might be needed to determine whether an execution is UB or not, but that's

  1. a consequence of the pointer aliasing rules and the like, with no relation to runtime memory layout
  2. only relevant for formal models of the language and sanitizers like miri

asajeffrey commented 6 years ago

@rkruppe Fair enough, if we're tabling what the semantics of primitives is for the moment, as long as people are aware that there might be more to the semantics of primitives than just their memory layout.

alercah commented 6 years ago

The obvious question when talking about interacting with native C ABIs is what about platforms where CHAR_BIT > 8? I'm pretty sure the correct answer is "we don't support them, and we are not designing the language around the possibility", but that's important to decide still.

Prior discussions (rust-lang/rust#46156, rust-lang/rust#46176) documented bool as a single byte that is either 0 or 1.

My reading of the C standard does not agree that this is the correct ABI. I went into it in detail on Zulip, but I believe that it is possible for a _Bool to only use, say, the second bit of a byte in determining whether it is 0 or 1, which would correspond to u8 0 and 2. C++ is slightly more vague on this point, but indicates that it tries to defer to C.

(Note that this discussion also relates to rust-lang/rfcs#992.)
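For reference, the Rust-side representation being discussed (one byte, with false as 0 and true as 1) can be checked directly. This is only a sketch of the Rust side; it says nothing about what a given C ABI does with _Bool. The helper names bool_to_byte and byte_to_bool are hypothetical.

```rust
use std::mem::size_of;

/// Converts a bool to its byte value using a plain cast.
fn bool_to_byte(b: bool) -> u8 {
    b as u8
}

/// Converts a byte to a bool, rejecting anything other than 0 or 1,
/// since other byte values are not valid bools in Rust.
fn byte_to_bool(byte: u8) -> Option<bool> {
    match byte {
        0 => Some(false),
        1 => Some(true),
        _ => None,
    }
}

fn main() {
    assert_eq!(size_of::<bool>(), 1);
    assert_eq!(bool_to_byte(false), 0);
    assert_eq!(bool_to_byte(true), 1);
    // A byte value of 2 (the hypothetical "second bit" encoding) is not a Rust bool.
    assert_eq!(byte_to_bool(2), None);
}
```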

Gankra commented 6 years ago

Yes, I believe that we don't care about:

and almost certainly don't care about:

I expect we don't care about a platform with weird bools, but I didn't follow that RFC so idk

joshtriplett commented 6 years ago

@Gankro

non-octet-byte platforms

Agreed.

segmented architectures

I don't think there's any fundamental reason not to support architectures that, for instance, distinguish between code and data memory.

non-zero-null platforms

We do need to support platforms that have real memory at 0, though writing to that memory might require some care. But yeah, we don't need to support platforms where NULL isn't a zero pointer.

non-two's-complement architectures

Agreed.

non-IEEE-float platforms (although these are pseudo-supported by just disabling floats)

f32 and f64 should certainly refer to IEEE floats. We might in the future need to support other floating-point formats, such as bfloat16, though those should have different types.

128-bit platforms (vaporware or very niche afaict)

We shouldn't make any design decisions that would absolutely rule them out in the (distant) future, though.
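The two's complement and IEEE assumptions agreed on above are directly observable from safe Rust. A short sketch, with hypothetical helper names i8_as_byte and f32_bits:

```rust
/// Reinterprets a signed byte as an unsigned one (a cast keeps the bit pattern).
fn i8_as_byte(x: i8) -> u8 {
    x as u8
}

/// Returns the raw IEEE 754 binary32 encoding of a float.
fn f32_bits(x: f32) -> u32 {
    x.to_bits()
}

fn main() {
    // Two's complement: -1 is all ones, the minimum value has only the sign bit set.
    assert_eq!(i8_as_byte(-1), 0xFF);
    assert_eq!(i8_as_byte(i8::MIN), 0x80);

    // IEEE 754 binary32: sign bit, 8 exponent bits (bias 127), 23 mantissa bits.
    assert_eq!(f32_bits(1.0), 0x3F80_0000);
    assert_eq!(f32_bits(-2.0), 0xC000_0000);

    // IEEE 754 binary64 for f64.
    assert_eq!(1.0f64.to_bits(), 0x3FF0_0000_0000_0000);
}
```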

gnzlbg commented 6 years ago

Yes, I believe that we don't care about:

Many comments on a recent Hackaday article were complaining about how Rust is not a language that they can consider for their applications because it can't target X.

If we make it impossible to support these, we are making room for languages lower-level than Rust, but higher-level than assembly (e.g. C and C++ which support most of these).

I'm not saying we have to support all of these, but I'd be more comfortable knowing exactly which hardware Rust will never be able to target because of these decisions.

hanna-kruppe commented 6 years ago

Note that C++20 will likely specify two's complement as the representation of signed integers and rule out other representations like sign-magnitude or one's complement (http://wg21.link/p0907). Apparently the C standard committee is inclined to do the same (https://twitter.com/jfbastien/status/989242576598327296).

A more general point regarding extremely niche implementation choices such as non-octet-bytes or NULL-at-nonzero-address: people are going to write code that relies on assumptions that are true on every platform they have ever heard of, and for good reason, as it simplifies their code at effectively no loss of portability. We can't prevent that, nor should we IMO; at most we could tell these people they are relying on implementation-defined behavior, which just makes it a de facto standard rather than a de jure one. The only benefit for those who port Rust to such oddball architectures is the reassurance that their port is technically conforming to "the Rust(tm) language" rather than technically being an extremely close dialect of it, but it won't change the fact that they can't run a ton of real Rust code without auditing it and removing these hard-coded assumptions. So I do not worry very much about accommodating architectural choices that deviate from the overwhelming consensus of today's platforms.

This is of course assuming there is such an overwhelming consensus, thus I agree with the need for a survey that @gnzlbg raised.

asajeffrey commented 6 years ago

@Gankro For segmented architectures, WASM may end up with a memory architecture that distinguishes between shared- and non-shared memory. Many systems already do this for processes, WASM may end up doing this for threads too. Not sure how this will play with APIs like Rust mutexes.

gnzlbg commented 6 years ago

The only benefit for those who port Rust to such oddball architectures is the reassurance that their port is technically conforming to "the Rust(tm) language" rather than technically being an extremely close dialect of it, but it won't change the fact that they can't run a ton of real Rust code without auditing it and removing these hard-coded assumptions

@Gankro mentioned "segmented architectures". There are many 16-bit Intel CPUs like the 8086 that need segmented memory, that people like to hack on, and that LLVM can target (x86 in 16-bit mode).

Whether a Rust dialect for targeting the 8086 might be easy to create and closely resemble Rust, or not end up looking like Rust at all, will depend on which choices we make here.

This is of course assuming there is such an overwhelming consensus, thus I agree with the need for a survey that @gnzlbg raised.

I think it might also be worth it to survey how hard it would be to support some of the things @Gankro mentioned, both implementation-wise and from the language complexity perspective, and compare that to the hardware that they would enable targeting. For most of them I'd guess it's probably: "very hard to implement", "significantly complicates the language", "allows us to target almost no new hardware". But for some of them like "segmented architectures" it might be "not that hard to support", "does not significantly complicate the language", and "enables a lot of hardware".

In particular, the decisions here don't have to be black and white (have feature => support hardware vs no feature => no hardware support). It might be interesting to consider an extra constraint where we don't have the feature in Rust, but this is done in such a way that creating a Rust dialect (e.g. via a nightly feature) that still resembles Rust, and can target more esoteric hardware, remains possible.

hanna-kruppe commented 6 years ago

I think it might also be worth it to survey how hard it would be to support some of the things @Gankro mentioned, both implementation-wise and from the language complexity perspective

This is not the best/sole cost metric. In many cases the language complexity boils down to "we do not give a guarantee we would otherwise like to give and add some weasel words here and there to fix the holes that this leaves behind". And for primitive operations like integer/float arithmetic or memory accesses, the implementation complexity is limited (since most of the difference is in the hardware, not in the toolchain). The bulk of the complexity cost is carried by (third-party and rustup-distributed) libraries that want to be 100% conforming/portable. This is also called out in the C++ paper about two's complement linked earlier.

Gankra commented 6 years ago

I'll be blunt, I just rattled off the weird legacy platform properties that make C weird that I remember off the top of my head, but I can't exactly remember why segmented architectures make C weird? Making it undefined to pointer offset between two different allocations?

gnzlbg commented 6 years ago

Making it undefined to pointer offset between two different allocations?

I think provenance makes this UB in C and C++, and probably Rust.

why segmented architectures make C weird?

This answer (https://retrocomputing.stackexchange.com/a/2374) describes how it works in a nutshell. I don't think one can write ISO C without any compiler extensions to target these, at least for some of the cases.

Gankra commented 6 years ago

I believe the current definition of wrapping_offset permits jumping between allocations; or at least fails to forbid it.

RalfJung commented 6 years ago

@Gankro that would be https://github.com/rust-lang/rust/issues/45719, which was resolved (FWIW) by https://github.com/rust-lang/rust/pull/52668 by updating the docs.
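A small sketch of what the wrapping pointer methods allow, as I understand the current documentation: computing an out-of-bounds address is fine, but the result stays tied to the allocation the original pointer came from, so it may only be dereferenced once it is back in bounds of that same allocation. The function name second_element_via_detour is hypothetical.

```rust
/// Reads `data[1]` by first wandering far out of bounds and then coming back.
fn second_element_via_detour(data: &[u32; 4]) -> u32 {
    let base = data.as_ptr();
    // Out of bounds: computing this pointer is allowed, dereferencing it is not.
    let far = base.wrapping_add(1000);
    // Back in bounds of the original allocation: this is `base + 1`.
    let back = far.wrapping_sub(999);
    // SAFETY: `back` points at `data[1]`, inside the allocation `base` was derived from.
    unsafe { *back }
}

fn main() {
    let data = [7u32, 8, 9, 10];
    assert_eq!(second_element_via_detour(&data), 8);
}
```

Using the detour pointer to reach a different allocation would not be covered by this reasoning, even if the addresses happened to line up.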

RalfJung commented 6 years ago

I think @RalfJung post (ralfj.de/blog/2018/07/24/pointers-and-bytes.html) might say that yes, we need to track provenance when casting to integers and back, and that just because two pointers have the same numeric value when interpreted as an usize does not mean that they are interchangeable.

No, you misunderstood. Pointers have provenance. Integers do not. That would be fundamentally incompatible with large parts of GVN and many arithmetic operations.

This makes int-to-ptr casts interesting because they have to "fake" a provenance, and LLVM gets that wrong (and all the other compilers have similar bugs).

But anyway, that's not very relevant to this discussion I think.
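For readers who want to see the cast sequence under discussion, here is a minimal sketch of a pointer-to-integer-to-pointer round trip. It shows only the operations; which provenance the final pointer carries is exactly the open modelling question in this thread, and nothing here settles it. The function name round_trip is hypothetical.

```rust
/// Casts a pointer to an integer and back, then reads through the result.
fn round_trip(value: &i32) -> i32 {
    let ptr = value as *const i32;
    let addr = ptr as usize; // pointer-to-integer cast: just an address now
    let back = addr as *const i32; // integer-to-pointer cast
    // SAFETY: `back` has the same address as `ptr`, which points to a live `i32`
    // borrowed for the duration of this call.
    unsafe { *back }
}

fn main() {
    let x = 7;
    assert_eq!(round_trip(&x), 7);
}
```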

asajeffrey commented 6 years ago

@RalfJung I think the issue of whether integers carry provenance is a tricky one. My understanding is that this is unspecified in C11, that there are different proposals for how to address this in C2x, and that the current state of the art is http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2263.htm#pointer-provenance-via-integer-types.

But yes, this is an issue about the semantics of pointers, not their representation.

RalfJung commented 6 years ago

My understanding is that no optimizing compiler makes integers carry provenance because if they did, half of their optimizations become unsound. Almost all of GVN relies on the assumption that equal integers are exchangeable. So from all I know, that question is pretty much settled, practically speaking. If LLVM and GCC (and MSVC and ICC) agree in the behavior of their IRs, TBH I do not think it matters very much what the C standard says. Also, personally, I think breaking integers like that is a rather ludicrous proposal. I mean we can hardly explain to people that pointers are funny that way, how can we expect to ever be able to do that for integers? And this time I'd be entirely on their side -- IMO people proposing to equip integers with provenance have not just drunk but entirely drowned in the "the only thing that matters is performance" kool-aid.

But to be fair, I have not read those new documents carefully yet; they are on my list. But in line with my declaration of sanity in programming languages, I intend to push back against integers being anything more than integers. ;)

asajeffrey commented 6 years ago

@RalfJung personally I agree with you, but if C2x ends up putting provenance on every byte then we're going to need an interop story.

briansmith commented 6 years ago

Yes, I believe that we don't care about:

  • non-octet-byte platforms

[...]

Whatever platforms are considered out-of-scope should be documented as such. In particular, I would like to see the documentation explicitly state that architectures (e.g. some DSPs) where bool is larger than one byte are not supported. I think it would be good to have the rustc code check for such preconditions for a target and bail when they are violated.

briansmith commented 6 years ago

The problem I have seen with bool from an ABI perspective is that while Rust defines it to be a single byte either 0 or 1, attempts to document that bool is guaranteed to interoperate with C _Bool through the FFI were rejected, yet bool is still allowed in FFI declarations without warning. In particular, there's no documentation of the extern "C" calling convention for bool arguments or return values.

It seems this issue is a generalization of that to all integer types.

IMO, it is important to define the Rust types as being exactly equivalent to the corresponding ISO C types in all respects, because ABIs are generally documented in terms of C types and not Rust types. For the cases where Rust enforces constraints beyond ISO C, those constraints can be documented as part of that mapping. Then we can state that for the case of extern "C" and #[repr(C)], rustc generates code that conforms to the ABI according to the documented mapping of types, regardless of what the ABI is, as long as the ABI conforms to Rust's additional constraints.

I also recommend deprecating the libc types that would be (arguably already are) redundant, or at least discouraging their use.

Gankra commented 6 years ago

Are DSPs the only notable platform where the Rust definition of bool doesn't incidentally match the platform's C definition?

(that seems believable and acceptable, since DSPs are, as I understand it, a common source of things we don't want to support)

sfackler commented 6 years ago

We do guarantee bool is C-compatible - see https://github.com/rust-lang/rust/pull/46176 and https://github.com/rust-lang/rust/pull/46156.

briansmith commented 6 years ago

We do guarantee bool is C-compatible - see rust-lang/rust#46176 and rust-lang/rust#46156.

I read rust-lang/rust#46176 and I understand that it was decided to not reject use of bool in FFI declarations since it would not be a backward-compatible change. I read rust-lang/rust#46156 and I understand it documents the size of bool but doesn't address any other issues that would make it compatible or incompatible with C's _Bool type, especially ABI considerations beyond size and the values of true and false, like alignment and padding and how they fit into the function calling convention. (See the discussions motivating #[repr(transparent)]. See also recent AMD64 ABIs that specify the calling convention for _Bool by defining it to be in the integer class.)

briansmith commented 6 years ago

I read rust-lang/rust#46176 and I understand that it was decided to not reject use of bool in FFI declarations since it would not be a backward-compatible change.

In particular, my understanding is that it was decided to let people assume bool is compatible with _Bool, but nowhere is it documented ("guaranteed") that bool is compatible with _Bool, especially w.r.t. the target's function calling convention.

Gankra commented 6 years ago

I wrote this big thing detailing what I believe to be true about layouts and ABIs in rust: https://gankro.github.io/blah/rust-layouts-and-abis/

briansmith commented 6 years ago

I wrote this big thing detailing what I believe to be true about layouts and ABIs in rust: https://gankro.github.io/blah/rust-layouts-and-abis/

Thanks. That matches what I would expect.

One nit: "Here is a table of the ABIs of the core primitives in Rust, which C/C++ types they are guaranteed to be ABI compatible with,"

I'm not sure if you're saying that you already think that statement is true (that somewhere official documentation guarantees that equivalence). The problem that this issue is attempting to address is that there isn't such a guarantee in any official documentation yet.

Gankra commented 6 years ago

We're relying on these bridgings being accurate in Firefox, as is every other project using bindgen/cbindgen. And these projects have worked closely with the Rust team to make sure we're not running afoul of anything. I agree these claims should however be formally documented in e.g. The Reference or something.

avadacatavra commented 6 years ago

should this discussion deal with all scalar types (aka should we include characters in this discussion?)

strega-nil commented 6 years ago

@avadacatavra char [C++] is equivalent to either i8/u8 (although you can use either for ABI compat); char [Rust] is not ABI compatible with anything (although I'd argue it'd be useful to be ABI compatible with char32_t) (note: that type only exists in C++).

People have argued that they shouldn't be ABI compatible, since char32_t doesn't have the correctness guarantees Rust's char does; I would argue that it's the same idea as C-like enums in Rust vs enums in C++.
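The "correctness guarantees" mentioned here can be sketched concretely: Rust's char is four bytes like a 32-bit code unit, but only Unicode scalar values are valid, so surrogates and out-of-range values are rejected. The helper name checked_char is hypothetical; char::from_u32 is the standard library's checked conversion.

```rust
use std::mem::size_of;

/// Converts a 32-bit value (e.g. one received as a `char32_t`) to a Rust `char`,
/// returning `None` if it is not a Unicode scalar value.
fn checked_char(code: u32) -> Option<char> {
    char::from_u32(code)
}

fn main() {
    assert_eq!(size_of::<char>(), 4);
    assert_eq!(checked_char(0x41), Some('A'));
    // Surrogate code points are not valid `char` values.
    assert_eq!(checked_char(0xD800), None);
    // Neither is anything above U+10FFFF.
    assert_eq!(checked_char(0x11_0000), None);
}
```

This is why taking a u32 at the FFI boundary and converting it with a checked function is the conservative choice while the compatibility question stays open.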

briansmith commented 6 years ago

I wrote this big thing detailing what I believe to be true about layouts and ABIs in rust: https://gankro.github.io/blah/rust-layouts-and-abis/

In https://gankro.github.io/blah/rust-layouts-and-abis/#the-layoutsabis-of-builtins, it would be useful to define the ABI correspondence for function parameters x: &[T; n] and C T x[static n] and T *x. ring is one crate that depends on this correspondence.

Gankra commented 6 years ago

@briansmith I believe that is implicit in pointer ABI matching and array layout matching. I'm not aware of any system under which the ABI of a pointer depends on the pointee's type, and array types in function parameters are just sugar for pointers.
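A sketch of the correspondence being relied on, assuming (as argued above, not as a documented guarantee) that a reference to a fixed-size array is passed like a plain pointer to its first element. The function sum4 and its C prototype are hypothetical.

```rust
use std::mem::size_of;

/// Hypothetical C-side prototype, for illustration only:
///     uint32_t sum4(const uint32_t x[static 4]);
pub extern "C" fn sum4(x: &[u32; 4]) -> u32 {
    x.iter().fold(0u32, |acc, &v| acc.wrapping_add(v))
}

fn main() {
    // A reference to a sized array is a thin pointer, the same size as `*const u32`.
    assert_eq!(size_of::<&[u32; 4]>(), size_of::<*const u32>());

    let data = [1u32, 2, 3, 4];
    assert_eq!(sum4(&data), 10);
}
```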

gnzlbg commented 6 years ago

How is usize intended to be defined on various platforms?

  • the native size of a pointer?
  • the max of various other considerations?
  • other edge cases to consider?

Summarizing the discussion about usize/isize so far, we have already committed to these having the same size as a native pointer and changing that at this point would be a big breaking change.

The representation of usize determines many things, like:

etc. We should document these, but they don't change usize's representation so we don't have to document all of these things right now.

This definition would also limit the problematic platforms to those that either do not have a native pointer size (can't think of any) or those that have multiple native pointer sizes (near and far pointers in segmented architectures). I'd say it's ok to worry about them when someone tries to add support for them (for segmented archs one could pick one of the pointer types as "native" and add newer types for the rest).
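The summary above (usize and isize have the size of a native pointer) can be checked on whatever target the code is compiled for. A minimal sketch; std::os::raw::c_uint is included only to show that C's unsigned is a separate question, and its size is printed rather than asserted because it varies by platform.

```rust
use std::mem::size_of;
use std::os::raw::c_uint;

fn main() {
    // usize and isize match the size of a thin data pointer.
    assert_eq!(size_of::<usize>(), size_of::<*const u8>());
    assert_eq!(size_of::<isize>(), size_of::<*const u8>());
    assert_eq!(usize::BITS as usize, 8 * size_of::<usize>());

    println!(
        "pointer: {} bytes, usize: {} bytes, C unsigned: {} bytes",
        size_of::<*const u8>(),
        size_of::<usize>(),
        size_of::<c_uint>()
    );
}
```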

gnzlbg commented 6 years ago

@sfackler

We do guarantee bool is C-compatible - see rust-lang/rust#46176 and rust-lang/rust#46156.

The merged PR specifies bool to be of size 1, which is not the same as being C-compatible [0]. The intent seems to have been to guarantee compatibility with C's _Bool type, e.g., @withoutboats called this out in the discussion here:

  • bool has the same representation as the platform's _Bool type.

  • We document this, and also document that on every platform we currently support, this means that the size of bool is 1.

and with more rationale here:

People could come to the conclusion that they need a c_bool type for their FFI to be forward compatible with platforms we don't yet support. I think defining it as the same representation as _Bool / C++ bool makes it the least likely someone does something painful to avoid entirely hypothetical problems.

We have to decide whether we want bool to have the same representation as C's _Bool type (C FFI safe), or whether we want to make bool have size 1 (C FFI unsafe?). We could also make bool have size 1 and be C FFI safe by trading out support for platforms in which _Bool does not have size 1.


[0] The MSVC 2012 docs mention that in MSVC <= 4.2 the bool type is 4 bytes wide.