Open RalfJung opened 5 years ago
But this means that e.g. for
(u8, u16)
, which has size 4, it will transmute these pairs into u32 and compare them as integers.
If you don't mind me asking, is the transmute
itself UB as well? In particular, I had thought that the in-memory representation of tuples isn't guaranteed? Or is it just the ordering of fields that isn't guaranteed?
Well, the code checks that the sizes match. Ignoring complicated issues around pointers, the only way in which transmuting a 4-byte piece of data to u32
is UB is if there are bit patterns that are not valid for u32
-- which boils down to the question of whether uninitialized bytes are allowed in u32
.
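To make the (u8, u16) case concrete, here is a minimal sketch that counts the padding in such a 4-byte pair. Since the layout of Rust tuples is not guaranteed, it uses a hypothetical #[repr(C)] struct as a stand-in; the names Pair and padding_bytes are illustrative and do not come from crossbeam.

```rust
use std::mem::size_of;

/// Hypothetical stand-in for `(u8, u16)` with a guaranteed C layout:
/// `a` at offset 0, one padding byte, `b` at offset 2.
#[repr(C)]
#[derive(Clone, Copy)]
pub struct Pair {
    pub a: u8,
    pub b: u16,
}

/// Number of bytes in `Pair` that belong to no field.
pub fn padding_bytes() -> usize {
    size_of::<Pair>() - (size_of::<u8>() + size_of::<u16>())
}
```

That one leftover byte is what ends up inside the u32 when the pair is transmuted.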
It seems C/C++ standard committee members tend to agree that all bit patterns are valid for u32
and other unsigned integer types: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p0907r4.html "For unsigned integer types, the bits of the object representation are ..." I think it's reasonable, because (once we agree to use two's complement) there's no other sane choice.
@RalfJung That sounds overly restrictive to me.
IMO, it should be possible to transmute a value of any type T
into [u8; size_of::<T>()]
, read those bytes, and transmute back into T
without UB. I would bet a lot of C and Rust code has been written under that assumption.
@stjepang To be very pedantic (sorry!), u8
is special in that everything can be legally cast to an array of u8
in Rust or char
in C/C++. But u32
is not special in that sense, at least according to the C/C++ standard :(
And yet I agree with your sentiment: I believe representing an object of any type as u32
or any other unsigned integer type should not be UB. Edit: in other words, u8
should not be special in the first place.
I am not aware of any precedent for u8 being special in Rust, and in fact all the discussion of the invariants of integer types I'm aware of treats all integer types alike. It's true that in C and C++, char
and its signed/unsigned variants are very special, but that is precisely one of the thorny parts of writing unsafe code in those languages that Rust should try to smooth out.
Thanks for the clarification! I agree with you and would expect u8
to not get special treatment over other integer types. It's all just bytes to me.
This is Rust, not C++. :)
"For unsigned integer types, the bits of the object representation are ..." I think it's reasonable, because (once we agree to use two's complement) there's no other sane choice.
Remember that every bit can be 0, 1 or uninitialized. Even the C++ standards committee acknowledges this, though just indirectly -- namely, it acknowledges that "indeterminate values" can be "unstable" in the sense that x == x
on an integer can be false. This can only be the case if there are states beyond 0 and 1.
I agree every sequence of 0 and 1 is fine in u32
. But for uninitialized, it is far from reasonable to allow such bits in a u32
. (And indeed I picked u32
as an arbitrary example here, the rules should be the same for all integer types.)
If we allow that, we have to answer other hard questions like what happens when you do arithmetic or comparison involving uninitialized bits. That's why I think that any uninitialized bits (outside of padding) should only exist in a MaybeUninit
-- that is, a [MaybeUninit<u8>; N]
can represent any data, but a [u8; N]
cannot.
There is a conflict between permitting arithmetic and other operations on a type, and permitting it to hold "strange bits" (uninitialized bits, but also transmuted pointers can be a problem here) -- and since u8
permits arithmetic, it should not permit "strange bits".
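As a rough illustration of the [MaybeUninit<u8>; N] idea, the sketch below views a padded value as raw bytes without ever claiming that the padding byte is an initialized u8. The Pair type and the helper names are assumptions made for this example only.

```rust
use std::mem::{transmute, MaybeUninit};

/// Hypothetical padded type: `a` at offset 0, padding at offset 1, `b` at offset 2.
#[repr(C)]
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Pair {
    pub a: u8,
    pub b: u16,
}

/// Any `Pair` can be viewed as raw bytes: `MaybeUninit<u8>` may hold
/// uninitialized data, so the padding byte is not a problem here.
pub fn to_raw(p: Pair) -> [MaybeUninit<u8>; 4] {
    unsafe { transmute(p) }
}

/// Safety: `raw` must have been produced by `to_raw`, so that the bytes
/// backing the fields `a` and `b` are initialized.
pub unsafe fn from_raw(raw: [MaybeUninit<u8>; 4]) -> Pair {
    unsafe { transmute(raw) }
}

/// Safety: `raw` must have been produced by `to_raw`. Byte 0 backs the
/// field `a`, so it is initialized; byte 1 (padding) is never read.
pub unsafe fn first_field_byte(raw: &[MaybeUninit<u8>; 4]) -> u8 {
    unsafe { raw[0].assume_init() }
}
```

Doing the same with [u8; 4] instead would assert that all four bytes, including the padding, are initialized.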
It's all just bytes to me.
@RalfJung Fair enough, this is difficult. :(
It might be helpful to look at what C/C++ do on atomic compare-exchange with arbitrary (non-POD) data. This is interesting:
The comparison and copying are bitwise (similar to std::memcmp and std::memcpy); no constructor, assignment operator, or comparison operator are used.
When a weak compare-and-exchange would require a loop and a strong one would not, the strong one is preferable unless the object representation of T may include padding bits, (until C++20) trap bits, or offers multiple object representations for the same value (e.g. floating-point NaN). In those cases, weak compare-and-exchange typically works because it quickly converges on some stable object representation.
For a union with bits that participate in the value representations of some members but not the others, compare-and-exchange might always fail because such padding bits have indeterminate values when they do not participate in the value representation of the active member.
Padding bits that never participate in an object's value representation are ignored.
Source: https://en.cppreference.com/w/cpp/atomic/atomic/compare_exchange
Boost atomics will do reinterpret_cast
just like AtomicCell
(source). However, on MSVC-8 and MSVC-9, spinlocks are used because otherwise the compiler would miscompile code (source).
Yeah, I figured this would be related to memcmp
but unfortunately I was not able to find anything precise about memcmp
on padding bytes. We have two kinds of problems here:
Is the LLVM IR we currently generate for this valid, or does it have UB? The atomic comparison and byte_eq
are working with poisoned data, so I am not sure. We could argue that atomic reads are frozen (which is an assumption AtomicCell
already makes for volatile reads), which would help for the atomic comparisons but not for the fallback case (the data compared by byte_eq
is read with a normal, non-atomic non-volatile load). However, the C++ spec you are quoting makes it sound like it is not at all clear that atomic reads are frozen...
Beyond the LLVM rules, are there extra Rust rules this violates? This is where the part about having uninitialized bits in a u32
comes in. I'd prefer if this was all MaybeUninit<u32>
but of course we don't have atomic operations on those...
@Amanieu @stjepang with the newly proposed ptr::freeze
and if you remove AtomicCell::get_mut
, I think this can be made sound. You basically just have to make sure that you only ever write frozen stuff into the cell, and freeze things before you compare them. In contrast to the "zero padding"-based schemes @Amanieu suggested at the all-hands, this does not require a special function that knows where the padding bytes are.
That doesn't exactly solve the problem though: compare_exchange
returns a T
with the current value in the atomic if the compare fails. In the most common use case (CAS loop) this will be passed back in as the current
argument to compare_exchange
in the next loop iteration.
However since the padding bytes in T
are always undef, the freeze
operation could freeze them into a different value than the one in the atomic, which would cause the next compare_exchange
call to fail even though the value in the atomic was not changed since the last iteration. This could result in the CAS loop never converging to a stable state.
@RalfJung Awesome news!
@Amanieu Are you sure? We have this if statement that should solve the issue you're describing. Note that compare_exchange
will never fail spuriously due to the difference in undef bytes.
@stjepang Ah right. So you won't get spurious failures, but you could in theory get stuck in an infinite loop.
@Amanieu Hmm, not sure I understand how. If we do cell.compare_exchange(old, new)
and we know cell.get() == old
, the operation will always succeed, even if the two values differ in padding bytes.
Could you perhaps sketch out a simple example? I don't see the theoretical infinite loop here.
Let us assume some 2-byte type (u8, __pad: u8). The first byte contains a valid value, the second byte is padding. The underlying atomic type used is AtomicU16.
1. The AtomicU16 is (7, 0).
2. We call cell.compare_exchange(old = (7, undef), new = (8, undef)).
3. The arguments are frozen into u16s with values: old = (7, 60), new = (8, 90).
4. atomic_compare_exchange_weak fails because the padding bytes don't match.
5. atomic_compare_exchange_weak returns Err((7, undef)): because it returns a T instead of a u16, the value is "unfrozen".
6. compare_exchange loops back to step 2 with old = (7, undef).
Now that I look at it this way, I think this is fixable if you keep the previous value as a u16 when passing it back up the loop instead of converting it back to a T.
Thanks! I agree about keeping the previous value as a u16
rather than going u16
-> T
-> u16
. That should solve the problem.
Ah good point, yes you need to do the entire loop without round-tripping through T
.
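A minimal sketch of such a loop, under a simplifying assumption: the low byte of a u16 models the real field and the high byte plays the role of the padding, but as ordinary initialized integer data, so the sketch itself never touches undef. The names value_of and compare_exchange_raw are hypothetical, and this is not crossbeam's actual implementation.

```rust
use std::sync::atomic::{AtomicU16, Ordering};

/// Low byte: the "real" field. High byte: stands in for the padding byte.
pub fn value_of(raw: u16) -> u8 {
    (raw & 0x00ff) as u8
}

/// Compare on the value byte only, but keep every retry in terms of the
/// raw `u16` that the atomic reported, never round-tripping through `T`.
pub fn compare_exchange_raw(cell: &AtomicU16, old: u16, new: u16) -> Result<u16, u16> {
    let mut expected = old;
    loop {
        match cell.compare_exchange_weak(expected, new, Ordering::AcqRel, Ordering::Acquire) {
            Ok(prev) => return Ok(prev),
            Err(actual) => {
                // A genuinely different value: report the failure.
                if value_of(actual) != value_of(old) {
                    return Err(actual);
                }
                // Only the "padding" differed (or the weak CAS failed
                // spuriously): retry with the exact bits we just observed.
                expected = actual;
            }
        }
    }
}
```

With the cell holding (7, 0), a call with old = (7, 60) and new = (8, 90) fails once on the padding mismatch and then succeeds on the retry, because the retry reuses the observed bits (7, 0).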
It is a bit dissatisfying that we cannot also have get_mut
, though... all we need to have that, too, is an "atomic freeze operation" -- a way to freeze memory as the first thing compare_exchange
does, in a way that does not count as a data race in the memory model. That seems like a pretty powerful operation though, and I have no idea how to integrate it with the rest of the memory model.
Pragmatically speaking, an asm!
block probably has this effect as far as LLVM is concerned. But that's not really good enough for me.^^
Any thoughts on replacing get_mut
with store_mut
/load_mut
which, IIUC, could still be implemented soundly?
These could freeze, so yes, they could be implemented in a sound way in this model.
I believe (1) Rust should not treat padding bytes as invalid
values either, and (2) we should de-deprecate AtomicCell::get_mut()
. This idea did not just pop into my head; it is inherited from the precedent of C/C++.
Padding is not invalid
but unspecified
in C/C++. That means (1) you don't know the value of the padding, but (2) computations on the padding should not invoke undefined behavior. This is for a good reason. For example, padding bytes that do not introduce undefined behavior are essential in our implementation of AtomicCell::get_mut()
.
AFAIK padding bytes are indeterminate in C/C++. That's the same status as uninitialized memory, and it is very different from unspecified. Whether or not computations on indeterminate values are UB is not clear from the standard, but:
In the case of an indeterminate value all bit-patterns are valid representations and the actual bit-pattern may change without direct action of the program.
Source: http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1793.pdf
In particular, comparing indeterminate values with anything is meaningless, because they do not have a stable bit pattern. This is a problem for compare_exchange
, which performs exactly this kind of comparison.
Also, another point not clear from the standard is what happens when you memcpy
a 4-byte struct including a padding byte into a uint32_t
-- does the entire integer value become indeterminate, or is this somehow tracked on a per-byte basis? From what I recall from the LLVM poison
discussion, there are optimizations that are incompatible with either choice.
According to C17 6.2.6p6, "padding bytes take unspecified values":
When a value is stored in an object of structure or union type, including in a member object, the bytes of the object representation that correspond to any padding bytes take unspecified values.51) The value of a structure or union object is never a trap representation, even though the value of a member of the structure or union object may be a trap representation.
Interesting. However, we also have C17 6.7.9 p9:
unnamed members of objects of structure and union type do not participate in initialization. Unnamed members of structure objects have indeterminate value even after initialization.
and 6.7.2.1 p15 + p17:
There may be unnamed padding within a structure object, but not at its beginning.
There may be unnamed padding at the end of a structure or union.
Also, to my knowledge, making padding indeterminate is needed to make SROA (scalar replacement of aggregates) a valid optimization.
Also note footnote 51 in your quote is
51) Thus, for example, structure assignment need not copy any padding bits.
But that doesn't square with just being unspecified: if I have an uninitialized variable and initialize it by assignment, and the assignment does not copy the padding bits, then those bits will remain indeterminate.
For unnamed members/paddings, I think the standard intentionally distinguished members and paddings: paddings are not members, and unnamed members are indeterminate and unnamed paddings are unspecified. Also, I think the footnote 51 should be interpreted as informative, but not normative. It probably suggests an implementation strategy.
I think SROA will be much more interesting than language laws. Would you please give me a reference or example why paddings should be indeterminate for validating SROA?
I think that's just an artifact of wording -- padding is a kind of unnamed member. What other unnamed members would there even be?
Would you please give me a reference or example why paddings should be indeterminate for validating SROA?
Well, one effect of SROA is that only the elements of a struct get copied, not its padding -- exactly what the footnote alludes to.
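#include <stdint.h>
typedef struct Foo { uint8_t f1; uint16_t f2; } Foo;
Foo bar(Foo foo) {
    Foo foo2 = foo; // assume this one gets SROA'd
    return foo2;
}
Foo bar2(Foo foo) {
    Foo foo2;
    foo2.f1 = foo.f1; // only the fields are copied,
    foo2.f2 = foo.f2; // the padding byte is never written
    return foo2;
}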
bar
is basically transformed to bar2
: it will return a Foo
where the two fields got copied from the argument, but the padding was left uninitialized -- and hence is indeterminate.
Unnamed fields are anonymous ones: https://gcc.gnu.org/onlinedocs/gcc/Unnamed-Fields.html I still think paddings are not unnamed members mentioned in the standard.
I believe paddings are never indeterminate in C, so your SROA example is sound even though the padding bytes are unspecified values.
Unnamed fields are anonymous ones: gcc.gnu.org/onlinedocs/gcc/Unnamed-Fields.html
How would it make any sense that those members do not get initialized?
I still think paddings are not unnamed members mentioned in the standard.
Given that padding is explicitly called out as "unnamed", I disagree with that reading.
Notice that this interpretation is not just mine, I took it from https://wiki.sei.cmu.edu/confluence/display/c/EXP42-C.+Do+not+compare+padding+data which explicitly quotes the above parts of the standard when talking about padding.
I believe paddings are never indeterminate in C, so your SROA example is sound even though the padding bytes are unspecified values.
I don't follow. Uninitialized variables are indeterminate. Why would their padding not be...?
The standard is generally extremely vague about all this indeterminate/unspecified/trap representation business. I think it is much clearer to follow the approach proposed for LLVM, where "indeterminate" becomes poison
and there's a proper formal definition. Uninitialized memory is poison
. The only way to explain my example above is to say that copying a struct resets all padding to poison
.
On "unnamed":
I think distinguishing unnamed members and unnamed paddings is the only way to make sense of C17 6.2.6p6 and [C17 6.7.9 p9 and 6.7.2.1 p15 + p17] at the same time. Even if they conflict with each other, 6.2.6p6 describes the semantics of padding more clearly, in unambiguous words: padding bytes are unspecified. I prefer to follow it.
On uninitialized struct/union values:
According to C18, struct and union values, even if uninitialized, should not be a trap representation. (Here, "trap representation" means what we called "indeterminate" in this discussion: C18 3.19.2 says an indeterminate value is "either an unspecified value or a trap representation".) C18 6.2.6p6 says: "The value of a structure or union object is never a trap representation, even though the value of a member of the structure or union object may be a trap representation."
On "unnamed"
I see where you are coming from. OTOH, I feel if the people doing the CERT C Coding Standard interpret it the way I quoted, that probably means they are not alone. So at this point it probably stops being useful to stare at the vague C standard.
On uninitialized struct/union values:
Yeah, unfortunately the entire "trap representation" stuff in C/C++ is really poorly specified. For example, integers also don't have "trap representations" (I believe the next standard will even say that), and yet in the mainstream C compilers there are "weird" integers like undef
and poison
which definitely cannot be treated like "normal" values in the C standard (for example, x == x
may return false
and x == 0 ? 0 : y/x
is UB because x
might be non-zero the first time you look at it and 0 the second time).
So it is probably more useful to look at what compilers actually do. The only one I am familiar with and the most relevant for Rust is LLVM. LLVM trunk is inconsistent in what it does, but there is a proposal for a consistent model. In that model, structs and struct fields can definitely be poison
. The part about structs not having trap representations likely stems from structs not being stored in registers, and registers with a constrained value range being the source of these "trap representations". But nowadays this is used in a whole different way. So for the purpose of this discussion I think we can safely ignore any part of the C standard that says certain things cannot have "trap representations"; compilers have poison
/undef
for all sorts of values.
So, in the poison
model, I claim that reading padding gives back poison
. In current LLVM, with both poison
and undef
, padding is undef
. Even though I have been told this several times, I cannot find a good reference for it. The aforementioned paper says
LLVM uses undef in a few other situations. First, to represent the values of padding in structures, arrays, and bitfields.
Similar things are said in this bugreport. @rkruppe do you know if the LLVM IR specifies this padding stuff properly anywhere?
Since a mere read access to some location in memory cannot even "know" if it is reading padding or whatever, the only way I see to realize "reading padding gives undef
/poison
" is to say that copying a struct (at struct type) will reset the padding to undef
/poison
. But that doesn't even matter that much; it seems quite clear that with get_mut
we can make the compare_exchange
code read padding bytes, and hence (according to what everyone seems to agree LLVM does) read poison
/undef
and use them in computations.
Also see this where more people agree with me that inspecting padding is UB in modern compilers (they say that violates the standard, which I am not so sure about, but given the ambiguity of the standard that does not seem like a useful discussion).
I think the difference is on the viewpoint. I, as a systems programmer, think that get_mut()
is sane, and it should be supported in any systems programming language. From that viewpoint, I try to find evidence for allowing reads of uninitialized padding, such as paragraphs in the standards and implementations in compilers. On the other hand, you (and obviously I to a certain extent), as programming language researchers, think that there should be a good programming model that explains the real-world practice. Unfortunately, this conflict doesn't seem likely to be resolved anytime soon :( But for the purpose of library building, I prefer to take the viewpoint of systems programmers.
To summarize my viewpoint, (1) get_mut()
is sane from a systems programmer's viewpoint, and (2) if get_mut()
is currently not supported, it's the programming language's fault, and furthermore, (3) we should revise the undef/poison model (or anything else) so as to explain get_mut()
, unless we could find a fundamental reason why it's undesirable to do so.
@stjepang For this reason, I'd like to propose to revert #332 (Deprecating AtomicCell::get_mut).
I, as a systems programmer, think that get_mut() is sane, and it should be supported in any systems programming language.
I would like to support it! However, the way we ended up in this messy situation with C/C++ is by ignoring the fact that a language must have a single consistent semantics, and relying on mutually incompatible axioms and intuition instead.
In my view, your argument is flawed. From a systems programmer's viewpoint, one could argue that all sorts of things are "sane" -- things like benign data races or OOB loads on a definitely mapped page or implementing small atomic accesses with bigger accesses. The issue is that these arguments mix up intuition formed by looking at "what the hardware does", with "what the programming language is supposed to do". This is not a sensible way to argue, and it leads to miscompiled programs quickly.
What you are doing here is closing your eyes in the face of the problem and hoping that that makes it go away. It won't go away. It's not just "the language's fault"---we can make get_mut
sound if we disable a bunch of optimizations. What we don't know is how to have both get_mut
and these optimizations. Finding a way to reconcile this conflict is of course our responsibility as language designers, but that does not mean that we can just axiomatize it away. It is not uncommon that having two things that both seem intuitive is not actually possible at the same time (as you are well aware of course). We can either follow the C approach of "YOLO whatever, it won't be too bad" (leading to mysteriously miscompiled programs, long debugging sessions and the entire rest of the mess that is called "C semantics"), or we can acknowledge that when formal semantics are in conflict with systems programming intuition, sometimes there's an actual problem here that we cannot just ignore away.
So, in an attempt to make this conversation more constructive, what kind of feature would we need in the language to make get_mut
compatible with compare_exchange
? The soundness argument for compare_exchange
relies on the invariant that the content of the AtomicCell
is frozen at all times. get_mut
allows that invariant to be violated, so we cannot rely on memory to be frozen. Hence we must freeze it ourselves. It seems to me what we would need is an atomic version of ptr::freeze
, which we can use to freeze the memory when compare_exchange
begins. We also have to make sure that racing writes do not write unfrozen data, but we control that code so that is easy.
So, we could have
fn store(&self, mut val: T) {
    ptr::freeze(&mut val);
    // do the atomic store, *at a type that preserves padding*
}
fn compare_exchange(&self, mut val1: T, mut val2: T) {
    ptr::atomic_freeze(self.value.get());
    ptr::freeze(&mut val1);
    ptr::freeze(&mut val2);
    // do the CAS loop, *at a type that preserves padding*
}
@Amanieu does that make sense to you? I am mostly worried about what it means that we now might perform a "write" operation even if the CAS failed. Alternatively, we could specify that CAS freezes the memory before comparing, but that seems extremely risky given that we have no idea which optimizations LLVM will perform on this.
I think an argument could be made that compare_exchange
should effectively do the same thing as a volatile load (i.e. it reads a frozen value from the AtomicCell
).
In the end I think we should just do whatever C++ does (where get_mut
is effectively the inverse of atomic_ref
). The LLVM people will figure out the semantics for padding bytes and we should just copy whatever they do. This means that we should probably support something similar to get_mut
.
I think an argument could be made that compare_exchange should effectively do the same thing as a volatile load (i.e. it reads a frozen value from the AtomicCell).
I was told that LLVM is looking into transformations that will replace atomic operations by non-atomic operations when the compiler can prove that there are no race conditions. That however means that we cannot say that atomic operations read frozen values.
@RalfJung I still think my previous comment quite clearly summarizes what the standard says about padding (they are just random bits and safe to access), and I think actually it's quite good semantics for everyone. As far as I understand, existing optimizations, as well as system programmer's idioms, are supported. In particular, get_mut()
is sound. And it's the status quo.
So to say get_mut()
should be unsound, I expect there should be an optimization that becomes incorrect in the presence of get_mut()
, or of padding being safe to read. Just the existence of a language model (e.g. the freeze model) is, at least for me, not as convincing as a counter-example.
I still think my previous comment quite clearly summarizes what the standard says about padding (they are just random bits and safe to access), and I think actually it's quite good semantics for everyone.
Well, it's not how LLVM implements padding, so it's clearly not good for everyone, and certainly not good for Rust which uses LLVM as backend. I have heard your argument, there is no point in repeating it. I just don't agree, this is not how I interpret the C standard. I won't repeat my arguments now as you already know them.
And anyway the C standard is only relevant insofar as we might use it to deduce things about the LLVM semantics. And there I can easily craft examples that show that padding can indeed turn into undef
any time:
pub fn test(x: (u32, u64), f: fn(&(u32, u64))) {
    let y = x;
    f(&y);
}
gets compiled to
define void @_ZN10playground4test17hc74ff9e76a3b675aE(i32 %x.0, i64 %x.1, void ({ i32, i64 }*)* nocapture nonnull %f) unnamed_addr #0 {
start:
  %y = alloca { i32, i64 }, align 8
  %0 = bitcast { i32, i64 }* %y to i8*
  call void @llvm.lifetime.start.p0i8(i64 16, i8* nonnull %0)
  %1 = getelementptr inbounds { i32, i64 }, { i32, i64 }* %y, i64 0, i32 0
  store i32 %x.0, i32* %1, align 8
  %2 = getelementptr inbounds { i32, i64 }, { i32, i64 }* %y, i64 0, i32 1
  store i64 %x.1, i64* %2, align 8
  call void %f({ i32, i64 }* noalias nonnull readonly align 8 dereferenceable(16) %y)
  call void @llvm.lifetime.end.p0i8(i64 16, i8* nonnull %0)
  ret void
}
As you can see, this creates a local allocation %y
, then copies the two fields into y
, and then passes %y
to f
. In particular, the padding in y
does not get initialized, and hence is undef
.
This can only be explained in Rust by saying that the let y = x
copies the fields of x
but resets the padding to undef
.
I think this conclusively proves that in the LLVM model, get_mut
will let undef
creep into Atomic<(u32, u64)>
. And since a conditional jump on the result of comparing undef
is UB, this means that get_mut
is UB unless we have some way to "atomically freeze" memory before compare_exchange
.
So to say get_mut() should be unsound, I expect there should be an optimization that becomes incorrect in the presence of get_mut(), or of padding being safe to read. Just the existence of a language model (e.g. the freeze model) is, at least for me, not as convincing as a counter-example.
So your definition of "soundness" is "you can't give a counterexample"? I am afraid that is not a definition I can accept. You can't make a function "sound" by messing with it until the current compiler compiles it the way you want. I am honestly shocked to see you even propose this.
I think there is a wide consensus in the Rust community that "soundness" has nothing to do with compiler optimizations. It is about whether your program has UB according to the spec. The spec doesn't exist yet but we can try to infer what pieces of it have to be, like I did in the beginning of this post. You don't get to redefine this term.
Or maybe you gave up on soundness, and only care about this function doing the right thing de facto. That's a fair position, albeit not one I share. What is an interesting question is constructing a sequence of reasonable-looking optimizations that break Atomic<T>
in an observable way. However, I am afraid that game is not very fun to play. I think I did the key step above by demonstrating how to put an Undef
into an initialized (u32, u64)
in safe code.
Here's another example demonstrating that the padding of any struct has to be considered undef
, no matter how "thoroughly" it got initialized or frozen:
#![crate_type="cdylib"]
#[repr(C)]
pub struct X {
    a: u32,
    b: u64
}
pub fn foo(x: X) -> u32 {
    unsafe {
        return std::ptr::read((&x as *const X as *const u32).add(1));
    }
}
becomes
define i32 @_ZN10playground3foo17hdddbf0857401c2c8E(i32, i64) unnamed_addr #0 {
start:
  ret i32 undef
}
(Thanks to @nagisa.)
People reading this might also be interested in the atomig
crate. Also see this discussion. Basically that is an entirely safe crate providing an Atomic<T>
type, but only where T: Atom
. There's no lock-based fallback; the trait just is not implemented for types that are too big. Because the crate is entirely safe it cannot have get_mut
, but it does provide all the other operations.
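For illustration, here is a small safe sketch in the same spirit: the pair is packed into a padding-free u32 before it ever reaches the atomic, so no uninitialized byte takes part in the comparison. This is not atomig's API; pack, unpack and PairCell are hypothetical names chosen for this example.

```rust
use std::sync::atomic::{AtomicU32, Ordering};

/// Pack a pair into a `u32` in which every bit is initialized.
pub fn pack(v: (u8, u16)) -> u32 {
    ((v.0 as u32) << 16) | (v.1 as u32)
}

/// Inverse of `pack`.
pub fn unpack(raw: u32) -> (u8, u16) {
    ((raw >> 16) as u8, raw as u16)
}

/// Hypothetical cell built entirely from safe code.
pub struct PairCell(AtomicU32);

impl PairCell {
    pub fn new(v: (u8, u16)) -> Self {
        PairCell(AtomicU32::new(pack(v)))
    }

    pub fn load(&self) -> (u8, u16) {
        unpack(self.0.load(Ordering::Acquire))
    }

    pub fn compare_exchange(
        &self,
        current: (u8, u16),
        new: (u8, u16),
    ) -> Result<(u8, u16), (u8, u16)> {
        self.0
            .compare_exchange(pack(current), pack(new), Ordering::AcqRel, Ordering::Acquire)
            .map(unpack)
            .map_err(unpack)
    }
}
```

The price is that there is no way to hand out a &mut (u8, u16) into the cell, which matches the observation that such a crate cannot offer get_mut.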
If Rust decides to guarantee that padding will always be 0
(in repr(Rust)
), then this whole problem goes away for such structs.
That issue is about adding the ability for types to opt-in to guaranteed-0 padding.
There is no proposal on the table that would make all padding always 0, and I doubt people would like that idea.
The first line of that issue description is "one option to gain more layout optimizations would be (either for all repr(Rust) types or with per-type opt-in) to use a different kind of padding."
But I agree that it might be hard to do it for all repr(Rust)
because of backward compatibility concerns. Thinking of unsafe code initializing a struct field-by-field using pointers.
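The kind of code meant here might look like the following sketch, which initializes a struct one field at a time through raw pointers. The struct X and the function name are assumptions for this example. Nothing in it ever writes the bytes between a and b (on typical 64-bit targets there are four of them), so a rule that padding is always 0 would not hold for the result.

```rust
use std::mem::MaybeUninit;
use std::ptr::addr_of_mut;

/// Hypothetical struct with padding between `a` and `b` on targets
/// where `u64` is 8-byte aligned.
#[repr(C)]
#[derive(Debug, PartialEq)]
pub struct X {
    pub a: u32,
    pub b: u64,
}

/// Initialize `X` one field at a time; the padding is never written.
pub fn init_field_by_field(a: u32, b: u64) -> X {
    let mut slot = MaybeUninit::<X>::uninit();
    let p = slot.as_mut_ptr();
    unsafe {
        // Write each field in place without creating a reference to
        // the still-uninitialized struct.
        addr_of_mut!((*p).a).write(a);
        addr_of_mut!((*p).b).write(b);
        // Every field is initialized now, so the value as a whole is valid.
        slot.assume_init()
    }
}
```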
AtomicCell<T>::compare_exchange will, if I understand the code correctly, use AtomicU*::compare_exchange_weak if T has the right size. But this means that e.g. for (u8, u16), which has size 4, it will transmute these pairs into u32 and compare them as integers. However, such pairs contain a padding byte, meaning that the u32 contains some uninitialized memory. Having uninitialized memory in a u32 is uncharted territory at best, but performing actual operations on such a datum (as opposed to just load/store) brings us very close to UB. I am not sure what the rules are in LLVM for this case, but under the proposed poison semantics I think the comparison will return poison and hence using it in an if is UB. For Rust, I think the intention is to make any operation on uninitialized data UB (like it is in C), meaning the comparison itself would already be UB.
Strictly speaking, this affects not just the fast path, but also the arbitrarily-sized case: bytes_eq might compare uninitialized bytes.