Pointer equality across allocations hits poorly specified parts of LLVM

mcy commented 3 years ago

While Rust defines raw pointer equality as a safe operation, it appears that this is not so in practice. Consider the following reduced example:

#[inline]
pub fn eq_inline<T>(a: *const T, b: *const T) -> bool {
    a == b
}

#[inline(never)]
pub fn eq_outline<T>(a: *const T, b: *const T) -> bool {
    a == b
}

#[inline(never)]
pub fn always_false() -> bool {
    let (a, b) = (0, 0);
    eq_inline((&a as *const i32).wrapping_add(1), &b)
}

#[inline(never)]
pub fn always_true() -> bool {
    let (a, b) = (0, 0);
    eq_outline((&a as *const i32).wrapping_add(1), &b)
}

pub fn main() {
    println!("{}, {}", always_false(), always_true())
}

(Godbolt)

On Rust 1.52.0 with rustc -O, this prints false, true; this is entirely dependent on the inlining attrs on the two eq functions. always_false is optimized to a constant return, while always_true passes identical pointers (rsp + 4, in this case) into eq_outline, which does a blind comparison.

The LLVM LangRef is unfortunately silent on this, but it appears that the root of the problem is that Rust emits the following code for pointer equality:

define ptr_eq(i32* readnone %a, i32* readnone %b) i1 {
  %eq = icmp eq i32 %a, %b
  ret i1 %eq
}

While this looks fine, Clang emits the exact same IR for comparisons of int*, and Clang is entitled to (and, in fact, makes) this optimization, because poitner comparisons across provenance domains is UB in C/C++. This makes me believe this is true UB, and not implementation-defined behavior.

In short, it is incorrect to lower pointer comparison, with Rust semantics, to icmp eq T*. I don't know of a workaround, since LLVM explicitly transmits provenance through bitcast and ptrtoint; in fact, icmp of pointers is defined as icmp of the pointers passed through ptrtoint. The solution may simply be to add some kind of optimization barrier to LLVM; who knows?

https://github.com/rust-lang/rust/issues/54685 is a related but distinct issue, since it's about comparison of function pointers, which are inherently global variables; the fix in LLVM is only relevant to globals, AFAICT.

workingjubilee commented 3 years ago

To clarify the bug, since I misunderstood it at first, if the #[inline(never)] or eq_{outline,inline} are changed, the resulting truth value of the comparison may change. I believe our current behavior may indeed be unsound.

nikic commented 3 years ago

While LangRef does not explicitly say so, it is our working assumption that pointer equality is integer equality and comparison of pointers of different provenance is not poison. Alive2 was recently updated with these semantics.

mcy commented 3 years ago

This seems to motivate that LLVM does not have these semantics today:

bool always_false(void) {
    int* p = malloc(4);
    int q;
    return p < &q;
}

(Godbolt)

Clang is fully entitled to believe that this is UB, but LLVM emits the naive address comparison. All seems well? There is probably an overlap here between alias analysis feeding back into pointer equality (i.e. pointers do not alias, so the user cannot observe them being equal) that does not extend to order comparisions. Maybe it's sufficient to believe that icmp for T* of distinct provenances is implementation-defined but not poison (freeze i1 poison if you like). Good to know this is what the Alive2 people are thinking.

It seems a bit difficult to disentangle general alias analysis from pointer equality, but such as it is, I think.

steffahn commented 3 years ago

Is there a better demonstration of actual unsoundness? In my view, always_false and always_false are totally allowed to lay out the local variables different in always_false vs always_true. Something like

#[inline(never)]
pub fn both() -> (bool, bool) {
    let (a, b) = (0, 0);
    (eq_outline((&a as *const i32).wrapping_add(1), &b), eq_inline((&a as *const i32).wrapping_add(1), &b))
}

doesn’t seem to produce inconsistent results. Where’s the actual problem?

Edit: By the way, I wasn’t able to fully understand the explanation below the example. Nonetheless, if there is any form of UB involved, it would help a lot if there was a better example that demonstrates any actually weird or problematic behavior in a way that doesn’t require understanding LLVM internals.

mcy commented 3 years ago

I don't think there's a good way to illustrate this without knowledge of LLVM internals. =( I can try to think of a more complex example, but I've only got so much free time- I'm hoping people who more actively work with LLVM directly than I do can definitively confirm whether Rust is emitting sane IR.

(Compare: Rust having to emit getelementptr rather than getelementptr inbounds for wrapping_add.)

inquisitivecrystal commented 3 years ago

@rustbot label +T-compiler

mcy commented 3 years ago

Honestly, those LLVM bugs don't make me feel any better. It's pretty clear that we're not sure if icmp folding should take advantage of provenance or not-- I couldn't tell if there was consensus anyways (in C++ land I would want this to be a thing, but not in Rust where we seem to believe provenance does not flow into comparisons).

It may be worth either closing this or punting it to UCG, since the Rust semantics are not nailed down yet anyways and this is much more a "what does LLVM do" question at this point. (It definitely seems like something to pay attention to given we have had nasty miscompilations around icmp in the past, e.g. fnptrs.)

RalfJung commented 3 years ago

While this looks fine, Clang emits the exact same IR for comparisons of int*, and Clang is entitled to (and, in fact, makes) this optimization, because poitner comparisons across provenance domains is UB in C/C++.

Which part of the C standard do you derive this from (specifically for eq)? lt across allocations is UB, but eq is always defined I think. (I am not sure about C++.)

In short, it is incorrect to lower pointer comparison, with Rust semantics, to icmp eq T*

I don't think this is correct. To my knowledge, comparing pointers in LLVM is always safe (eq, lt, all of them). There are a few bugs here, but those should be reported with LLVM.

Honestly, those LLVM bugs don't make me feel any better. It's pretty clear that we're not sure if icmp folding should take advantage of provenance or not-- I couldn't tell if there was consensus anyways (in C++ land I would want this to be a thing, but not in Rust where we seem to believe provenance does not flow into comparisons).

I think it is pretty clear that consensus is forming towards "no, provenance may not affect pointer comparison". There are LLVM transformations that assume this, and the provenance TS also mandates it. In the relevant bugreports, LLVM devs have to my knowledge never used the argument that provenance should be taken into account for pointer comparison.

In other words: yes, there are bugs, because a few of the LLVM transformations have been written before people started thinking properly about the semantics of LLVM IR. But I see no indication that LLVM actually desired to make pointer comparison anything other than what Rust needs it to be.

RalfJung commented 3 years ago

since LLVM explicitly transmits provenance through bitcast and ptrtoint

These are bugs, there are other LLVM optimizations that explicitly assume that provenance is not transmitted through bitcast and ptrtoint. Se this blog post for a more lengthy explanation.

So, overall -- I agree it is important to keep an eye on what LLVM is doing here, and maybe work with them to ensure what they do works for us, but I don't think this issue should carry the "unsound" tag.

workingjubilee commented 3 years ago

Untagged per apparent consensus that this is... not unsound, just really weird and definitely erroneous somewhere. I think this can stay open as a bug, tho', even if it's mostly defined by twiddling our thumbs and waiting for LLVM.

RalfJung commented 3 years ago

Untagged per apparent consensus that this is... not unsound, just really weird and definitely erroneous somewhere

Which concrete example given here is erroneous? The ones in the OP all seem fine.

RalfJung commented 3 years ago

@mcy could you clarify what you think is wrong about the examples your posted? They all seem fine to me.

Having different behavior between debug and release builds is completely expected. Allocation is non-deterministic, so different builds of the program (even different runs of the same build) can pick different physical addresses to actually place the allocations at.

mcy commented 3 years ago

@RalfJung I think we can close this, if you think it's a non-issue (beyond tracking that LLVM doesn't suddenly do anything wrong, which I think you're already keeping an eye on). I think my concern was around eq comparisons not being valid across provenances, since last time I had to look at this for C it turned out that the rules were, uh, not great. I think the consensus is correct: we should pay attention to whether LLVM suddenly starts doing something with icmp we don't expect.

Incidentally, when I read your blogpost last year, the impression I had was that this wasn't so much a bug in LLVM and more of poor wording in the LangRef that resulted in this emergent miscompilation between passes. In particular, these two lines under the pointer aliasing rules section concern me:

The result value of a bitcast is based on the operand of the bitcast.
A pointer value formed by an inttoptr is based on all pointer values that contribute (directly or indirectly) to the computation of the pointer’s value.

Of course, the joke is that no one actually understands the aliasing model. Fixing that is beyond the scope of this issue. =)

RalfJung commented 3 years ago

I think my concern was around eq comparisons not being valid across provenances, since last time I had to look at this for C it turned out that the rules were, uh, not great.

This is correct. But LLVM is not C. :)

I agree that it is worth keeping an eye out for any places where we rely on LLVM differing from C. But I don't think having this issue helps in that quest. Maybe we should make a dedicated list of the cases where we do that... though I don't know where to put that list such that we'll find it when needed. ;)

So I'll close this issue then. Thanks for bringing it up, being diligent in these areas is certainly important so if you notice anything similar in the future, please do let us know. :)

mcy commented 3 years ago

I mean, I would think the UCG repo is the least wrong place. Asserting stuff like "comparing pointers across provenances gives a well-defined but unspecified value" is the kind of stuff we'd want to do there, and tracking "we need LLVM to actually allow us to do this, since a C/C++ compiler wouldn't care". TBH, if something breaks spectacularly I'm sure you and eddyb will notice quite quickly. =)

RalfJung commented 3 years ago

Yeah that sounds reasonable, so I opened https://github.com/rust-lang/unsafe-code-guidelines/issues/292.

rust-lang / rust

Pointer equality across allocations hits poorly specified parts of LLVM #85968