How do aliasing model protectors interact with tail calls?

rust-lang / unsafe-code-guidelines

Forum for discussion about what unsafe code can and can't do

https://rust-lang.github.io/unsafe-code-guidelines

Apache License 2.0

650 stars 57 forks source link

How do aliasing model protectors interact with tail calls? #523

Open RalfJung opened 1 month ago

RalfJung commented 1 month ago

In code like this

fn test1(x: &mut i32) {
  become test2(x as *mut i32);
}

fn test2(x: *mut i32) { ... }

x is protected inside test1. Is it also still protected while test2 runs?

If we want become to behave mostly like return, the answer should be "yes". However, test1's stack frame is guaranteed to not exist any more when test2 runs, so maybe it would make sense to say that the protectors associated with that stack frame have also expired?

The answer may be decided for us by LLVM, depending on what they say about the scope of noalias and dereferenceable in a function with a tail call: is the tail call inside or outside the scope of noalias and dereferenceable of a function argument?

comex commented 1 month ago

Assuming that become will turn into LLVM musttail, my intuition is that it wouldn't have much effect on optimizations, so the noalias would still be in scope for the tail-callee.

...Well, grepping the LLVM source for musttail, there are more references than I thought, mostly in order to avoid breaking the musttail invariants in various passes…

But it doesn't seem to affect aliasing.

Here is an example program demonstrating the need for protectors. Since rustc doesn’t seem to be able to generate LLVM IR for become, here is the same program transliterated to C, using [[clang::musttail]] in place of become, and we can see that LLVM does perform the optimization that’s enabled by noalias.

RalfJung commented 1 month ago

@comex so in that example, test2 gets inlined and then the noalias on test1 kicks in and moves the load down across the store?

Something is odd about this though, the implicit reborrow of x when it gets passed to test1 should constitute a write that invalidates y. Unfortunately I can't figure out how to even generate MIR for this code, it always ICEs in codegen which is odd since --emit=mir shouldn't codegen anything, I thought? @WaffleLapkin how do I dump the MIR when become is involved...? EDIT: --emit=metadata -Zdump-mir seems to work.

RalfJung commented 1 month ago

Okay the other UB doesn't happen because of a surprising two-phase borrows interaction.

So getting back to @comex's example: test2::x is considered derived from y here, so using first x and then y is fine, but swapping them is not. This indeed needs protectors. Nice example, thanks!

comex commented 1 month ago

Heh. I had no idea that two-phase borrows were involved; I was just fiddling with the code until the version with become commented out produced the right Miri error.

RalfJung commented 1 month ago

FWIW here is a variant that does not rely on two-phase borrows.

#![allow(incomplete_features)]
#![feature(explicit_tail_calls)]
use std::cell::UnsafeCell;

fn test2(x: *mut i32, y: *mut i32, loop_count: i32) -> i32 {
    unsafe {
        let mut val = 0;
        for _ in 0..=loop_count {
            // This load gets moved down:
            val = *x;
            // Conflicting store:
            *y = 200;
        }
        val
    }
}

#[inline(never)]
#[no_mangle]
fn test1(x: &mut i32, y: *mut i32, loop_count: i32) -> i32 {
  // Optionally comment out `become` here.
  become test2(x as *mut i32, y, loop_count)
}

fn main() {
    let x = &mut 100;
    let y = x as *mut _;
    let ret = test1(unsafe { &mut *y }, y, 0);
    assert_eq!(ret, 100);
}