rust-lang / rust

Empowering everyone to build reliable and efficient software.
https://www.rust-lang.org
Other
97.96k stars 12.68k forks source link

Which functions are "reachable", and therefore subject to monomorphization-time checks, is optimization-dependent #122814

Open RalfJung opened 7 months ago

RalfJung commented 7 months ago

This is a variant of https://github.com/rust-lang/rust/issues/107503, but with a different underlying cause and hence not fixed by https://github.com/rust-lang/rust/pull/122568. @tmiasko provided this magnificent example (comments by me, they may be wrong):

//! This used to fail in optimized builds but pass in unoptimized builds. The reason is that in
//! optimized builds, `f` gets marked as cross-crate-inlineable, so the functions it calls become
//! reachable, and therefore `g` becomes a collection root. But in unoptimized builds, `g` is no
//! root, and the call to `g` disappears in an early `SimplifyCfg` before "mentioned items" are
//! gathered, so we never reach `g`.
#![crate_type = "lib"]

struct Fail<T>(T);
impl<T> Fail<T> {
    const C: () = panic!(); //~ERROR: evaluation of `Fail::<i32>::C` failed
}

pub fn f() {
    loop {}; g()
}

#[inline(never)]
fn g() {
    h::<i32>()
}

// Make sure we only use the faulty const in a generic function, or
// else it gets evaluated by some MIR pass.
fn h<T>() {
    Fail::<T>::C;
}

The symptom here is the opposite of https://github.com/rust-lang/rust/issues/107503: cargo build succeeds but cargo build --release fails. This is because more things become roots in optimized builds and therefore we evaluate more things.

I can think of two ways of fixing this:

tmiasko commented 7 months ago

The second approach is insufficient to resolve this issue.

Consider the example below or anything else that takes advantage of the fact that reachability is a static conservative over-approximation of the actual mono item collection.

#![crate_type = "lib"]
#![feature(inline_const)]

pub fn f() -> fn() {
    loop {}; const { g::<()>() }
}

const fn g<T>() -> fn() {
    match std::mem::size_of::<T>() {
        0 => ga as fn(),
        _ => gb as fn(),
    }
}

#[inline(never)]
fn ga() {
}

#[inline(never)]
fn gb() {
    if false { const { 1 / 0 }; }
}
RalfJung commented 7 months ago

Ah, I see. We'd have to walk the mentioned items of constants we encounter as well, then.

Can a similar example be constructed based on all trait impl fn being considered reachable?

tmiasko commented 7 months ago

All trait impls being reachable seems like a non-issue, because they are only used to seed an initial set of items. At this stage of computation there is no optimization level dependency yet.

RalfJung commented 7 months ago

@tmiasko can you turn your example above into a testcase that actually demonstrates opt-dependent behavior? I tried and failed.

tmiasko commented 7 months ago

Sure, I edited the comment by adding an erroneous constant to gb (it is behind and if false { ... } as otherwise it would be eagerly evaluated by KnownPanicsLint).

RalfJung commented 7 months ago

Thanks!

I played around with this a bit, here's a variant that avoids if false and function pointers:

#![crate_type = "lib"]
#![feature(inline_const)]

// Will be `inline` with optimizations, so then `g::<()>` becomes reachable.
// At the same time `g` is not "mentioned" in `f` since it never appears in its MIR!
// This fundamentally is an issue caused by reachability walking the HIR and collection being based on MIR.
pub fn f() {
    loop {}; const { g::<()>() }
}

// When this comes reachable (for any `T`), so does `hidden_root`.
// And `gb` is monomorphic so it can become a root!
const fn g<T>() {
    match std::mem::size_of::<T>() {
        0 => (),
        _ => hidden_root(),
    }
}

#[inline(never)]
const fn hidden_root() {
    fail::<()>();
}

#[inline(never)]
const fn fail<T>() {
    // Hiding this in a generic fn so that it doesn't get evaluated by
    // MIR passes.
    const { panic!(); }
}
RalfJung commented 7 months ago

I think the second approach can still be salvaged. The key property we need is that everything that reachable_set calculation walks, must be "mentioned". This means we have to add the things inside const blocks and closures to the "mentioned items" of the MIR body of the outer function. I have no idea if that's a good plan though.

EDIT: It's probably a bad plan as it makes it harder to do things like https://github.com/rust-lang/rust/issues/122301, which rely on not doing "mentioned items" things with const.

RalfJung commented 6 months ago

So the status here is that I have a draft of the first approach described in the OP in https://github.com/rust-lang/rust/pull/122862 but perf is not looking great. Not terrible, either, but a 4% regression for some primary benchmarks is unfortunate. The solution is likely to find a way to encode mentioned_items and used_items separately from the rest of MIR in the crate metadata, so that we can load them without having to load the entire MIR. I have zero knowledge about the crate metadata handling and I'm unlikely to have the time to learn about it any time soon -- so if anyone wants to pick this up, please feel free to do so.

RalfJung commented 6 months ago

@oli-obk tried a refactor that would help the first approach in https://github.com/rust-lang/rust/pull/123488, but that PR is also closed now. See here for my comment on what would probably be needed to avoid the perf issues in that PR.