rust-lang / rust

Empowering everyone to build reliable and efficient software.
https://www.rust-lang.org
Other
99.05k stars 12.8k forks source link

Missed optimization: RVO isn't applied #116541

Open gootorov opened 1 year ago

gootorov commented 1 year ago

Consider the code below. At the first glace, it looks like RVO/destination propagation isn't applied, resulting in a stack overflow.

File structure

tree .

.
├── bin
│   └── rvo.rs
├── Cargo.toml
├── src
│   └── lib.rs

Code

Cargo.toml

[package]
name = "rvo"
version = "0.1.0"
edition = "2021"

[[bin]]
name = "rvo"
path = "bin/rvo.rs"

[dependencies]

[profile.release]
lto = "fat"

bin/rvo.rs

use rvo::Outer;

fn main() {
    let _foo = Box::new(Outer::new());
}

src/lib.rs

#![allow(unused)]

const SIZE: usize = 1024 * 1024 * 1024;

pub struct Outer {
    a: LargeStruct,
}

impl Outer {
    #[inline(always)]
    pub fn new() -> Self {
        let a = LargeStruct::new();

        Self { a }
    }
}

pub struct LargeStruct {
    d0: Option<Box<u128>>,
    d1: Option<Box<u128>>,
    d2: Option<Box<u128>>,
    d3: Option<Box<u128>>,
    d4: Option<Box<u128>>,
    huge: [u8; SIZE],
}

impl LargeStruct {
    #[inline(always)]
    pub fn new() -> Self {
        Self {
            d0: None,
            d1: None,
            d2: None,
            d3: None,
            d4: None,
            huge: [0; SIZE],
        }
    }
}

Running cargo run --release results in:

    Finished release [optimized] target(s) in 0.01s
     Running `target/release/rvo`

thread 'main' has overflowed its stack
fatal runtime error: stack overflow
Aborted (core dumped)

Compiler version: rustc --version --verbose

rustc 1.73.0 (cc66ad468 2023-10-03)
binary: rustc
commit-hash: cc66ad468955717ab92600c770da8c1601a4ff33
commit-date: 2023-10-03
host: x86_64-unknown-linux-gnu
release: 1.73.0
LLVM version: 17.0.2
gootorov commented 1 year ago

Interestingly, the stack no longer overflows when just a few fields are removed:

diff --git a/src/lib.rs b/src/lib.rs
index d84519e..3d43db8 100644
--- a/src/lib.rs
+++ b/src/lib.rs
@@ -18,9 +18,6 @@ impl Outer {
 pub struct LargeStruct {
     d0: Option<Box<u128>>,
     d1: Option<Box<u128>>,
-    d2: Option<Box<u128>>,
-    d3: Option<Box<u128>>,
-    d4: Option<Box<u128>>,
     huge: [u8; SIZE],
 }

@@ -30,9 +27,6 @@ impl LargeStruct {
         Self {
             d0: None,
             d1: None,
-            d2: None,
-            d3: None,
-            d4: None,
             huge: [0; SIZE],
         }
     }
saethlin commented 1 year ago

At the first glace, it looks like RVO/destination propagation isn't applied

Both these MIR optimizations are off by default, and the NRVO MIR opt is unsound.

gootorov commented 1 year ago

Both these MIR optimizations are off by default, and the NRVO MIR opt is unsound.

Interesting, thank you. I was always under the impression that optimizations of the Box::new(returns_large_object()) type are RVO. Or is it (currently) just LLVM's RVO and not MIR?

Also, is there a way to turn on these MIR optimizations via a compiler flag/crate attribute?

workingjubilee commented 1 year ago

Yes, currently LLVM performs the relevant RVO (or doesn't). You can use -Zmir-enable-passes=+DestinationPropagation,+RenameReturnPlace.

Alternately, I believe most unsound opts are enabled at -Zmir-opt-level=4, but that can cause other hilarious bugs.

saethlin commented 1 year ago

Unsound optimizations must be enabled by their own flag, -Zunsound-mir-opts. Unfortunately perhaps, -Zmir-opt-level=4 is used in the ecosystem.

It's possible your request here will be fulfilled by https://github.com/rust-lang/rust/pull/116531.

gootorov commented 1 year ago

Unfortunately, if I did everything correctly, #116531 doesn't help in this case :(

I've checked-out #116531 and built stage2. Then ran the binary again:

RUSTFLAGS="-Zmir-opt-level=4 --emit=mir" cargo +stage2 run --release

Still overflows.

This is the emitted MIR:

mir ``` // WARNING: This output format is intended for human consumers only // and is subject to change without notice. Knock yourself out. const SIZE: usize = { let mut _0: usize; let mut _1: usize; let mut _2: (usize, bool); let mut _3: (usize, bool); bb0: { StorageLive(_1); _2 = CheckedMul(const 1024_usize, const 1024_usize); assert(!move (_2.1: bool), "attempt to compute `{} * {}`, which would overflow", const 1024_usize, const 1024_usize) -> [success: bb1, unwind continue]; } bb1: { _1 = move (_2.0: usize); _3 = CheckedMul(_1, const 1024_usize); assert(!move (_3.1: bool), "attempt to compute `{} * {}`, which would overflow", move _1, const 1024_usize) -> [success: bb2, unwind continue]; } bb2: { _0 = move (_3.0: usize); StorageDead(_1); return; } } fn ::new() -> Outer { let mut _0: Outer; let _1: LargeStruct; scope 1 { debug a => _1; } scope 2 (inlined LargeStruct::new) { let mut _2: [u8; 1073741824]; } bb0: { StorageLive(_2); _2 = [const 0_u8; 1073741824]; _1 = LargeStruct { d0: const Option::>::None, d1: const Option::>::None, d2: const Option::>::None, d3: const Option::>::None, d4: const Option::>::None, huge: move _2 }; StorageDead(_2); _0 = Outer { a: move _1 }; return; } } LargeStruct::huge::{constant#0}: usize = { let mut _0: usize; bb0: { _0 = const _; return; } } fn ::new() -> LargeStruct { let mut _0: LargeStruct; let mut _1: [u8; 1073741824]; bb0: { StorageLive(_1); _1 = [const 0_u8; 1073741824]; _0 = LargeStruct { d0: const Option::>::None, d1: const Option::>::None, d2: const Option::>::None, d3: const Option::>::None, d4: const Option::>::None, huge: move _1 }; StorageDead(_1); return; } } ::new::{constant#0}: usize = { let mut _0: usize; bb0: { _0 = const _; return; } } ``` ``` // WARNING: This output format is intended for human consumers only // and is subject to change without notice. Knock yourself out. fn main() -> () { let mut _0: (); let _1: std::boxed::Box; let mut _2: rvo::Outer; scope 1 { debug _foo => _1; } scope 2 (inlined Outer::new) { let _3: rvo::LargeStruct; scope 3 { debug a => _3; } scope 4 (inlined LargeStruct::new) { let mut _4: [u8; 1073741824]; } } bb0: { StorageLive(_1); StorageLive(_2); StorageLive(_3); StorageLive(_4); _4 = [const 0_u8; 1073741824]; _3 = LargeStruct { d0: const Option::>::None, d1: const Option::>::None, d2: const Option::>::None, d3: const Option::>::None, d4: const Option::>::None, huge: move _4 }; StorageDead(_4); _2 = Outer { a: move _3 }; StorageDead(_3); _1 = Box::::new(move _2) -> [return: bb1, unwind continue]; } bb1: { StorageDead(_2); drop(_1) -> [return: bb2, unwind continue]; } bb2: { StorageDead(_1); return; } } ```

Also, just in case, I tried perhaps less aggressive

RUSTFLAGS="-Zmir-enable-passes=+DestinationPropagation,+RenameReturnPlace --emit=mir" cargo +stage2 run --release

and there were no difference in the emitted MIR

workingjubilee commented 1 year ago

Did you use -Zunsound-mir-opts as well?

gootorov commented 1 year ago

I thought that wasn't necessary judging by the diff in #116531, but to be sure, I've just tried these two variants:

RUSTFLAGS="-Zunsound-mir-opts -Zmir-enable-passes=+DestinationPropagation,+RenameReturnPlace --emit=mir" cargo +stage2 run --release
RUSTFLAGS="-Zunsound-mir-opts -Zmir-opt-level=4 --emit=mir" cargo +stage2 run --release

Unfortunately, same outcome, and the emitted MIR is also the same as before.

gootorov commented 1 year ago

Actually, I think I'm doing something wrong. Just tried comparing this MIR to stable MIR (RUSTFLAGS="--emit=mir" cargo run --release) and there also isn't any difference

saethlin commented 1 year ago

--release is about equivalent to -Zmir-opt-level=2. You're doing things fine, the MIR optimization as-written simply doesn't help for this specific case.