Missing optimization for interger `modulo` operation in `loop` edge case

rust-lang / rust

Empowering everyone to build reliable and efficient software.

https://www.rust-lang.org

Other

96.75k stars 12.5k forks source link

Missing optimization for interger `modulo` operation in `loop` edge case #128921

Open CrazyboyQCD opened 1 month ago

CrazyboyQCD commented 1 month ago

rust and c can't optimize code like this when modulo is larger than 100 and is not power of 2 but cpp can:

fn modulo() {
    let mut j = 0;
    loop {
        j = j + 1;
        if j % modulo == 0 {
            return;
        }
    }
}

rust: godbolt link c: godbolt link cpp: godbolt link

veera-sivarajan commented 1 month ago

@rustbot label +A-codegen +C-optimization +T-compiler +I-slow -needs-triage

ryanavella commented 1 month ago

In C and C++, integer overflow is considered UB, so rather than j = j + 1; you might also consider j = j.unchecked_add(1);. Funny enough, this makes the Rust codegen worse.

Also note that if you switch from Clang to GCC, the optimizer manages to eliminate the loop in C. So this is (probably?) unrelated to the fact that C++ considers infinite loops without side effects to be UB.

CrazyboyQCD commented 1 month ago

unrelated to the fact that C++ considers infinite loops without side effects to be UB.

May be this is the reason, and rust doesn't think so, though it can be optimized if put it in a const block to force a constant evalutation or put j = j + 1 after the modulo like while and for.

I did some extra test and found it can be optimized away on certain condition and I don't known why it works.

nikic commented 1 month ago

The optimization difference in C++ is indeed due to the forward progress guarantee. SCEV just doesn't know how to compute the exit count for such a loop, and thus can't prove it finite, see https://llvm.godbolt.org/z/vMMT9b7GT.

It should be easy to support this. What is the real-world context where you encountered this issue?

ryanavella commented 4 weeks ago

The optimization difference in C++ is indeed due to the forward progress guarantee.

What then is the explanation for the optimization difference in C with GCC? (godbolt link)

From what I understand, C89 and C99 have no forward progress guarantee, and C11/C17/C23 only makes some infinite loops without side effects UB (e.g. while (1) can not be assumed to terminate)

CrazyboyQCD commented 4 weeks ago

The optimization difference in C++ is indeed due to the forward progress guarantee. SCEV just doesn't know how to compute the exit count for such a loop, and thus can't prove it finite, see https://llvm.godbolt.org/z/vMMT9b7GT.

@nikic, thanks for explaining.

What is the real-world context where you encountered this issue?

I accidentally found out when I was coding with primality test.