llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org
Other
29.27k stars 12.1k forks source link

Missed optimization for a loop with dead code #108906

Open dcci opened 2 months ago

dcci commented 2 months ago

Godbolt link:

https://godbolt.org/z/Kaq3sbsr9

Inline code:

long patatino() {
    long x = 0;
    for (int i = 0; i < 5; ++i) {
        while (x < 10) {
            if (x % 2 == 0) {
                x += 2;
            } else {
                x += 1;
            }
            // Dead while loop
            while ((x > 20) && (i % 3 == 0) && (x % 5 == 0)) {
                x -= 5;
            }
            // Dead while loop
            while ((x < -5) && (i % 2 == 0) && (x % 3 == 0)) {
                x += 3;
            }
        }
    }
    return x;
}

If you remove the two 'dead while loops' then LLVM is able to optimize this down to a constant. For comparison, GCC trunk seems to do better, but also doesn't optimize this fully

cc: @fhahn @nikic

antoniofrighetto commented 2 months ago

Dropping the third while (x < -5) and keeping only (x > 20) as condition of the second while, partially reduced to:

define noundef range(i64 0, -9223372036854775808) i64 @patatino() {
entry:
  br label %for.cond

for.cond:                                         ; preds = %for.inc, %entry
  %x.0 = phi i64 [ 0, %entry ], [ %x.1, %for.inc ]
  %i.0 = phi i32 [ 0, %entry ], [ %inc, %for.inc ]
  %cmp = icmp ult i32 %i.0, 5
  br i1 %cmp, label %while.cond, label %for.cond.cleanup

for.cond.cleanup:                                 ; preds = %for.cond
  ret i64 %x.0

while.cond:                                       ; preds = %while.cond4, %for.cond
  %x.1 = phi i64 [ %x.0, %for.cond ], [ %x.3, %while.cond4 ]
  %cmp1 = icmp slt i64 %x.1, 10
  br i1 %cmp1, label %while.body, label %for.inc

while.body:                                       ; preds = %while.cond
  %0 = and i64 %x.1, 1
  %cmp2 = icmp eq i64 %0, 0
  %add = add nsw i64 %x.1, 2
  %add3 = add nsw i64 %x.1, 1
  %x.2 = select i1 %cmp2, i64 %add, i64 %add3
  br label %while.cond4

while.cond4:                                      ; preds = %while.body6, %while.body
  %x.3 = phi i64 [ %x.2, %while.body ], [ %sub, %while.body6 ]
  %cmp5 = icmp sgt i64 %x.3, 20
  br i1 %cmp5, label %while.body6, label %while.cond

while.body6:                                      ; preds = %while.cond4
  %sub = add nsw i64 %x.3, -5
  br label %while.cond4

for.inc:                                          ; preds = %while.cond
  %inc = add nuw nsw i32 %i.0, 1
  br label %for.cond
}

When examining the PN %x.3 = phi i64 [ %x.2, %while.body ], [ %sub, %while.body6 ] in getPredicateAt, the constant range returned by LVI for the first incoming value %x.2 is [INT_MIN, 12) which is always less than [20,21). However, for %sub the constant range is [16, UINT_MAX) which is not less than [20,21). Doesn't this second range look suboptimal?