llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org
Other
27.92k stars 11.52k forks source link

Reopen of Bug 49389 - Condition not constant folded #90417

Open tesuji opened 4 months ago

tesuji commented 4 months ago

This old issue doesn't seem to be fixed as all: https://bugs.llvm.org/show_bug.cgi?id=49389 Bug 49389 would be reposted below (godbolt link: https://godbolt.org/z/hdza9Pb8z):

#include <stdint.h>
#include <stddef.h>
#include <stdbool.h>

#define N 1234

bool eat_digits(uint8_t s[N]) {
    size_t i = 0;
    while (i < N) {
        uint8_t c = s[i];
        if ('0' <= c && c <= '9') {
            i += 1;
        } else {
            break;
        }
    }
    return i <= N;
}

compiles to:

eat_digits:                             # @eat_digits
        xor     ecx, ecx
.LBB0_1:                                # =>This Inner Loop Header: Depth=1
        mov     rax, rcx
        cmp     rcx, 1234
        je      .LBB0_3
        movzx   edx, byte ptr [rdi + rax]
        add     dl, -48
        lea     rcx, [rax + 1]
        cmp     dl, 10
        jb      .LBB0_1
.LBB0_3:
        cmp     rax, 1235
        setb    al
        ret

GCC is able to compile it to:

eat_digits:
        mov     eax, 1
        ret

This is originally from: https://github.com/rust-lang/rust/issues/81432

v01dXYZ commented 4 months ago

It seems using return i <= N instead of break in the loop body "solves" the missing optimisation.

Nevertheless, I wonder which pass could be responsible of that kind of optimisation IndVarSimplify or ConstraintElimination or other ?

I looked at ConstraintElimination and it fails to analyse the following loop pattern:

[ while.cond ]   -->   [ while.body ]
         |                   |
         |                   v
         +---------------> [ while.end ]

but if we duplicate [while.end] into [ while.end.cond ] and [ while.end.body ] which make now [ while.cond ] to be a dominator of [ while.end.cond ] (and same for [ while.body ]/[ while.end.body ]) we get the following IR before ConstraintElimination:

define dso_local i1 @eat_digits(ptr nocapture noundef readonly %s) local_unnamed_addr #0 {
entry:
  br label %while.cond

while.cond:                                       ; preds = %while.body, %entry
  %i.0 = phi i64 [ 0, %entry ], [ %add, %while.body ]
  %exitcond.not = icmp eq i64 %i.0, 1234
  br i1 %exitcond.not, label %common.ret, label %while.body

while.body:                                       ; preds = %while.cond
  %arrayidx = getelementptr inbounds i8, ptr %s, i64 %i.0
  %0 = load i8, ptr %arrayidx, align 1
  %1 = add i8 %0, -48
  %or.cond = icmp ult i8 %1, 10
  %add = add nuw nsw i64 %i.0, 1
  br i1 %or.cond, label %while.cond, label %while.end.2

common.ret:                                       ; preds = %while.cond, %while.end.2
  %common.ret.op = phi i1 [ %cmp7, %while.end.2 ], [ true, %while.cond ]
  ret i1 %common.ret.op

while.end.2:                                      ; preds = %while.body
  %cmp7 = icmp ult i64 %i.0, 1235
  br label %common.ret
}

ConstraintElimination is able to infer that %cmp7 is true now. I suspect the following code is responsible of missing to propagate the condition information for this kind of loop pattern:

https://github.com/llvm/llvm-project/blob/6f02120ac4463e5e0cda25e2aafc485a4fe634ea/llvm/lib/Transforms/Scalar/ConstraintElimination.cpp#L199-L204

v01dXYZ commented 4 months ago

@fhahn Do you think this bug is related to ConstraintElimination ?

v01dXYZ commented 4 months ago

The faulty case manually fed to opt:

define dso_local i1 @eat_digits(ptr nocapture noundef readonly %s) local_unnamed_addr #0 {
entry:
  br label %while.cond

while.cond:                                       ; preds = %while.body, %entry
  %i.0 = phi i64 [ 0, %entry ], [ %add, %while.body ]
  %exitcond.not = icmp eq i64 %i.0, 1234
  br i1 %exitcond.not, label %while.end, label %while.body

while.body:                                       ; preds = %while.cond
  %arrayidx = getelementptr inbounds i8, ptr %s, i64 %i.0
  %0 = load i8, ptr %arrayidx, align 1
  %1 = add i8 %0, -48
  %or.cond = icmp ult i8 %1, 10
  %add = add nuw nsw i64 %i.0, 1
;  %add = add i64 %i.0, 1
  br i1 %or.cond, label %while.cond, label %while.end

while.end:                                        ; preds = %while.body, %while.cond
  %cmp6 = icmp ult i64 %i.0, 1235
  ret i1 %cmp6
}
$ opt faulty_manual.ll -passes="print<scalar-evolution>"

Printing analysis 'Scalar Evolution Analysis' for function 'eat_digits':
Classifying expressions for: @eat_digits
  %i.0 = phi i64 [ 0, %entry ], [ %add, %while.body ]
  -->  {0,+,1}<nuw><nsw><%while.cond> U: [0,1235) S: [0,1235)           Exits: <<Unknown>>              LoopDispositions: { %while.cond: Computable }
  %arrayidx = getelementptr inbounds i8, ptr %s, i64 %i.0
  -->  {%s,+,1}<nw><%while.cond> U: full-set S: full-set                Exits: <<Unknown>>              LoopDispositions: { %while.cond: Computable }
  %0 = load i8, ptr %arrayidx, align 1
  -->  %0 U: full-set S: full-set               Exits: <<Unknown>>              LoopDispositions: { %while.cond: Variant }
  %1 = add i8 %0, -48
  -->  (-48 + %0) U: full-set S: full-set               Exits: <<Unknown>>              LoopDispositions: { %while.cond: Variant }
  %add = add nuw nsw i64 %i.0, 1
  -->  {1,+,1}<nuw><nsw><%while.cond> U: [1,1236) S: [1,1236)           Exits: <<Unknown>>              LoopDispositions: { %while.cond: Computable }
Determining loop execution counts for: @eat_digits
Loop %while.cond: <multiple exits> Unpredictable backedge-taken count.
  exit count for while.cond: i64 1234
  exit count for while.body: ***COULDNOTCOMPUTE***
Loop %while.cond: constant max backedge-taken count is i64 1234
Loop %while.cond: symbolic max backedge-taken count is i64 1234
  symbolic max exit count for while.cond: i64 1234
  symbolic max exit count for while.body: ***COULDNOTCOMPUTE***

ScalarEvolution get everything we need, but it is not used.

v01dXYZ commented 4 months ago

I was able to solve your problems for the case you presented: a loop with a constant bound. The case of a dynamic bound is not yet supported. GCC seems to optimise only the constant case too.