llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org
Other
27.18k stars 11.13k forks source link

miscompile of non-canonical IR by AArch64 global isel backend #90532

Open regehr opened 2 months ago

regehr commented 2 months ago

here's an odd-looking little fellow:

define i1 @f(i1 %V) {
  %C1 = fcmp false double 0.000000e+00, 0.000000e+00
  %brmerge = select i1 %C1, i1 true, i1 %V
  br i1 %brmerge, label %common.ret, label %SW_C

common.ret:    
  %common.ret.op = phi i1 [ %C, %SW_C ], [ false, %0 ]
  ret i1 %common.ret.op

SW_C:       
  %Y = icmp ult i1 false, false
  %C = icmp ule i1 %Y, false
  br label %common.ret
}

the SDAG backend (correctly) compiles it to this:

_f:            
    mov w8, w0
    mov w0, wzr
    cbnz    wzr, LBB0_3
    tbnz    w8, #0, LBB0_3
    mov w0, #1   
LBB0_3:
    ret

but global isel gives:

_f: 
    movi    d0, #0000000000000000
    fcmp    d0, #0.0
    b.nv    LBB0_3
    tbnz    w0, #0, LBB0_3
    mov w0, #1     
    ret
LBB0_3:
    mov w0, wzr
    ret

which I believe is incorrect. to see this, I'll give Alive's work showing why the function should return 1 when invoked as f(0):

i1 %C1 = #x0 (0)
i1 %brmerge = #x0 (0)
  >> Jump to %SW_C
i1 %Y = #x0 (0)
i1 %C = #x1 (1)
  >> Jump to %common.ret
i1 %common.ret.op = #x1 (1)

but on the other hand, if we use this driver:

#include <stdio.h>

unsigned f(unsigned);

int main(void) {
  printf("%x\n", f(0));
}

then we get:

Johns-MacBook-Pro:reduce regehr$ ~/llvm-project/for-alive/bin/llc foo.ll
Johns-MacBook-Pro:reduce regehr$ clang foo.s driver.c && ./a.out
1
Johns-MacBook-Pro:reduce regehr$ ~/llvm-project/for-alive/bin/llc foo.ll -global-isel
Johns-MacBook-Pro:reduce regehr$ clang foo.s driver.c && ./a.out
0
Johns-MacBook-Pro:reduce regehr$ 

cc @DataCorrupted

llvmbot commented 2 months ago

@llvm/issue-subscribers-backend-aarch64

Author: John Regehr (regehr)

here's an odd-looking little fellow: ```llvm define i1 @f(i1 %V) { %C1 = fcmp false double 0.000000e+00, 0.000000e+00 %brmerge = select i1 %C1, i1 true, i1 %V br i1 %brmerge, label %common.ret, label %SW_C common.ret: %common.ret.op = phi i1 [ %C, %SW_C ], [ false, %0 ] ret i1 %common.ret.op SW_C: %Y = icmp ult i1 false, false %C = icmp ule i1 %Y, false br label %common.ret } ``` the SDAG backend (correctly) compiles it to this: ``` _f: mov w8, w0 mov w0, wzr cbnz wzr, LBB0_3 tbnz w8, #0, LBB0_3 mov w0, #1 LBB0_3: ret ``` but global isel gives: ``` _f: movi d0, #0000000000000000 fcmp d0, #0.0 b.nv LBB0_3 tbnz w0, #0, LBB0_3 mov w0, #1 ret LBB0_3: mov w0, wzr ret ``` which I believe is incorrect. to see this, I'll give Alive's work showing why the function should return 1 when invoked as `f(0)`: ``` i1 %C1 = #x0 (0) i1 %brmerge = #x0 (0) >> Jump to %SW_C i1 %Y = #x0 (0) i1 %C = #x1 (1) >> Jump to %common.ret i1 %common.ret.op = #x1 (1) ``` but on the other hand, if we use this driver: ```c #include <stdio.h> unsigned f(unsigned); int main(void) { printf("%x\n", f(0)); } ``` then we get: ``` Johns-MacBook-Pro:reduce regehr$ ~/llvm-project/for-alive/bin/llc foo.ll Johns-MacBook-Pro:reduce regehr$ clang foo.s driver.c && ./a.out 1 Johns-MacBook-Pro:reduce regehr$ ~/llvm-project/for-alive/bin/llc foo.ll -global-isel Johns-MacBook-Pro:reduce regehr$ clang foo.s driver.c && ./a.out 0 Johns-MacBook-Pro:reduce regehr$ ``` cc @DataCorrupted
tschuett commented 2 months ago

after irtranslator:

body:             |
  bb.1 (%ir-block.0):
    successors: %bb.2(0x40000000), %bb.4(0x40000000)
    liveins: $w0

    %2:_(s32) = COPY $w0
    %1:_(s8) = G_TRUNC %2(s32)
    %3:_(s8) = G_ASSERT_ZEXT %1, 1
    %0:_(s1) = G_TRUNC %3(s8)
    %4:_(s64) = G_FCONSTANT double 0.000000e+00
    %6:_(s1) = G_CONSTANT i1 false
    %8:_(s1) = G_CONSTANT i1 true
    %5:_(s1) = COPY %6(s1)
    %7:_(s1) = G_SELECT %5(s1), %8, %0
    %9:_(s1) = G_FCMP floatpred(false), %4(s64), %4
    G_BRCOND %9(s1), %bb.2
    G_BR %bb.4

  bb.4 (%ir-block.0):
    successors: %bb.2(0x40000000), %bb.3(0x40000000)

    G_BRCOND %0(s1), %bb.2
    G_BR %bb.3

  bb.2.common.ret:
    %12:_(s1) = G_PHI %11(s1), %bb.3, %6(s1), %bb.1, %6(s1), %bb.4
    %13:_(s8) = G_ZEXT %12(s1)
    %14:_(s32) = G_ANYEXT %13(s8)
    $w0 = COPY %14(s32)
    RET_ReallyLR implicit $w0

  bb.3.SW_C:
    successors: %bb.2(0x80000000)

    %10:_(s1) = G_CONSTANT i1 false
    %11:_(s1) = G_CONSTANT i1 true
    G_BR %bb.2

...
tschuett commented 2 months ago

In IR the first conditional branch depends on %brmerge aka select. In MIR, It depends on the fcmp. The select and the fcmp were reordered.

tschuett commented 2 months ago

The condition of the select is now false. The fcmp was constant folded and not probably deleted?

The select is dead.

tschuett commented 2 months ago

The IRTranslator has an optimization for fcmp: https://github.com/llvm/llvm-project/blob/1b942ae3843ca943a249288612e69df0b2fc188b/llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp#L347

It is the source of the select condition being false. But where was the G_FCMP built and why is it the branch condition.

tschuett commented 2 months ago

The function parameter %V is only used in the select instruction in IR.

In MIR, there is a fourth BB with a conditional branch that depends on %V.