matter-labs / era-compiler-llvm

ZKsync fork of the LLVM framework
Other
26 stars 14 forks source link

Redundant ADD after folding overflow #590

Open TerryGuo opened 2 months ago

TerryGuo commented 2 months ago

In order to fix problem in #521 , we will end up with two Nodes: one is to do the arithmetic operations (add/sub/mul) and another is to set the flags. If the result of arithmetic operation isn't used, then the arithmetic node will be optimized out, this is expected. If there is contradiction between glue and data dependency, then we do need those two nodes, this is expected as well. However when the result of arithmetic operation is needed and there is NO contradiction, we still get two nodes, the second one becomes redundant now. An example is as below, for input IR:

define i256 @add_test_neg_1(i256 %a, i256 %b, i256 %x, i256 %y) {
entry:
  %res1 = call {i256, i1} @llvm.uadd.with.overflow.i256(i256 %x, i256 %y)
  %res2 = extractvalue {i256, i1} %res1, 0
  %overflow = extractvalue {i256, i1} %res1, 1
  %selected = select i1 %overflow, i256 %a, i256 %b
  %sum = add i256 %res2, %selected
  ret i256 %sum
}

The generated asm code with #521 will be:

     1          .text
     2          .file   "overflow-neg.ll"
     3          .globl  add_test_neg_1                  ; -- Begin function add_test_neg_1
     4  add_test_neg_1:                         ; @add_test_neg_1
     5  ; %bb.0:                                ; %entry
     6          add!    r3, r4, r5
     7          add.lt  r1, r0, r2
     8          add     r3, r4, r1
     9          add     r1, r2, r1
    10          ret

The add in line 8 is redundant and can reuse r5 from line 6.

TerryGuo commented 2 months ago

Same problem happens to #581 as well, for input IR like below:

define void @add_branch_neg_1(i256 %x, i256 %y) {
entry:
  %res1 = call {i256, i1} @llvm.uadd.with.overflow.i256(i256 %x, i256 %y)
  %sum = extractvalue {i256, i1} %res1, 0
  %overflow = extractvalue {i256, i1} %res1, 1
  br i1 %overflow, label %overflow_detected, label %no_overflow_detected

overflow_detected:
  call void @has_overflow(i256 %sum)
  br label %exit

no_overflow_detected:
  call void @has_no_overflow(i256 %sum)
  br label %exit

exit:
  ret void
}

We will get below asm:

    13  add_branch_neg_1:                       ; @add_branch_neg_1
    14  ; %bb.0:                                ; %entry
    15          add     r1, r0, r3
    16          add     r3, r2, r1
    17          add!    r3, r2, r2
    18          jump.lt @.BB1_2
    19  ; %bb.1:                                ; %no_overflow_detected
    20          near_call       r0, @has_no_overflow, @DEFAULT_UNWIND
    21          ret
    22  .BB1_2:                                 ; %overflow_detected
    23          near_call       r0, @has_overflow, @DEFAULT_UNWIND
    24          ret

The line 15, line 16 and line 17 can be optimized to a single ADD operation such as add! r1, r2, r1.

akiramenai commented 2 months ago

The aforementioned test cases are now in llvm/test/CodeGen/EraVM/overflow-neg.ll