llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org

[AArch64][GlobalISel] Improve i128 mul generation #115512

Open davemgreen opened 1 week ago

davemgreen commented 1 week ago

i128 multiplies under GlobalISel could produce more madds if the final add were reassociated, so that each add has a multiply operand it can fold. https://godbolt.org/z/Wr1r5ez1G

SDAG

        umulh   x8, x0, x2
        madd    x8, x0, x3, x8
        mul     x0, x0, x2
        madd    x1, x1, x2, x8
        ret

GISel

        mul     x9, x0, x3
        mul     x8, x0, x2
        umulh   x10, x0, x2
        madd    x9, x1, x2, x9
        mov     x0, x8
        add     x1, x9, x10
        ret
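For reference, a minimal Python sketch that models the two sequences above on 64-bit registers and checks that both compute the same 128-bit product. It assumes the usual AArch64 convention visible in the listings (x0/x1 hold the low/high halves of the first operand, x2/x3 those of the second); the helper names are hypothetical and only mirror the instruction mnemonics.

```python
MASK64 = (1 << 64) - 1

def mul(a, b):
    # Low 64 bits of the product, like `mul`.
    return (a * b) & MASK64

def umulh(a, b):
    # High 64 bits of the unsigned 64x64 product, like `umulh`.
    return ((a * b) >> 64) & MASK64

def madd(n, m, acc):
    # acc + n * m, truncated to 64 bits, like `madd xd, xn, xm, xa`.
    return (n * m + acc) & MASK64

def mul128_sdag(x0, x1, x2, x3):
    # SDAG sequence: 4 arithmetic instructions, two of them madds.
    x8 = umulh(x0, x2)
    x8 = madd(x0, x3, x8)
    lo = mul(x0, x2)
    hi = madd(x1, x2, x8)
    return lo, hi

def mul128_gisel(x0, x1, x2, x3):
    # GISel sequence: 5 arithmetic instructions, only one madd.
    x9 = mul(x0, x3)
    x8 = mul(x0, x2)
    x10 = umulh(x0, x2)
    x9 = madd(x1, x2, x9)
    hi = (x9 + x10) & MASK64  # the trailing plain `add`
    return x8, hi

def reference(a, b):
    # Truncated 128-bit product, split into (low, high) 64-bit halves.
    p = (a * b) & ((1 << 128) - 1)
    return p & MASK64, p >> 64

def split(v):
    # Split a 128-bit value into (low, high) 64-bit halves.
    return v & MASK64, v >> 64
```

Both functions return the same (low, high) pair for any inputs, since 64-bit wrapping addition is associative; the only difference is how the three high-half terms are grouped, and therefore how many of the adds can absorb a multiply.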
llvmbot commented 1 week ago

@llvm/issue-subscribers-backend-aarch64

Author: David Green (davemgreen)

arsenm commented 1 week ago

I think the multiply splitting needs work. I started #97194 but don't have time to get back to it.

tschuett commented 1 week ago

; NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py UTC_ARGS: --version 5
; RUN: llc -mtriple=aarch64-apple-ios -global-isel -stop-before=instruction-select %s -o - | FileCheck %s --check-prefix=PRESELECTION

define i128 @ti128(i128 %a, i128 %b) {
  ; PRESELECTION-LABEL: name: ti128
  ; PRESELECTION: bb.1 (%ir-block.0):
  ; PRESELECTION-NEXT:   liveins: $x0, $x1, $x2, $x3
  ; PRESELECTION-NEXT: {{  $}}
  ; PRESELECTION-NEXT:   [[COPY:%[0-9]+]]:gpr(s64) = COPY $x0
  ; PRESELECTION-NEXT:   [[COPY1:%[0-9]+]]:gpr(s64) = COPY $x1
  ; PRESELECTION-NEXT:   [[COPY2:%[0-9]+]]:gpr(s64) = COPY $x2
  ; PRESELECTION-NEXT:   [[COPY3:%[0-9]+]]:gpr(s64) = COPY $x3
  ; PRESELECTION-NEXT:   [[MUL:%[0-9]+]]:gpr(s64) = G_MUL [[COPY]], [[COPY2]]
  ; PRESELECTION-NEXT:   [[MUL1:%[0-9]+]]:gpr(s64) = G_MUL [[COPY1]], [[COPY2]]
  ; PRESELECTION-NEXT:   [[MUL2:%[0-9]+]]:gpr(s64) = G_MUL [[COPY]], [[COPY3]]
  ; PRESELECTION-NEXT:   [[UMULH:%[0-9]+]]:gpr(s64) = G_UMULH [[COPY]], [[COPY2]]
  ; PRESELECTION-NEXT:   [[ADD:%[0-9]+]]:gpr(s64) = G_ADD [[MUL1]], [[MUL2]]
  ; PRESELECTION-NEXT:   [[ADD1:%[0-9]+]]:gpr(s64) = G_ADD [[ADD]], [[UMULH]]
  ; PRESELECTION-NEXT:   $x0 = COPY [[MUL]](s64)
  ; PRESELECTION-NEXT:   $x1 = COPY [[ADD1]](s64)
  ; PRESELECTION-NEXT:   RET_ReallyLR implicit $x0, implicit $x1
    %c = mul i128 %a, %b
    ret i128 %c
}
;; NOTE: These prefixes are unused and the list is autogenerated. Do not add tests below this line:
; PRESELECTION: {{.*}}
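
In the pre-selection MIR above, the high half is built as `(MUL1 + MUL2) + UMULH`, so only the inner G_ADD has a multiply it can fold. The toy Python model below illustrates the reassociation the issue asks for. It is a sketch over tuple-based expression trees, not LLVM's combiner or any existing LLVM API; `reassociate`, `count_madds`, and `gisel_tree` are hypothetical names, and the model assumes every multiply has a single use.

```python
def reassociate(expr):
    """Rewrite (mul_a + mul_b) + other into mul_a + (mul_b + other)."""
    if expr[0] != "add":
        return expr
    _, lhs, rhs = expr
    if (lhs[0] == "add" and lhs[1][0] == "mul" and lhs[2][0] == "mul"
            and rhs[0] != "mul"):
        return ("add", lhs[1], ("add", lhs[2], rhs))
    return expr

def count_madds(expr):
    """Count adds with a direct mul operand (each add can absorb one mul)."""
    if expr[0] != "add":
        return 0
    _, lhs, rhs = expr
    here = 1 if "mul" in (lhs[0], rhs[0]) else 0
    return here + count_madds(lhs) + count_madds(rhs)

# High half as GlobalISel currently builds it:
# ADD1 = (MUL1 + MUL2) + UMULH, with a = (a0, a1) and b = (b0, b1).
gisel_tree = (
    "add",
    ("add", ("mul", "a1", "b0"), ("mul", "a0", "b1")),
    ("umulh", "a0", "b0"),
)

reassociated_tree = reassociate(gisel_tree)
```

Under this model the original tree has one add that can become a madd, while the reassociated tree `a1*b0 + (a0*b1 + umulh(a0, b0))` has two, which matches the shape of the SDAG output.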