Quuxplusone / LLVMBugzillaTest

0 stars 0 forks source link

[AArch64] machine combiner converts independent fadds into dependent fadds #36762

Open Quuxplusone opened 6 years ago

Quuxplusone commented 6 years ago
Bugzilla Link PR37789
Status NEW
Importance P enhancement
Reported by Sanjay Patel (spatel+llvm@rotateright.com)
Reported on 2018-06-13 08:34:20 -0700
Last modified on 2018-06-13 11:32:52 -0700
Version trunk
Hardware PC All
CC david.green@arm.com, efriedma@quicinc.com, haicheng@codeaurora.org, llvm-bugs@lists.llvm.org, mcrosier@codeaurora.org, t.p.northover@gmail.com
Fixed by commit(s)
Attachments
Blocks
Blocked by
See also
Copied from:
test/CodeGen/AArch64/fadd-combines.ll

; DAGCombiner transforms this into: (x + 59.0) + (x + 17.0).
; The machine combiner transforms this into a chain of 3 dependent adds:
; ((x + 59.0) + 17.0) + x

define float @fadd_const_multiuse_attr(float %x) #0 {
; CHECK-LABEL: fadd_const_multiuse_attr:
; CHECK:       // %bb.0:
; CHECK-NEXT:    adrp x9, .LCPI8_1
; CHECK-NEXT:    adrp x8, .LCPI8_0
; CHECK-NEXT:    ldr s1, [x9, :lo12:.LCPI8_1]
; CHECK-NEXT:    ldr s2, [x8, :lo12:.LCPI8_0]
; CHECK-NEXT:    fadd s1, s0, s1
; CHECK-NEXT:    fadd s1, s2, s1
; CHECK-NEXT:    fadd s0, s0, s1
; CHECK-NEXT:    ret
  %a1 = fadd float %x, 42.0
  %a2 = fadd float %a1, 17.0
  %a3 = fadd float %a1, %a2
  ret float %a3
}

---------------------------------------------------------------------------

That's with no CPU model specified. When I tried setting that to various
existing models, the transform doesn't happen.

I don't see why we would do this transform under any circumstances, but I might
be missing some AArch subtlety.
Quuxplusone commented 6 years ago

cc'ing some more knowledgeable AArch people.

Also - forgot to include the necessary function attributes in the paste. We're allowed to reassociate the FP math because:

attributes #0 = { "unsafe-fp-math"="true" }

Quuxplusone commented 6 years ago

MachineCombiner reassociation is supposed to try to reduce the length of the critical path; not sure what it's doing in your testcase.