llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org

[SROA][Mem2Reg][EarlyCSE] Missing support for store-splat-to-load-scalar forwarding #102821

Open dtcxzyw opened 1 month ago

dtcxzyw commented 1 month ago

Alive2: https://alive2.llvm.org/ce/z/riEaBT

define float @src(i64 %x) {
  %alloc = alloca <4 x float>, align 16
  store <4 x float> zeroinitializer, ptr %alloc, align 16
  %359 = getelementptr inbounds float, ptr %alloc, i64 %x
  %360 = load float, ptr %359, align 4
  ret float %360
}

define float @tgt(i64 %x) {
  ret float 0.0
}

This pattern was extracted from mitsuba3 (mitsuba::Measured<float, drjit::Matrix<mitsuba::Spectrum<float, 4ul>, 4ul> >::sample). Reduced loop reproducer:

; bin/opt -O3 -disable-loop-unrolling test.ll -S
define void @test() {
entry:
  %alloc = alloca <4 x float>, align 16
  store <4 x float> zeroinitializer, ptr %alloc, align 16
  br label %loop

loop:
  %x = phi i64 [ 0, %entry ], [ %x.inc, %loop ]
  %359 = getelementptr inbounds float, ptr %alloc, i64 %x
  %360 = load float, ptr %359, align 4
  call void @use(float %360)
  %x.inc = add i64 %x, 1
  %cmp = icmp eq i64 %x.inc, 4
  br i1 %cmp, label %exit, label %loop

exit:
  ret void
}

declare void @use(float)

ParkHanbum commented 3 weeks ago

If the alloca is a scalar rather than an array, InstCombine does seem to fold the load:

IC: Visiting:   %1 = load float, ptr %alloc, align 4
IC: Replacing   %1 = load float, ptr %alloc, align 4
    with float 0.000000e+00
IC: Mod =   %1 = load float, ptr %alloc, align 4
    New =   %1 = load float, ptr %alloc, align 4
IC: ERASE   %1 = load float, ptr %alloc, align 4
ADD DEFERRED:   %alloc = alloca float, align 16
ADD DEFERRED:   store float 0.000000e+00, ptr %alloc, align 16
ADD:   store float 0.000000e+00, ptr %alloc, align 16
ADD:   %alloc = alloca float, align 16
IC: Visiting:   %alloc = alloca float, align 16
IC: ERASE   store float 0.000000e+00, ptr %alloc, align 16
ADD DEFERRED:   %alloc = alloca float, align 16
IC: ERASE   %alloc = alloca float, align 16
IC: Visiting:   ret float 0.000000e+00
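
For reference, the scalar input behind a log like this would presumably look as follows. This is a reconstruction from the instructions printed in the log, not the exact file that was run, and the function name @scalar is made up. Logs of this shape normally come from opt -passes=instcombine -debug-only=instcombine on an assertions-enabled build.

```llvm
; Reconstructed from the log above; @scalar is a hypothetical name.
define float @scalar() {
  %alloc = alloca float, align 16
  store float 0.000000e+00, ptr %alloc, align 16
  ; Same type, same address, no variable index: the log shows this load
  ; being replaced with float 0.000000e+00.
  %1 = load float, ptr %alloc, align 4
  ret float %1
}
```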

Also, the fold doesn't seem to happen for arrays, regardless of type. https://godbolt.org/z/YWbafYzsd https://alive2.llvm.org/ce/z/R2VmYq
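
A minimal sketch of what the array variant could look like, assuming it differs from the original reproducer only in the allocated type ([4 x float] instead of <4 x float>). The linked examples may differ in detail, and @src_array, %gep and %val are hypothetical names:

```llvm
; Assumed array counterpart of @src from the first comment.
define float @src_array(i64 %x) {
  %alloc = alloca [4 x float], align 16
  ; Every element of the array is set to zero.
  store [4 x float] zeroinitializer, ptr %alloc, align 16
  ; Variable index into the zeroed array.
  %gep = getelementptr inbounds float, ptr %alloc, i64 %x
  %val = load float, ptr %gep, align 4
  ret float %val
}
```

Every in-bounds %x reads a zero element here as well, so the expected result would again be ret float 0.0.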