llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org

[AVX-512] Redundant instruction in Scatter sequence #33890

Open bbd73dcd-d435-4edf-a8e0-b7a59cbc548a opened 7 years ago

bbd73dcd-d435-4edf-a8e0-b7a59cbc548a commented 7 years ago
Bugzilla Link 34542
Version trunk
OS All
CC @chriselrod,@RKSimon,@ZviRackover

Extended Description

The following IR scatters the same i32 value to 16 arbitrary addresses:

define void @foo(i32 %x, <16 x i32*> %addr) {
  %y = insertelement <16 x i32> undef, i32 %x, i32 0
  %y1 = shufflevector <16 x i32> %y, <16 x i32> undef, <16 x i32> zeroinitializer
  br label %L
L:
  call void @llvm.masked.scatter.v16i32.v16p0i32(<16 x i32> %y1, <16 x i32*> %addr, i32 4, <16 x i1> <i1 true, i1 true, i1 true, i1 true, i1 true, i1 true, i1 true, i1 true, i1 true, i1 true, i1 true, i1 true, i1 true, i1 true, i1 true, i1 true>)
  ret void
}
declare void @llvm.masked.scatter.v16i32.v16p0i32(<16 x i32>, <16 x i32*>, i32, <16 x i1>)

Generated code:

  vpbroadcastd    %edi, %zmm2
  kxnorw  %k0, %k0, %k1
  kxnorw  %k0, %k0, %k2
  vpscatterqd     %ymm2, (,%zmm0) {%k2}
  vextracti64x4   $1, %zmm2, %ymm0      <== This instruction is redundant.
  vpscatterqd     %ymm0, (,%zmm1) {%k1}

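%zmm2 holds a broadcast, so its upper and lower 256-bit halves are identical and the second scatter could use %ymm2 directly instead of extracting the upper half. I checked the equivalent sequence with a plain store and it is fine; the problem occurs only with the scatter.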
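To illustrate why the extract is unnecessary, here is a minimal sketch in plain Python. It is a model only, not LLVM code: vector lanes are list elements, memory is a dict, and the helper names (`broadcast`, `lower_half`, `upper_half`, `scatter`) are hypothetical. It assumes what the report describes, namely that the scattered value is a splat of a single scalar.

```python
def broadcast(x, lanes=16):
    """Model vpbroadcastd: copy x into every lane."""
    return [x] * lanes

def lower_half(v):
    """Model reading %ymm2, the low 256 bits of %zmm2."""
    return v[: len(v) // 2]

def upper_half(v):
    """Model vextracti64x4 $1: take the high 256 bits."""
    return v[len(v) // 2 :]

def scatter(memory, values, addrs):
    """Model an all-true masked scatter: store each lane to its address."""
    for value, addr in zip(values, addrs):
        memory[addr] = value

def scatter_with_extract(x, addrs):
    """The sequence from the report: the second scatter uses the extracted upper half."""
    v = broadcast(x)
    memory = {}
    scatter(memory, lower_half(v), addrs[:8])
    scatter(memory, upper_half(v), addrs[8:])
    return memory

def scatter_without_extract(x, addrs):
    """The suggested sequence: reuse the lower half for both scatters."""
    v = broadcast(x)
    memory = {}
    low = lower_half(v)
    scatter(memory, low, addrs[:8])
    scatter(memory, low, addrs[8:])
    return memory

# 16 distinct, arbitrary 4-byte-aligned addresses (illustrative values only).
addrs = [0x1000 + 28 * i for i in range(16)]

# Both halves of a broadcast are equal, so both sequences write the same memory.
assert upper_half(broadcast(42)) == lower_half(broadcast(42))
assert scatter_with_extract(42, addrs) == scatter_without_extract(42, addrs)
```

The equivalence holds only because every lane carries the same value; for a general <16 x i32> operand the extract is required.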

RKSimon commented 3 years ago

Current Codegen: https://simd.godbolt.org/z/YxjKG5

Unchanged from the initial report.