Open bbd73dcd-d435-4edf-a8e0-b7a59cbc548a opened 7 years ago
This is the IR code, which stores the same i32 value in 16 random locations:
define void @foo(i32 %x, <16 x i32*> %addr) {
  %y = insertelement <16 x i32> undef, i32 %x, i32 0
  %y1 = shufflevector <16 x i32> %y, <16 x i32> undef, <16 x i32> zeroinitializer
  br label %L

L:
  call void @llvm.masked.scatter.v16i32.v16p0i32(<16 x i32> %y1, <16 x i32*> %addr, i32 4, <16 x i1> <i1 true, i1 true, i1 true, i1 true, i1 true, i1 true, i1 true, i1 true, i1 true, i1 true, i1 true, i1 true, i1 true, i1 true, i1 true, i1 true>)
  ret void
}

declare void @llvm.masked.scatter.v16i32.v16p0i32(<16 x i32>, <16 x i32*>, i32, <16 x i1>)
The generated code is:
        vpbroadcastd %edi, %zmm2
        kxnorw %k0, %k0, %k1
        kxnorw %k0, %k0, %k2
        vpscatterqd %ymm2, (,%zmm0) {%k2}
        vextracti64x4 $1, %zmm2, %ymm0    <== This instruction is redundant.
        vpscatterqd %ymm0, (,%zmm1) {%k1}
I checked the equivalent sequence with a plain store and it is fine; the problem occurs only with the scatter.
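For comparison, a store-based variant of the same splat might look like the sketch below. This is only an illustration of the kind of sequence meant: the function name @bar and the pointer argument %p are hypothetical, and the exact IR used for the check is not given in the report.

```llvm
; Hypothetical store counterpart of @foo (names @bar and %p are assumptions).
; The same i32 is splatted across all 16 lanes and written with one vector store.
define void @bar(i32 %x, <16 x i32>* %p) {
  %y = insertelement <16 x i32> undef, i32 %x, i32 0
  %y1 = shufflevector <16 x i32> %y, <16 x i32> undef, <16 x i32> zeroinitializer
  store <16 x i32> %y1, <16 x i32>* %p, align 4
  ret void
}
```

Because every lane of the splat holds the same value, both 256-bit halves of the broadcast register are identical, which is why extracting the upper half before the second scatter is unnecessary.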
Current Codegen: https://simd.godbolt.org/z/YxjKG5
Same as the initial report