"Cannot select" for bitcasts of AVX data types


Bugzilla Link	PR10073
Status	RESOLVED DUPLICATE of bug 2314
Importance	P normal
Reported by	Ralf Karrenberg (karrenberg@cs.uni-saarland.de)
Reported on	2011-06-03 04:23:16 -0700
Last modified on	2011-06-03 13:21:19 -0700
Version	trunk
Hardware	PC All
CC	llvm-bugs@lists.llvm.org, nadav.rotem@me.com
Fixed by commit(s)
Attachments
Blocks
Blocked by
See also

The AVX backend fails on mask code such as that produced by VCMPPS when it is
combined with mask operations and the corresponding bitcasts.

Masks that are represented as <8 x i32> should be modifiable by xor/and/or,
which should get lowered to VXORPS/VANDPS/VORPS.
It could also make sense to allow these to operate on <8 x float>, matching the
C intrinsics of immintrin.h (_mm256_cmp_ps produces __m256 instead of
__m256i, _mm256_xor_ps takes __m256 operands as well) and LLVM's own
intrinsics (llvm.x86.avx.cmp.ps.256 produces <8 x float>,
llvm.x86.avx.blendv.ps.256 takes an <8 x float> operand as condition).
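
To make the mask semantics concrete, here is a minimal scalar sketch in
portable C of what the compare and the mask operation compute per lane. It is
only a model, not how LLVM or the hardware implements it: the names LANES,
cmp_lt_mask and mask_and are hypothetical, and the compare is assumed to be
the "less than" predicate (i8 1) used in the examples in this report.

```c
#include <stdint.h>

enum { LANES = 8 };  /* a 256-bit AVX register holds 8 lanes of 32 bits */

/* Scalar model of VCMPPS with predicate 1 (less-than): a lane becomes
   all ones if a[i] < b[i], otherwise all zeros. Comparisons involving
   NaN are false, so those lanes are zero as well. */
static void cmp_lt_mask(const float a[LANES], const float b[LANES],
                        uint32_t mask[LANES])
{
    for (int i = 0; i < LANES; ++i)
        mask[i] = (a[i] < b[i]) ? UINT32_C(0xFFFFFFFF) : UINT32_C(0);
}

/* Scalar model of a lane-wise AND of two masks. The result depends only
   on the bit patterns, not on whether the lanes are typed as i32 or
   float, which is why VANDPS can serve both views. */
static void mask_and(const uint32_t x[LANES], const uint32_t y[LANES],
                     uint32_t out[LANES])
{
    for (int i = 0; i < LANES; ++i)
        out[i] = x[i] & y[i];
}
```

Calling cmp_lt_mask and then mask_and on the result mirrors the
compare-then-and sequence the report expects to be lowered to VCMPLTPS +
VANDPS.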

Currently, code generation for most of these operations fails with "Cannot
select" messages for a cast operation, which could mean that LLVM is only
confused about the required types, not about the bit operations.

Consider these examples:

define <8 x float> @test1(<8 x float> %a, <8 x float> %b, <8 x i32> %m)
nounwind readnone {
entry:
   %cmp = tail call <8 x float> @llvm.x86.avx.cmp.ps.256(<8 x float> %a,
<8 x float> %b, i8 1) nounwind readnone
   %res = tail call <8 x float> @llvm.x86.avx.blendv.ps.256(<8 x float>
%a, <8 x float> %b, <8 x float> %cmp) nounwind readnone
   ret <8 x float> %res
}

This works fine and "llc -filetype=asm -mattr=avx" produces the expected
assembly (VCMPLTPS + VBLENDVPS).

On the other hand, this does not work:

define <8 x float> @test2(<8 x float> %a, <8 x float> %b, <8 x i32> %m)
nounwind readnone {
entry:
   %cmp = tail call <8 x float> @llvm.x86.avx.cmp.ps.256(<8 x float> %a,
<8 x float> %b, i8 1) nounwind readnone
   %cast = bitcast <8 x float> %cmp to <8 x i32>
   %mask = and <8 x i32> %cast, %m
   %blend_cond = bitcast <8 x i32> %mask to <8 x float>
   %res = tail call <8 x float> @llvm.x86.avx.blendv.ps.256(<8 x float>
%a, <8 x float> %b, <8 x float> %blend_cond) nounwind readnone
   ret <8 x float> %res
}

This should produce VCMPLTPS, VANDPS, VBLENDVPS, while llc (2.9 final as well
as latest trunk) bails out with:

LLVM ERROR: Cannot select: 0x2510540: v8f32 = bitcast 0x2532270 [ID=16]
   0x2532270: v4i64 = and 0x2532070, 0x2532170 [ID=15]
     0x2532070: v4i64 = bitcast 0x2510740 [ID=14]
       0x2510740: v8f32 = llvm.x86.avx.cmp.ps.256 0x2510640, 0x2511340,
0x2510f40, 0x2511140 [ORD=3] [ID=12]
...

The same holds for or and xor.
However, one specific example works:

define <8 x float> @test3(<8 x float> %a, <8 x float> %b, <8 x i32> %m)
nounwind readnone {
entry:
   %cond = xor <8 x i32> %m, %m
   %res = tail call <8 x float> @llvm.x86.avx.blendv.ps.256(<8 x float>
%a, <8 x float> %b, <8 x float> %cond) nounwind readnone
   ret <8 x float> %res
}

This produces the expected VXORPS + VBLENDVPS, but the same fails for and/or.
In this case, no casting is required, which indicates that the bitcast is the
actual problem, not the instruction selection of the xor.

Apparently, LLVM is generally unable to handle bitcasts between <8 x i32> and
<8 x float> (and <4 x i64> vs. <4 x double>), which should always be allowed
for AVX as nops.
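
As an illustration of why such a bitcast can be a nop, the following portable
C sketch models it as a plain copy of the 32 bytes of the register: no lane
changes its bit pattern, only the type it is viewed as. The helper names are
hypothetical, and the sketch assumes a 32-bit IEEE 754 float.

```c
#include <stdint.h>
#include <string.h>

enum { LANES = 8 };  /* a 256-bit AVX register holds 8 lanes of 32 bits */

/* Model of "bitcast <8 x float> to <8 x i32>": copy the raw bytes.
   Assumes sizeof(float) == sizeof(uint32_t) == 4. */
static void bitcast_f32_to_i32(const float in[LANES], uint32_t out[LANES])
{
    memcpy(out, in, LANES * sizeof(uint32_t));
}

/* Model of "bitcast <8 x i32> to <8 x float>": the same copy in the
   other direction. Mask patterns such as all ones are NaNs when viewed
   as floats, but copying the bytes does not alter them. */
static void bitcast_i32_to_f32(const uint32_t in[LANES], float out[LANES])
{
    memcpy(out, in, LANES * sizeof(float));
}
```

A round trip through both helpers returns the original bit patterns, which is
the behaviour the report expects from the backend without any extra
instruction.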
