llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org

[PPC] recognize the shufflevector equivalent of a vector select #28905

Open rotateright opened 8 years ago

rotateright commented 8 years ago
Bugzilla Link 28531
Version trunk
OS All
CC @echristo,@hfinkel,@nemanjai

Extended Description

$ cat shufsel.ll
define <4 x i32> @foo(<4 x i32> %a, <4 x i32> %b) {
  %sel = select <4 x i1> <i1 true, i1 false, i1 false, i1 true>, <4 x i32> %a, <4 x i32> %b
  ret <4 x i32> %sel
}

define <4 x i32> @goo(<4 x i32> %a, <4 x i32> %b) {
  %shuf = shufflevector <4 x i32> %a, <4 x i32> %b, <4 x i32> <i32 0, i32 5, i32 6, i32 3>
  ret <4 x i32> %shuf
}

These are functionally equivalent. Note that in http://reviews.llvm.org/D22114, there's a proposal to canonicalize to the shufflevector form of the IR.

We generate 'xxsel' and 'vperm' for a target with VSX. Is one better than the other?

For an altivec target, the lowering to vsel is missed:

$ ./llc shufsel.ll -o - -mtriple=powerpc64 -mattr=altivec
foo:                                    # @foo
  addis 3, 2, .LCPI0_0@toc@ha
  addis 4, 2, .LCPI0_1@toc@ha
  addi 3, 3, .LCPI0_0@toc@l
  addi 4, 4, .LCPI0_1@toc@l
  lvx 4, 0, 3
  lvx 5, 0, 4
  vand 3, 3, 4
  vand 2, 2, 5
  vor 2, 2, 3
  blr

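The claimed equivalence of foo and goo can be checked with a small model. The Python sketch below is illustrative only and is not LLVM code; the helper names (select_to_shuffle_mask, select, shufflevector) are hypothetical and simply mimic the IR semantics on plain lists. For the condition used in foo it derives the shuffle indices 0, 5, 6, 3.

```python
def select_to_shuffle_mask(cond):
    """Map a constant select condition to the equivalent shuffle mask.

    Lane i takes element i of the first operand when cond[i] is true,
    otherwise element i of the second operand, which shufflevector
    addresses as index i + n (n = number of lanes).
    """
    n = len(cond)
    return [i if c else i + n for i, c in enumerate(cond)]


def select(cond, a, b):
    # Lane-wise select: pick a[i] where the condition is true, else b[i].
    return [x if c else y for c, x, y in zip(cond, a, b)]


def shufflevector(a, b, mask):
    # Index into the concatenation of both operands, as shufflevector does.
    joined = a + b
    return [joined[i] for i in mask]


cond = [True, False, False, True]          # condition from @foo
a, b = [10, 11, 12, 13], [20, 21, 22, 23]  # arbitrary sample lanes
mask = select_to_shuffle_mask(cond)
print(mask)                                            # [0, 5, 6, 3]
print(select(cond, a, b) == shufflevector(a, b, mask)) # True
```
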
rotateright commented 8 years ago

Thanks, Ehsan.

We may want to adapt the code in InstCombineVectorOps.cpp -> isShuffleEquivalentToSelect() for general use in the DAG.
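As a rough idea of what such a check involves, here is a minimal Python sketch. It is not the InstCombine implementation: the function names are borrowed or invented for illustration, and treating negative indices as undefined lanes is an assumption of this sketch. A shuffle is select-like when every lane stays in place and only the source operand varies.

```python
def is_shuffle_equivalent_to_select(mask, num_lanes):
    """Return True if the mask keeps every lane in place.

    Lane i must read index i (first operand) or i + num_lanes (second
    operand). A negative index stands for an undefined lane and is
    accepted here; that handling is an assumption of this sketch.
    """
    if len(mask) != num_lanes:
        return False  # a select cannot change the number of lanes
    for lane, idx in enumerate(mask):
        if idx < 0:
            continue
        if idx != lane and idx != lane + num_lanes:
            return False
    return True


def shuffle_mask_to_select_cond(mask, num_lanes):
    # True selects the first operand for that lane, False the second.
    # Undefined lanes (negative index) arbitrarily pick the first operand.
    return [idx < num_lanes for idx in mask]
```

With the mask from goo, is_shuffle_equivalent_to_select([0, 5, 6, 3], 4) holds and the recovered condition is [True, False, False, True], the same condition foo uses. A mask such as [1, 5, 6, 3] moves a lane and is rejected.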

x86 has a more specific shuffle-to-select matching blob of code in its lowerVectorShuffleAsBlend() that might also be worth copying/refactoring.

llvmbot commented 8 years ago

(In reply to comment #0)
> We generate 'xxsel' and 'vperm' for a target with VSX. Is one better than the other?

vperm is generated for the now-canonical form (shufflevector). xxsel is preferable because it can access any of the VSX registers, which are a superset of the VMX registers that vperm can access.

rotateright commented 8 years ago

The patch for canonicalization of vector select with constant condition to shuffle is here: https://reviews.llvm.org/D24279

xgupta commented 7 months ago

Latest trunk behaviour: https://godbolt.org/z/8PE46h888. Function foo has a vsel instruction and goo has a vperm instruction.