Open Quuxplusone opened 3 years ago
Bugzilla Link | PR50468 |
Status | NEW |
Importance | P enhancement |
Reported by | Craig Topper (craig.topper@gmail.com) |
Reported on | 2021-05-25 11:55:07 -0700 |
Last modified on | 2021-08-24 08:16:32 -0700 |
Version | trunk |
Hardware | PC All |
CC | craig.topper@gmail.com, lebedev.ri@gmail.com, llvm-bugs@lists.llvm.org, llvm-dev@redking.me.uk, pengfei.wang@intel.com, spatel+llvm@rotateright.com |
Fixed by commit(s) | |
Attachments | |
Blocks | |
Blocked by | |
See also |
(In reply to Craig Topper from comment #0)
> For i16/i32/i64 we can probably just use an isel pattern instead like we do
> for scalar shift left. Not sure how to preserve i8.
The i8 case is repeated in LowerScalarImmediateShift, so we can address that
one separately.
Any ideas about how to make this bug visible?
If we fix this in SDAG, does that guarantee that we're safe to asm?
What about just inserting a ISD::FREEZE node? That would work the vXi8 cases as
well.
// Hardware support for vector shifts is sparse which makes us scalarize the
// vector operations in many cases. Also, on sandybridge ADD is faster than
// shl.
// (shl V, 1) -> add V,V
if (auto *N1BV = dyn_cast<BuildVectorSDNode>(N1))
if (auto *N1SplatC = N1BV->getConstantSplatNode()) {
assert(N0.getValueType().isVector() && "Invalid vector shift type");
// We shift all of the values by one. In many cases we do not have
// hardware support for this operation. This is better expressed as an ADD
// of two values.
if (N1SplatC->isOne()) {
N0 = DAG.getNode(ISD::FREEZE, SDLoc(N), VT, N0);
return DAG.getNode(ISD::ADD, SDLoc(N), VT, N0, N0);
}
}
Freeze Patch: https://reviews.llvm.org/D106675
ISEL Patch: https://reviews.llvm.org/D106679 - this only works for vXi16/32/64 vectors - the vXi8 case would still need ISD::FREEZE
@lebedev.ri also noticed that we have isel patterns doing the same for scalars:
The vXi8 case has been fixed with FREEZE: https://reviews.llvm.org/D108139