Closed Quuxplusone closed 5 years ago
Attached tc_invalidtrunc.ll
(4782 bytes, text/plain): reduced test case
The involved DAG nodes are:
t124: i16 = urem t122, Constant:i16<29265>
t125: v8i16 = BUILD_VECTOR t124, t124, t124, t124, t124, t124, t124, t124
t112: i32 = extract_vector_elt t125, Constant:i32<0>
Arguments for call to combineExtract():
N = t112
Op0 = t125
IndexN = 0
VecVT = llvm::MVT::v8i16
return combineExtract(SDLoc(N), N->getValueType(0), VecVT, Op0,
IndexN->getZExtValue(), DCI, false);
In SystemZTargetLowering::combineExtract():
// The number of bytes being extracted.
unsigned BytesPerElement = VecVT.getVectorElementType().getStoreSize();
...
} else if (Opcode == ISD::BUILD_VECTOR &&
canTreatAsByteVector(Op.getValueType())) {
// We can only optimize this case if the BUILD_VECTOR elements are
// at least as wide as the extracted value.
EVT OpVT = Op.getValueType();
unsigned OpBytesPerElement = OpVT.getVectorElementType().getStoreSize();
if (OpBytesPerElement < BytesPerElement)
break;
(gdb) p BytesPerElement
$12 = 2
(gdb) p Op->dump()
t125: v8i16 = BUILD_VECTOR t124, t124, t124, t124, t124, t124, t124, t124
$20 = void
(gdb) p OpVT
$21 = {V = {SimpleTy = llvm::MVT::v8i16}, LLVMTy = 0x0}
(gdb) n
(gdb) p OpBytesPerElement
$22 = 2
The check that "the BUILD_VECTOR elements are at least as wide as the extracted
value" are done with OpBytesPerElement and BytesPerElement, but ResVT is wider:
EVT VT = MVT::getIntegerVT(ResVT.getSizeInBits());
Op = DAG.getNode(ISD::TRUNCATE, DL, VT, Op);
(gdb) p VT
$24 = {V = {SimpleTy = llvm::MVT::i32}, LLVMTy = 0x0}
(gdb) p Op->dump()
t124: i16 = urem t122, Constant:i16<29265>
It seems that a check may be missing that ResVT.getStoreSize() is not larger
than OpBytesPerElement?
Alternatively, it would be legal to extend an integer type, since in this case
the high bits are undefined per the SelectionDAG semantics, but this is
probably not useful enough to motivate handling this...
(In reply to Jonas Paulsson from comment #1)
> The involved DAG nodes are:
>
> t124: i16 = urem t122, Constant:i16<29265>
> t125: v8i16 = BUILD_VECTOR t124, t124, t124, t124, t124, t124, t124, t124
> t112: i32 = extract_vector_elt t125, Constant:i32<0>
Ah, this may be the problem right here: we have a vector of type v8i16, so a
single element should be of type i16, but the type of the extract_vector_elt is
i32.
Now, either this is just a bug; then this should be fixed at the place that
extract_vector_elt was originally created.
But it could also be that this is intentional (I guess the definition of
extract_vector_elt allows it), in particular because i16 isn't actually a legal
type so after legalization we probably *have* to use i32 here anyway.
In that case, the code probably should insert an ANY_EXTEND as required ...
It seems that this starts out as a v16i16 urem, which has an illegal type. The divisor is a vector constant splat, so SystemZTargetLowering::combineIntDIVREM() would have unrolled this if the type had been legal.
combineIntDIVREM() however then is called again after type legalization, at which point the vector type is legal: v8i16. So now it unrolls it, which seems like a bug, since then we get an i16 urem, which has an illegal type (after type legalization).
Suggested patch: https://reviews.llvm.org/D62036.
Not quite sure if we need to fix combineExtract() - it seems that since ResVT may legally be wider than the BUILD_VECTOR element, we should maybe also fix that to do an ANY_EXTEND. Or is that supposed to never happen?
Suggested patch for combineIntDIVREM() committed as r360965, which handles the test case.
Can this be abandoned now (or should we fix also combineExtract() to handle an extending integer extraction)?
(In reply to Jonas Paulsson from comment #4)
> Suggested patch for combineIntDIVREM() committed as r360965, which handles
> the test case.
>
> Can this be abandoned now (or should we fix also combineExtract() to handle
> an extending integer extraction)?
IMO this can be abandoned now (until we run into this again).
tc_invalidtrunc.ll
(4782 bytes, text/plain)