Closed Quuxplusone closed 7 years ago
Hi Raj, this is an encoding problem.
Both ZR and SP registers are encoded as 31, and depending on the instructions,
they're either zero or stack.
ADDri can operate on the stack pointer, which means XZR is an invalid operand.
ADDrr can operate on the zero register, which means #1 is an invalid operand.
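To make the "encoded as 31" point concrete, here is a minimal sketch of the 64-bit ADD (immediate) encoding. The helper names (`Reg31`, `regName64`, `encodeAddImm64`, `decodeAddImm64`) are made up for illustration and are not LLVM functions; the bit layout is the A64 one as I understand it. It shows that a word with Rn = 31 in this form can only be read back as sp, never as xzr.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// How register number 31 is read in a given operand slot (hypothetical helper type).
enum class Reg31 { StackPointer, ZeroRegister };

// Printed name of a 64-bit register operand, given how the slot reads 31.
std::string regName64(unsigned num, Reg31 meaning) {
    if (num == 31)
        return meaning == Reg31::StackPointer ? "sp" : "xzr";
    return "x" + std::to_string(num);
}

// 64-bit ADD (immediate), no shift: base opcode 0x91000000,
// imm12 in bits 21:10, Rn in bits 9:5, Rd in bits 4:0.
uint32_t encodeAddImm64(unsigned rd, unsigned rn, unsigned imm12) {
    return 0x91000000u | ((imm12 & 0xFFFu) << 10) | ((rn & 31u) << 5) | (rd & 31u);
}

// Disassemble a word produced by encodeAddImm64.
// In this instruction form both Rd and Rn read 31 as the stack pointer.
std::string decodeAddImm64(uint32_t word) {
    unsigned rd = word & 31u;
    unsigned rn = (word >> 5) & 31u;
    unsigned imm12 = (word >> 10) & 0xFFFu;
    return "add " + regName64(rd, Reg31::StackPointer) + ", " +
           regName64(rn, Reg31::StackPointer) + ", #" + std::to_string(imm12);
}
```

For example, `decodeAddImm64(encodeAddImm64(0, 31, 1))` yields an add based on sp, which is why an assembler has no way to honour an xzr base in this form.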
Clang "works" when outputting the object file directly probably because the
encoding 31 is also valid for SP, so it emits that. But from your large file, I
can't tell which one would be the correct encoding. It's also possible that
neither is (sp+#1 or zr+x?) and this got there from a whole different pattern.
It'd be good to have a reduced case, and possibly a bisection.
cheers,
--renato
I'm pretty sure xzr is intended. The sequence
add x1, xzr, #1 // =1
add x2, xzr, #1 // =1
bl _ZN2kj1_17HeapArrayDisposer12allocateImplEmmmPFvPvES4_
is marshalling "elementCount" and "capacity" into x1 and x2. There's no reason
for those to be based on sp.
It's difficult to get more detailed info on such a large file though. I'll see
if I can at least llvm-extract the offending function.
Attached simpler.ll
(8788 bytes, application/octet-stream): Isolated function demonstrating issue
replaceZeroVectorStore is doing something weird with XZR; instead of using a CopyFromReg to read the value of the register, it's using the register itself as an operand to an ISD::STORE. This somehow works in some cases, but it's not really correct, and it eventually bites us here.
Not sure if there's any good way to detect this sort of situation... maybe registers should have a different MVT?
Taking a look now...
As Eli alluded to, this bug is triggered if store->load forwarding happens
after the store vector 0 split optimization and the loaded value feeds an add:
target triple = "aarch64--linux-gnu"
define i64 @foo(<2 x i64>* %p) {
entry:
  store <2 x i64> zeroinitializer, <2 x i64>* %p
  %p2 = bitcast <2 x i64>* %p to i64*
  %ld = load i64, i64* %p2
  %add = add i64 %ld, 1
  ret i64 %add
}
The store vector 0 split optimization replaces the stored vreg with XZR, then
the store->load forwarding replaces the load with XZR, which if the load is
used by an add ends up generating the invalid instruction:
add x0, xzr, #1
The store vector 0 split optimization inserts XZR operands to prevent
DAGCombiner::MergeConsecutiveStores from undoing the splitting of vector
stores, but in light of this bug that doesn't seem like a good idea. I'm
looking into whether this can be achieved through other means or, if not,
whether this vector store splitting should just be moved to after ISel.
I think you get the behavior you want if you just replace the call to getRegister() with a call to getCopyFromReg()?
It would also be easy to clean up after isel by pattern-matching the store instruction, if you want to go that route.
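The difference between the two node kinds discussed above can be illustrated with a toy model. This is only a sketch: the types and functions below (`Kind`, `Value`, `forwardStoreToLoad`, `copyFromReg`, `validAddImmBase`) are hypothetical stand-ins and not LLVM's SelectionDAG API, and the model assumes nothing about what the real backend finally emits.

```cpp
#include <cassert>
#include <string>

// Toy stand-ins for DAG values; these are NOT LLVM types.
enum class Kind { PhysRegXZR, VirtualValue };

struct Value {
    Kind kind;
    std::string name;
};

// Toy store-to-load forwarding: the load simply becomes whatever was stored.
Value forwardStoreToLoad(const Value &stored) { return stored; }

// Toy "copy from register": reads the register into a fresh ordinary value,
// so later users never see the physical register node itself.
Value copyFromReg(const Value &reg) {
    return Value{Kind::VirtualValue, "copy(" + reg.name + ")"};
}

// Add-immediate reads register 31 in its base slot as SP,
// so a raw XZR operand cannot be expressed there.
bool validAddImmBase(const Value &base) {
    return base.kind != Kind::PhysRegXZR;
}
```

In this model, forwarding a store of the raw XZR node hands XZR straight to the add, while forwarding a store of `copyFromReg(xzr)` hands it an ordinary value that the add-immediate base slot can accept.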
Yeah, inserting a CopyFromReg is a good idea. Patch up for review here:
https://reviews.llvm.org/D38146
Should be resolved by r313916
Thanks. I would like to request a backport of this to release_50.
Adding as a blocker to 5.0.1 meta.
Removing the release directly, created a merge bug, copied hans.
_Bug 34675 has been marked as a duplicate of this bug._