Open Quuxplusone opened 4 years ago
Bugzilla Link | PR47874 |
Status | NEW |
Importance | P normal |
Reported by | Jeff Roberts (jeffr@radgametools.com) |
Reported on | 2020-10-16 00:11:46 -0700 |
Last modified on | 2020-10-16 15:41:31 -0700 |
Version | 11.0 |
Hardware | PC Windows NT |
CC | craig.topper@gmail.com, efriedma@quicinc.com, llvm-bugs@lists.llvm.org, llvm-dev@redking.me.uk, pengfei.wang@intel.com, spatel+llvm@rotateright.com |
Fixed by commit(s) | |
Attachments | |
Blocks | |
Blocked by | |
See also |
Could you provide a small reproducer?
Are these just spills, or do they need to clear the upper 127:32 bits as well?
(In reply to Pengfei Wang from comment #1)
> Could you provide a small reproducer?
> Are these just spills, or do they need to clear the upper 127:32 bits as well?
It's in the middle of an enormous function (to trigger the spill), sadly.
It does (much later) load all 128 bits from that address, but does a bunch of
*scalar* SSE on it, so the 127:32 bits are never used.
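To illustrate the point about scalar SSE, here is a minimal sketch (not taken from the original codec; the helper names add_low_lane, lane and low_lane_ignores_upper are made up for this example). It assumes an x86 target with SSE and shows that _mm_add_ss only computes on bits 31:0, passing the upper lanes of its first operand through untouched, so the low result is the same whatever sits in bits 127:32.

```c
#include <xmmintrin.h>

/* Hypothetical helper: add b to the low float of v with a scalar SSE op.
   Lanes 1..3 of the result are copied unchanged from v. */
static __m128 add_low_lane(__m128 v, float b) {
    return _mm_add_ss(v, _mm_set_ss(b));
}

/* Hypothetical helper: read lane i (0..3) of v by storing it to memory. */
static float lane(__m128 v, int i) {
    float out[4];
    _mm_storeu_ps(out, v);
    return out[i];
}

/* Returns 1 if the low-lane result is identical for two inputs that
   differ only in their upper three lanes (bits 127:32). */
static int low_lane_ignores_upper(void) {
    __m128 clean = _mm_setr_ps(1.0f, 0.0f, 0.0f, 0.0f);
    __m128 dirty = _mm_setr_ps(1.0f, 7.0f, 8.0f, 9.0f);
    return lane(add_low_lane(clean, 10.0f), 0) ==
           lane(add_low_lane(dirty, 10.0f), 0);
}
```

If every later use of the reloaded value is a scalar op like this, whatever ends up in the upper bits after the spill/reload should not affect the results, which is why zeroing them would not be needed in that situation.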
Are you able to share the enormous function?
(In reply to Craig Topper from comment #3)
> Are you able to share the enormous function?
I don't think so - it's most of the guts of our codec. LLVM 11 seems to
inline a LOT more functions at the expense of size, so I think I'd have to send
the entire file. If I add even one or two noinlines, then the problem goes
away.
Separately, what were the inline threshold changes from LLVM 9 to 11?
Synthetic testcase with the described behavior:
#include <emmintrin.h>
void a(__m128 *z, __m128 *z2, int f, int n) {
  for (int i = 0; i < n; ++i) {
    asm("":::"xmm0","xmm1","xmm2","xmm3","xmm4","xmm5","xmm6",
        "xmm7","xmm8","xmm9","xmm10","xmm11","xmm12","xmm13",
        "xmm14","xmm15");
    z[i] = _mm_add_ss(z[i], _mm_set_ss(__builtin_bit_cast(float, n)));
  }
}
I have no idea if that's anything close to the original code, though.
That code is very different, but yeah, the emitted code is very similar - weird promotion to xmm, and then reloading just the original 32 bits...
(I like the trick to force a spill, btw, heh...)