Open Quuxplusone opened 4 years ago
Bugzilla Link | PR44544 |
Status | NEW |
Importance | P normal |
Reported by | Jakob Schwarz (jakobschwarz@yahoo.com) |
Reported on | 2020-01-14 08:14:35 -0800 |
Last modified on | 2020-02-05 18:04:10 -0800 |
Version | trunk |
Hardware | PC Linux |
CC | blitzrakete@gmail.com, craig.topper@gmail.com, dgregor@apple.com, erik.pilkington@gmail.com, lebedev.ri@gmail.com, llvm-bugs@lists.llvm.org, llvm-dev@redking.me.uk, richard-llvm@metafoo.co.uk, spatel+llvm@rotateright.com |
Fixed by commit(s) | |
Attachments | |
Blocks | |
Blocked by | |
See also |
Are you testing that on actual -march=skylake-avx512 hardware?
If not, I would say it is very likely that it made use of AVX512 instructions
like you asked it to, and that executing them on an AVX512-less machine leads to
this unexpected behavior.
Speculatively moving to x86 component (if there is a bug, it's most likely in vector codegen).
Yes, it first occurred with -march=native on my skylake-AVX512 machine.
This appears to be a vectorizer issue. I'm seeing a wide 64 x i64 load followed by 6 gather/scatter pairs, and then a 64 x i64 store using data that originated from the 64 x i64 load with some masking applied. This store fully clobbers the updates made by the 6 scatters because it uses the load value from before the scatters. I'll try to put together some more information tomorrow.
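To make the ordering problem above concrete, here is a rough scalar model of it in C++. This is only a sketch, not the reporter's reproducer and not the actual generated code: the scatter indices in `kIdx`, the `+= 1` update, and the function names are all made up for illustration, and the masking on the final store is left out.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

constexpr std::size_t N = 64;  // models the 64 x i64 width mentioned above
using Vec = std::array<std::int64_t, N>;

// Hypothetical scatter indices; the real ones are not given in this thread.
constexpr std::array<std::size_t, 6> kIdx = {3, 7, 12, 20, 33, 63};

// Expected behavior: the six gather/modify/scatter updates are the last
// writes to their slots, so they remain visible afterwards.
Vec expected_order(Vec mem) {
    for (std::size_t i : kIdx) {
        mem[i] += 1;  // gather, modify, scatter (the update itself is made up)
    }
    return mem;
}

// Ordering described in the comment: wide load first, then the scatters,
// then a wide store of data derived from the stale load.
Vec clobbering_order(Vec mem) {
    const Vec stale = mem;  // wide load, taken before the scatters
    for (std::size_t i : kIdx) {
        mem[i] += 1;        // the six scatters update memory
    }
    mem = stale;            // wide store of pre-scatter data (masking omitted)
    return mem;             // the scattered updates are gone
}
```

With memory initialized so that `mem[i] == i`, `expected_order` leaves `mem[3] == 4`, while `clobbering_order` hands back the original contents unchanged, which is the "fully clobbered" effect described above.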