Open Quuxplusone opened 6 years ago
Bugzilla Link | PR37423 |
Status | NEW |
Importance | P enhancement |
Reported by | Fabian Giesen (fabian.giesen@epicgames.com) |
Reported on | 2018-05-11 12:32:39 -0700 |
Last modified on | 2019-10-06 03:05:16 -0700 |
Version | 6.0 |
Hardware | PC Windows NT |
CC | andrea.dibiagio@gmail.com, florian_hahn@apple.com, greg.bedwell@sony.com, hfinkel@anl.gov, llvm-bugs@lists.llvm.org, llvm-dev@redking.me.uk, spatel+llvm@rotateright.com |
Fixed by commit(s) | |
Attachments | |
Blocks | |
Blocked by | |
See also | PR20225 |
This is still an issue. It looks like we do not manage to fold the vector add/and/select chain that the vectorizer generates.
Current Codegen: https://godbolt.org/z/R_K7FG
This variant also includes a const loop count: https://godbolt.org/z/OkPkSf
void patternFill_const(int *arr)
{
    for (int i = 0; i < 65536; i++)
        arr[i] = (i & 1) ? 456 : 123;
}
Here we should definitely keep the loop's indvar as an i32/vXi32 instead of
extending it to i64/vXi64: with a constant trip count of 65536, we can
guarantee that i + #loopvectorelts never overflows.
That would improve the loop, but it probably wouldn't do much to help optimize
to a constant select mask.
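For illustration only, here is a hand-written sketch of what the two points above amount to at the source level. It is not LLVM's output. VF = 4 is an assumed vector width, and patternFill_const_folded is a hypothetical name. Because the lane pattern of (i & 1) repeats every two elements, the selected values for any even-width vector are loop-invariant, and the static assertion spells out why a 32-bit induction variable cannot overflow here.

```c
#include <limits.h>
#include <string.h>

/* N is the trip count from the report; VF is an assumed vector width. */
enum { N = 65536, VF = 4 };

/* The largest value i + VF can take is N - 1 + VF, far below INT_MAX,
   so the induction variable never needs to be widened to 64 bits. */
_Static_assert(N % VF == 0, "trip count is a multiple of the assumed VF");
_Static_assert(N - 1 + VF <= INT_MAX, "i + VF cannot overflow a 32-bit int");

/* Reference loop from the report. */
void patternFill_const(int *arr)
{
    for (int i = 0; i < N; i++)
        arr[i] = (i & 1) ? 456 : 123;
}

/* Hypothetical folded form: the select result is the same for every
   vector iteration, so each step just stores a constant pattern. */
void patternFill_const_folded(int *arr)
{
    static const int pattern[VF] = {123, 456, 123, 456};
    for (int i = 0; i < N; i += VF)
        memcpy(arr + i, pattern, sizeof pattern);
}
```

Both functions fill the array identically. The second only makes explicit that no per-iteration add, and, or select is needed once the mask is known to be constant.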