Quuxplusone closed 6 years ago
Bugzilla Link | PR13837 |
Status | RESOLVED FIXED |
Importance | P enhancement |
Reported by | Weiming Zhao (weimingz@codeaurora.org) |
Reported on | 2012-09-13 13:17:55 -0700 |
Last modified on | 2018-01-23 14:06:28 -0800 |
Version | trunk |
Hardware | PC Windows NT |
CC | llvm-bugs@lists.llvm.org, spatel+llvm@rotateright.com |
Fixed by commit(s) | |
Attachments | |
Blocks | |
Blocked by | |
See also | PR35732 |
Can we resolve this bug? This is handled in IR by SLP vectorization now.
Without SLP, we have:
define <4 x i32> @conv4i(<4 x float> %in) {
entry:
  %0 = extractelement <4 x float> %in, i64 0
  %conv = fptosi float %0 to i32
  %vecinit = insertelement <4 x i32> undef, i32 %conv, i32 0
  %1 = extractelement <4 x float> %in, i64 1
  %conv1 = fptosi float %1 to i32
  %vecinit2 = insertelement <4 x i32> %vecinit, i32 %conv1, i32 1
  %2 = extractelement <4 x float> %in, i64 2
  %conv3 = fptosi float %2 to i32
  %vecinit4 = insertelement <4 x i32> %vecinit2, i32 %conv3, i32 2
  %3 = extractelement <4 x float> %in, i64 3
  %conv5 = fptosi float %3 to i32
  %vecinit6 = insertelement <4 x i32> %vecinit4, i32 %conv5, i32 3
  ret <4 x i32> %vecinit6
}
And with SLP, the four scalar conversions become a single vector conversion:
$ ./opt -slp-vectorizer 13837.ll -S |grep fptosi
%0 = fptosi <4 x float> %in to <4 x i32>
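For readers who want a source-level view: the scalar IR above does a lane-by-lane float-to-int conversion, roughly like the C sketch below. The original reproducer is not quoted in this thread, so this source is an assumption; the typedef names `float4` and `int4` are hypothetical, and the code relies on the GCC/Clang `vector_size` extension.

```c
/* Hypothetical source-level equivalent of @conv4i; not the original reproducer. */
typedef float float4 __attribute__((vector_size(16)));  /* 4 x float */
typedef int   int4   __attribute__((vector_size(16)));  /* 4 x i32   */

int4 conv4i(float4 in) {
    /* Extract each lane, convert it (truncating toward zero), and insert
       the result into the same lane of the output vector. */
    int4 out = { (int)in[0], (int)in[1], (int)in[2], (int)in[3] };
    return out;
}
```

Whether a given Clang build emits the scalar chain or the single vector fptosi for code like this depends on the version, target, and optimization level.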
The bug was reported five years ago. Since it's no longer an issue, we can close it.
I added a test for this example to prevent regression:
https://reviews.llvm.org/rL323269
Feel free to adjust the target for the test if I got that wrong.