Closed Quuxplusone closed 4 years ago
Bugzilla Link | PR42035 |
Status | RESOLVED FIXED |
Importance | P enhancement |
Reported by | David Bolvansky (david.bolvansky@gmail.com) |
Reported on | 2019-05-27 10:21:31 -0700 |
Last modified on | 2020-05-28 06:13:24 -0700 |
Version | trunk |
Hardware | PC Linux |
CC | craig.topper@gmail.com, llvm-bugs@lists.llvm.org, llvm-dev@redking.me.uk, spatel+llvm@rotateright.com |
Fixed by commit(s) | rG410667f1b74c |
Attachments | |
Blocks | |
Blocked by | |
See also | PR38788, PR38522 |
Current codegen https://godbolt.org/z/7x53rr
(In reply to David Bolvansky from comment #0)
> // test for zero [ptest x, x]
> bool bad2(__v16qu x) {
> return __builtin_ia32_ptestz128(~__v16qu(), x == 0);
> }
>
> bad2(unsigned char __vector(16)): // equivalent to testing scalars for 0
> vpxor %xmm1, %xmm1, %xmm1
> vpcmpeqb %xmm1, %xmm0, %xmm0
> vpcmpeqd %xmm1, %xmm1, %xmm1
> vptest %xmm0, %xmm1
> sete %al
> ret
I'm not sure this proposal is correct.
__builtin_ia32_ptestz128(~__v16qu(), x == 0) performs:
if any of the elements of <16 x i8> x == 0 (elt result == -1) then return 0
if all of the elements of <16 x i8> x != 0 (elt result == 0) then return 1
__builtin_ia32_ptestz128(x, x) performs:
if any of the 128 bits of x is true then return 0
if all of the 128 bits of x is false then return 1
(In reply to David Bolvansky from comment #0)
> using __v16qu [[gnu::vector_size(16)]] = unsigned char;
>
> // test sign bit [ptest 0x808080..., x]
> bool bad(__v16qu x) {
> return __builtin_ia32_ptestz128(~__v16qu(), x > 0x7f);
> }
For sign bit only testing we're probably better off using movmsk:
https://godbolt.org/z/ziJUst
Yeah, movmsk should be better choice.
Resolving - converting the sign bit variant to movmsk landed at D80563/rG410667f1b74c
As discussed, nothing else seems to viable.