Quuxplusone / LLVMBugzillaTest


x86 backend: bug in shift of vector elements #16359

Closed Quuxplusone closed 4 years ago

Quuxplusone commented 11 years ago
Bugzilla Link PR16360
Status RESOLVED FIXED
Importance P normal
Reported by ili.filippov@gmail.com
Reported on 2013-06-18 11:00:36 -0700
Last modified on 2020-10-20 07:36:06 -0700
Version trunk
Hardware PC Linux
CC babokin@gmail.com, craig.topper@gmail.com, llvm-bugs@lists.llvm.org, llvm-dev@redking.me.uk, michael.hliao@gmail.com, pengfei.wang@intel.com, pjcoup@gmail.com, rafael@espindo.la, spatel+llvm@rotateright.com
Fixed by commit(s)
Attachments foo.c (737 bytes, application/octet-stream)
foo_scalar.c (241 bytes, text/x-csrc)
Blocks
Blocked by
See also
Created attachment 10698
Reproducer of the fail

Clang on x86 miscompiles the attached test when it is built with -O0 and -m32
(for -msse4 or -mavx targets).
Result is:
sum = 0
TTT1 = 3ffffffffc000000, 3ffffffffc000000, 3ffffffffc000000, 3ffffffffc000000
TTT2 = fffffffffc000000, fffffffffc000000, fffffffffc000000, fffffffffc000000
But TTT1 and TTT2 should be the same.

I suspect the problem is in code generation, because the LLVM IR for TTT1 and
TTT2 differs only where expected (in a single constant, which differs in the
source and should not affect the result).

But the generated assembly for the two differs considerably.
If we compile with:
clang -m32 -O0 foo.c -S -mavx
we get for TTT1:
        movl    136(%esp), %ecx
        movl    %ecx, %edx
        sarl    $31, %edx
        shrl    %edx
        orl     $1073741823, %edx       # imm = 0x3FFFFFFF
        shrl    %ecx
        orl     $-67108864, %ecx        # imm = 0xFFFFFFFFFC000000
        vmovd   %ecx, %xmm0
        vpinsrd $1, %edx, %xmm0, %xmm0
and for TTT2:
        movl    136(%esp), %ecx
        shrl    $2, %ecx
        orl     $-67108864, %ecx        # imm = 0xFFFFFFFFFC000000
        movl    $-1, %edx
        vmovd   %ecx, %xmm0
        vpinsrd $1, %edx, %xmm0, %xmm0

EDX ends up with a different value in the two sequences, while it should be the same.
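
To make the difference concrete, here is a rough Python model of the two instruction sequences quoted above. It is only a sketch: the helper names (`sar32`, `shr32`, `ttt1_lane`, `ttt2_lane`) are made up for illustration, and the input value 0 loaded from 136(%esp) is an assumption, since the contents of foo.c are not shown here. The 64-bit lane is assembled as EDX:ECX, because `vmovd` places ECX in element 0 and `vpinsrd $1` places EDX in element 1.

```python
MASK32 = 0xFFFFFFFF  # keep every value within a 32-bit register


def sar32(value, count):
    """Arithmetic right shift of a 32-bit value (models sarl)."""
    value &= MASK32
    signed = value - (1 << 32) if value & 0x80000000 else value
    return (signed >> count) & MASK32


def shr32(value, count):
    """Logical right shift of a 32-bit value (models shrl)."""
    return (value & MASK32) >> count


def ttt1_lane(x):
    """Model of the TTT1 sequence; x is the assumed value at 136(%esp)."""
    ecx = x & MASK32               # movl 136(%esp), %ecx
    edx = ecx                      # movl %ecx, %edx
    edx = sar32(edx, 31)           # sarl $31, %edx
    edx = shr32(edx, 1)            # shrl %edx
    edx |= 1073741823              # orl $0x3FFFFFFF, %edx
    ecx = shr32(ecx, 1)            # shrl %ecx
    ecx |= -67108864 & MASK32      # orl $-67108864, %ecx
    return (edx << 32) | ecx       # vmovd + vpinsrd $1: EDX is the high half


def ttt2_lane(x):
    """Model of the TTT2 sequence for the same assumed input."""
    ecx = x & MASK32               # movl 136(%esp), %ecx
    ecx = shr32(ecx, 2)            # shrl $2, %ecx
    ecx |= -67108864 & MASK32      # orl $-67108864, %ecx
    edx = -1 & MASK32              # movl $-1, %edx
    return (edx << 32) | ecx


print(f"TTT1 lane = {ttt1_lane(0):x}")
print(f"TTT2 lane = {ttt2_lane(0):x}")
```

With the assumed input of 0, the model yields the same lanes as the reported output (3ffffffffc000000 for TTT1, fffffffffc000000 for TTT2): the low halves agree and only the half that comes from EDX differs.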

Quuxplusone commented 11 years ago

Attached foo.c (737 bytes, application/octet-stream): Reproducer of the fail

Quuxplusone commented 11 years ago

Attached foo_scalar.c (241 bytes, text/x-csrc): reduced scalar test case

Quuxplusone commented 11 years ago

This also occurs on ARM. I attached a reduced scalar test case.

Quuxplusone commented 10 years ago

This was fixed in r184575, no?

Quuxplusone commented 8 years ago

Confirmed - this was fixed in r184575 (tests regenerated in r261440)