Quuxplusone / LLVMBugzillaTest

0 stars 0 forks source link

Inline asm miscompile (test segfaults) #49912

Closed Quuxplusone closed 3 years ago

Quuxplusone commented 3 years ago
Bugzilla Link PR50943
Status RESOLVED INVALID
Importance P enhancement
Reported by Paul Robinson (paul_robinson@playstation.sony.com)
Reported on 2021-06-30 08:51:15 -0700
Last modified on 2021-06-30 10:57:34 -0700
Version trunk
Hardware PC Linux
CC andrea.dibiagio@gmail.com, craig.topper@gmail.com, llvm-bugs@lists.llvm.org, llvm-dev@redking.me.uk, pengfei.wang@intel.com, spatel+llvm@rotateright.com
Fixed by commit(s)
Attachments
Blocks
Blocked by
See also
Reduced from gdb test case gdb.arch/i386-avx.c

$ cat t1.c
typedef struct {
  float f[8];
} v8sf_t;
v8sf_t data[] =
  {
    { {  0.0,  0.125,  0.25,  0.375,  0.50,  0.625,  0.75,  0.875 } },
  };
int main (int argc, char **argv) {
      asm ("vmovaps 0(%0), %%ymm0\n\t"
           : /* no output operands */
           : "r" (data)
           : "xmm0");
      return 0;
}
$ gcc t1.c -o t1-gcc
$ ./t1-gcc
$ clang t1.c -o t1-clang
$ ./t1-clang
Segementation fault
$

using gcc 7.5.0 and clang 13 1f169a77
Quuxplusone commented 3 years ago

$ objdump -d t1-gcc ... 00000000000005fa

: 5fa: 55 push %rbp 5fb: 48 89 e5 mov %rsp,%rbp 5fe: 89 7d fc mov %edi,-0x4(%rbp) 601: 48 89 75 f0 mov %rsi,-0x10(%rbp) 605: 48 8d 05 14 0a 20 00 lea 0x200a14(%rip),%rax # 201020 60c: c5 fc 28 00 vmovaps (%rax),%ymm0 610: b8 00 00 00 00 mov $0x0,%eax 615: 5d pop %rbp 616: c3 retq
617: 66 0f 1f 84 00 00 00 nopw 0x0(%rax,%rax,1) 61e: 00 00 ... $ objdump -d t1-clang ... 0000000000400480
: 400480: 55 push %rbp 400481: 48 89 e5 mov %rsp,%rbp 400484: c7 45 fc 00 00 00 00 movl $0x0,-0x4(%rbp) 40048b: 89 7d f8 mov %edi,-0x8(%rbp) 40048e: 48 89 75 f0 mov %rsi,-0x10(%rbp) 400492: b8 30 10 60 00 mov $0x601030,%eax 400497: c5 fc 28 00 vmovaps (%rax),%ymm0 40049b: 31 c0 xor %eax,%eax 40049d: 5d pop %rbp 40049e: c3 retq
40049f: 90 nop ...

gcc sets up the address of data as 605: 48 8d 05 14 0a 20 00 lea 0x200a14(%rip),%rax while clang produces 400492: b8 30 10 60 00 mov $0x601030,%eax

Quuxplusone commented 3 years ago

I suspect that there is an alignment problem here. If clang is not aligning the array to 32-bytes, then it is a problem for the VMOVAPS.

Out of curiosity, I have built the reproducible posted by Paul on compiler explorer.

I noticed that the array is emitted by clang with assembler directive

.p2align 4

That directive only enforces a 16-byte alignment. Unfortunately the VMOVAPS from that code expects the operand to be aligned on a 32-byte boundary.

GCC, on the other hand, explicitly aligns the array to a 32-byte boundary using the following assembly directive:

.align 32

I mentioned this to Paul, and I have suggested as an experiment to replace the VMOVAPS with VMOVUPS. That was enough to fix his crash.

So, I guess that there may be a problem with the alignment set by clang for that array.

Quuxplusone commented 3 years ago

I'm not sure there's an alignment requirement for that array other than what is implied by float. Versions of gcc prior to 5 use align 16 with -Os and align 32 with -O2.

Adding attribute ((aligned (32))) to the struct definition also appears to fix the crash.

Quuxplusone commented 3 years ago
(In reply to Craig Topper from comment #3)
> I'm not sure there's an alignment requirement for that array other than what
> is implied by float. Versions of gcc prior to 5 use align 16 with -Os and
> align 32 with -O2.
>
> Adding __attribute__ ((aligned (32))) to the struct definition also appears
> to fix the crash.

I was coming to the same conclusion.  I threw in some alignof() expressions
and they all report 4.  Pumping up the alignment in the generated code seems
rather optional.

I'll resolve this as invalid and we'll patch the test to enforce the required
alignment.