llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org
Other
28.8k stars 11.91k forks source link

Inline asm miscompile (test segfaults) #50287

Closed pogo59 closed 3 years ago

pogo59 commented 3 years ago
Bugzilla Link 50943
Resolution INVALID
Resolved on Jun 30, 2021 10:57
Version trunk
OS Linux
CC @adibiagio,@topperc,@RKSimon,@phoebewang,@rotateright

Extended Description

Reduced from gdb test case gdb.arch/i386-avx.c

$ cat t1.c typedef struct { float f[8]; } v8sf_t; v8sf_t data[] = { { { 0.0, 0.125, 0.25, 0.375, 0.50, 0.625, 0.75, 0.875 } }, }; int main (int argc, char *argv) { asm ("vmovaps 0(%0), %%ymm0\n\t" : / no output operands */ : "r" (data) : "xmm0"); return 0; } $ gcc t1.c -o t1-gcc $ ./t1-gcc $ clang t1.c -o t1-clang $ ./t1-clang Segementation fault $

using gcc 7.5.0 and clang 13 1f169a77

pogo59 commented 3 years ago

I'm not sure there's an alignment requirement for that array other than what is implied by float. Versions of gcc prior to 5 use align 16 with -Os and align 32 with -O2.

Adding attribute ((aligned (32))) to the struct definition also appears to fix the crash.

I was coming to the same conclusion. I threw in some alignof() expressions and they all report 4. Pumping up the alignment in the generated code seems rather optional.

I'll resolve this as invalid and we'll patch the test to enforce the required alignment.

topperc commented 3 years ago

I'm not sure there's an alignment requirement for that array other than what is implied by float. Versions of gcc prior to 5 use align 16 with -Os and align 32 with -O2.

Adding attribute ((aligned (32))) to the struct definition also appears to fix the crash.

adibiagio commented 3 years ago

I suspect that there is an alignment problem here. If clang is not aligning the array to 32-bytes, then it is a problem for the VMOVAPS.

Out of curiosity, I have built the reproducible posted by Paul on compiler explorer.

I noticed that the array is emitted by clang with assembler directive

.p2align 4

That directive only enforces a 16-byte alignment. Unfortunately the VMOVAPS from that code expects the operand to be aligned on a 32-byte boundary.

GCC, on the other hand, explicitly aligns the array to a 32-byte boundary using the following assembly directive:

.align 32

I mentioned this to Paul, and I have suggested as an experiment to replace the VMOVAPS with VMOVUPS. That was enough to fix his crash.

So, I guess that there may be a problem with the alignment set by clang for that array.

pogo59 commented 3 years ago

$ objdump -d t1-gcc ... 00000000000005fa

: 5fa: 55 push %rbp 5fb: 48 89 e5 mov %rsp,%rbp 5fe: 89 7d fc mov %edi,-0x4(%rbp) 601: 48 89 75 f0 mov %rsi,-0x10(%rbp) 605: 48 8d 05 14 0a 20 00 lea 0x200a14(%rip),%rax # 201020 60c: c5 fc 28 00 vmovaps (%rax),%ymm0 610: b8 00 00 00 00 mov $0x0,%eax 615: 5d pop %rbp 616: c3 retq
617: 66 0f 1f 84 00 00 00 nopw 0x0(%rax,%rax,1) 61e: 00 00 ... $ objdump -d t1-clang ... 0000000000400480
: 400480: 55 push %rbp 400481: 48 89 e5 mov %rsp,%rbp 400484: c7 45 fc 00 00 00 00 movl $0x0,-0x4(%rbp) 40048b: 89 7d f8 mov %edi,-0x8(%rbp) 40048e: 48 89 75 f0 mov %rsi,-0x10(%rbp) 400492: b8 30 10 60 00 mov $0x601030,%eax 400497: c5 fc 28 00 vmovaps (%rax),%ymm0 40049b: 31 c0 xor %eax,%eax 40049d: 5d pop %rbp 40049e: c3 retq
40049f: 90 nop ...

gcc sets up the address of data as 605: 48 8d 05 14 0a 20 00 lea 0x200a14(%rip),%rax while clang produces 400492: b8 30 10 60 00 mov $0x601030,%eax