Quuxplusone / LLVMBugzillaTest

0 stars 0 forks source link

The __m64 not passed according to i386 ABI #39999

Open Quuxplusone opened 5 years ago

Quuxplusone commented 5 years ago
Bugzilla Link PR41029
Status NEW
Importance P normal
Reported by Wei Xiao (wei3.xiao@intel.com)
Reported on 2019-03-11 02:47:50 -0700
Last modified on 2021-01-09 11:42:35 -0800
Version trunk
Hardware PC Linux
CC craig.topper@gmail.com, hjl.tools@gmail.com, jyknight@google.com, llvm-bugs@lists.llvm.org, llvm-dev@redking.me.uk, neeilans@live.com, richard-llvm@metafoo.co.uk, wei3.xiao@intel.com
Fixed by commit(s)
Attachments
Blocks
Blocked by PR42319
See also
$ cat m64.c
#include <immintrin.h>
void callee(__m64 __m1, __m64 __m2);
__m64 caller(__m64 __m1, __m64 __m2)
{
  __m64 a = _mm_set_pi16(1, 2, 3, 4);
  callee(__m2, __m1);
  return a;
}
$ gcc -m32 -O2 -S -o - m64.c
        .file   "m64.c"
        .text
        .p2align 4
        .globl  caller
        .type   caller, @function
caller:
.LFB5121:
        .cfi_startproc
        subl    $12, %esp
        .cfi_def_cfa_offset 16
        movq    %mm0, %mm2
        movq    %mm1, %mm0
        movq    %mm2, %mm1
        call    callee
        movq    .LC0, %mm0
        addl    $12, %esp
        .cfi_def_cfa_offset 4
        ret
        .cfi_endproc
.LFE5121:
        .size   caller, .-caller
        .section        .rodata.cst8,"aM",@progbits,8
        .align 8
.LC0:
        .value  4
        .value  3
        .value  2
        .value  1
        .ident  "GCC: (GNU) 9.0.1 20190131 (experimental)"
        .section        .note.GNU-stack,"",@progbits

$ clang -m32 -O2 -S -o - m64.c
        .text
        .file   "m64.c"
        .globl  caller                  # -- Begin function caller
        .p2align        4, 0x90
        .type   caller,@function
caller:                                 # @caller
# %bb.0:                                # %entry
        subl    $12, %esp
        pushl   20(%esp)
        pushl   20(%esp)
        pushl   36(%esp)
        pushl   36(%esp)
        calll   callee
        addl    $16, %esp
        movl    $196612, %eax           # imm = 0x30004
        movl    $65538, %edx            # imm = 0x10002
        addl    $12, %esp
        retl
.Lfunc_end0:
        .size   caller, .Lfunc_end0-caller
                                        # -- End function

        .ident  "clang version 9.0.0 (http://llvm.org/git/clang.git 59f2009cd157fc96a0d558125405b98586cd83d2) (http://llvm.org/git/llvm.git 6a7719c7965af52f904e16588c1754f65bcb8ff0)"
        .section        ".note.GNU-stack","",@progbits
        .addrsig

According to i386 ABI, __m64 values should be passed by mmx registers.
Quuxplusone commented 4 years ago

Looks like this was originally changed (broken) in https://github.com/llvm/llvm-project/commit/651c1839ee7b7d72d90982615201dcf6b2299a91 back in 2013.

https://reviews.llvm.org/D59744 is a recent attempt to fix this bug, but was reverted because it broke (at least) chromium on x86-32 -- in part due to to bug 42319.

Quuxplusone commented 3 years ago
I think that it's likely preferable to continue violating this ABI requirement
indefinitely, and not fix this. Clang has already been violating it for 7+
years, and there's not a whole lot of demand to change here.

And, unfortunately, there's a very significant downside to changing, here.
Adding any more usage of MMX is a giant foot-gun, due to the x87/mmx mode-
switching issues.

After PR42320 is implemented, there will be no use of MMX from clang, aside
from inline-assembly. Adding back the hassle of accidental MMX mode-switch when
passing or returning an __m64 would be extremely unfortunate -- it's just not
worth it.

I do think it's unfortunate that GCC's and clang's ABI when built with -mno-mmx
are not compatible.

E.g. given this function:
__m64 mmx() {
    return (__m64)55LL;
}

gcc -O2 -mno-mmx -m32 treats it as if the return type was 'struct X { int a,
int b}':
mmx():
        movl    4(%esp), %eax
        movl    $55, (%eax)
        movl    $0, 4(%eax)
        ret     $4

clang -O2 -mno-mmx -m32 treats it as if the return type were 'long long':
mmx():                                # @mmx()
        movl    $55, %eax
        xorl    %edx, %edx
        retl