llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org

[x86] Incorrect 32-bit ABI for vector_size(8) #62994

Open easyaspi314 opened 1 year ago

easyaspi314 commented 1 year ago

According to the SysV ABI, 8-byte vectors are to be passed in mm0-mm2 and returned in mm0, and this is the only scenario in which a function is entered and/or returns in MMX mode.

Currently it seems to be quite random:

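| base type       | simd | pass                   | return  |
|-----------------|------|------------------------|---------|
| float           | none | stack                  | st0:st1 |
| float           | sse  | xmm0-2                 | xmm0    |
| double          | any  | stack                  | st0     |
| int64, __m64    | any  | stack                  | eax:edx |
| int < 64        | none | stack (bitcast to i64) | stack   |
| int < 64        | sse2 | stack (bitcast to i64) | xmm0    |
| LLVM int < 64   | none | stack                  | stack   |
| LLVM int < 64   | sse2 | xmm0-2                 | xmm0    |
| LLVM x86_mmx    | mmx  | mm0-2                  | mm0     |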
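
For anyone who wants to check the rows above, here is a minimal reproducer sketch. The type and function names (`v2f32`, `pass_v2i32`, and so on) are hypothetical and chosen only for illustration. It assumes a compiler with the GNU `vector_size` extension (GCC or Clang).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 8-byte vector types, one per base type of interest. */
typedef float   v2f32 __attribute__((vector_size(8)));
typedef int32_t v2i32 __attribute__((vector_size(8)));
typedef int64_t v1i64 __attribute__((vector_size(8))); /* same shape as __m64 */

/* Every one of these is exactly 8 bytes, so they all fall under the
 * 8-byte vector rule discussed in this issue. */
static_assert(sizeof(v2f32) == 8, "v2f32 must be 8 bytes");
static_assert(sizeof(v2i32) == 8, "v2i32 must be 8 bytes");
static_assert(sizeof(v1i64) == 8, "v1i64 must be 8 bytes");

/* noinline keeps a real call in the output, so the argument and return
 * locations are visible in the generated assembly. */
__attribute__((noinline)) v2f32 pass_v2f32(v2f32 a, v2f32 b) { return a + b; }
__attribute__((noinline)) v2i32 pass_v2i32(v2i32 a, v2i32 b) { return a + b; }
__attribute__((noinline)) v1i64 pass_v1i64(v1i64 a, v1i64 b) { return a + b; }
```

Compiling this with something like `clang -m32 -O2 -S` and toggling `-mno-sse`, `-msse2`, and `-mmmx` should show where each argument and return value ends up. The arithmetic itself is the same on any target, so only the register assignment is expected to differ.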

Of course this would need to be handled carefully since EMMS is required.

A simple, if somewhat inefficient, "set it and forget it" solution is to treat the function as if it took XMM registers and hardcode a prologue and epilogue that convert between the MMX and XMM registers. Perhaps this could be optimized out if there is a direct bitcast to/from x86_mmx.

caller:
    ...
    movdq2q mm0, xmm0
    movdq2q mm1, xmm1
    call   callee
    movq2dq xmm0, mm0
    emms
    ...
callee:
    movq2dq xmm0, mm0
    movq2dq xmm1, mm1
    emms
    ...
    movdq2q mm0, xmm0
    ret
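
The same prologue/epilogue idea can be modelled at the C level with intrinsics. This is only a sketch, assuming an x86 target with SSE2 enabled. `callee_add` and `caller_add` are hypothetical names, and the compiler remains free to pick its own registers for `__m64` values, so it illustrates the conversion steps rather than the actual ABI.

```c
#include <assert.h>
#include <emmintrin.h> /* SSE2: _mm_movpi64_epi64, _mm_movepi64_pi64 */
#include <mmintrin.h>  /* MMX: __m64, _mm_set_pi32, _mm_empty */
#include <stdint.h>

static_assert(sizeof(__m64) == 8, "__m64 is an 8-byte vector");

/* Hypothetical callee: receives and returns 8-byte vectors as __m64,
 * but does all of its work in XMM registers. */
static __m64 callee_add(__m64 a, __m64 b) {
    /* Prologue: movq2dq xmm, mm for each argument, then leave MMX mode. */
    __m128i xa = _mm_movpi64_epi64(a);
    __m128i xb = _mm_movpi64_epi64(b);
    _mm_empty(); /* emms */

    __m128i sum = _mm_add_epi32(xa, xb); /* body works on XMM only */

    /* Epilogue: movdq2q mm, xmm to hand the result back as __m64. */
    return _mm_movepi64_pi64(sum);
}

/* Hypothetical caller: builds the arguments, calls, converts the result
 * back to XMM, and restores the x87 state. */
static void caller_add(const int32_t a[2], const int32_t b[2], int32_t out[2]) {
    /* _mm_set_pi32 takes the high element first, then the low element. */
    __m64 r = callee_add(_mm_set_pi32(a[1], a[0]), _mm_set_pi32(b[1], b[0]));

    __m128i xr = _mm_movpi64_epi64(r); /* movq2dq xmm0, mm0 */
    _mm_empty();                       /* emms */

    _mm_storel_epi64((__m128i *)out, xr); /* store the low 64 bits */
}
```

Calling `caller_add` on two pairs of 32-bit integers should give their element-wise sum in `out`. No MMX value stays live past the `_mm_empty()` in the caller.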

llvmbot commented 1 year ago

@llvm/issue-subscribers-backend-x86

easyaspi314 commented 1 year ago

So apparently MMX register passing is implemented, but only for the x86_mmx type, which has no natural counterpart in C, since __m64 is encoded as long long vector_size(8).

However, both GCC and ICC pass vector_size(8) types in MMX registers (although they have their own issues with MMX state switching).