Open RKSimon opened 5 years ago
I believe that this proposal is not worth doing -- instead, the way to go is to start deprecating and removing x86mmx support from llvm.
Starting, first, by flipping the Clang mm* builtins to use SSE2 instead of MMX for llvm/llvm-bugzilla-archive#42320 , and culminating eventually in removing the x86mmx type from LLVM IR entirely.
Which leaves assembly (either inline or out-of-line) as the only way to generate MMX instructions in LLVM. And since such currently-existing assembly doesn't annotate itself in any way to indicate that it clobbers the shared x87/MMX register state, there's no way for the compiler to insert emms automatically.
Naive question maybe, but I want to understand this better:
add:
pushl %eax
movd %mm0, %eax
movl %eax, (%esp)
fildl (%esp) <-- can't use x87 instruction until MMX state is cleared
What does "MMX state is cleared" mean? We've moved %mm0 into %eax at this point and won't use it again, so it doesn't matter that we clobber it. I thought the problem was just that the x87 and mmx registers are aliased. Is there more to it that makes this not work?
(I think I understand now, see https://reviews.llvm.org/D59744#1550326)
@Simon
Ideally we need something like the X86VZeroUpper pass which can recognise when MMX/X87 instructions have been used, insert EMMS/FEMMS instructions where appropriate and ensure that MMX/X87 ops don't cross the barrier
It would be interesting for Rust to be able to use such a pass to prevent errors related to missing EMMS/FEMMS instructions. When using the x86_mmx type on x86_64 targets with SSE enabled, MMX registers and intrinsics are still used, so EMMS/FEMMS still needs to be inserted appropriately. It would be helpful if such a pass also worked there, and only inserted EMMS/FEMMS instructions if MMX registers are actually used, so that one only pays the cost of the EMMS/FEMMS instructions when explicitly using the MMX intrinsics (which most people don't do when targeting SSE targets).
Codegen showing the need to insert _mm_empty() https://godbolt.org/z/zh3R1b
#include <x86intrin.h>

// BAD: mixes MMX + x87 states
float add(__v2si x, float y) {
    return (float)x[0] + y;
}

// GOOD: EMMS separates MMX + x87 states
float add_safe(__v2si x, float y) {
    int i = x[0];
    _mm_empty();
    return (float)i + y;
}
add:
    pushl %eax
    movd %mm0, %eax
    movl %eax, (%esp)
    fildl (%esp) <-- can't use x87 instruction until MMX state is cleared
    fadds 8(%esp)
    popl %eax
    retl

add_safe:
    pushl %eax
    movd %mm0, %eax
    emms
    movl %eax, (%esp)
    fildl (%esp)
    fadds 8(%esp)
    popl %eax
    retl
Rust has long since removed its support for MMX intrinsics. My understanding is that the plan to torch anything but assembly support in LLVM for MMX has taken effect, so I believe this issue and possibly the associated MMX issues can be closed.
Extended Description
As discussed on https://reviews.llvm.org/D59744, we currently have no way to automatically separate MMX and x87 instructions - we rely on manual insertion of _mm_empty() (EMMS) intrinsics.
Ideally we need something like the X86VZeroUpper pass which can recognise when MMX/X87 instructions have been used, insert EMMS/FEMMS instructions where appropriate and ensure that MMX/X87 ops don't cross the barrier (see Bug #35982).
Given the high cost of EMMS, we may want to make this pass opt-in - for example only enable it by default for the i386 ABI change (see Bug #41029).