Open RKSimon opened 5 years ago
I believe that this proposal is not worth doing -- instead, the way to go is to start deprecating and removing x86mmx support from llvm.
Starting, first, by flipping the Clang mm* builtins to use SSE2 instead of MMX for llvm/llvm-bugzilla-archive#42320 , and culminating eventually in removing the x86mmx type from LLVM IR entirely.
Which leaves assembly (either inline or out-of-line) as the only way to generate MMX instructions in LLVM. And since such currently-existing assembly doesn't annotate itself in any way to indicate that it clobbers the shared x87/MMX register state, there's no way for the compiler to insert emms automatically.
Naive question maybe, but I want to understand this better:
add:
pushl %eax
movd %mm0, %eax
movl %eax, (%esp)
fildl (%esp) <-- can't use x87 instruction until MMX state is cleared
What does "MMX state is cleared" mean? We've moved %mm0 into %eax at this point and won't use it again, so it doesn't matter that we clobber it. I thought the problem was just that the x87 and mmx registers are aliased. Is there more to it that makes this not work?
(I think I understand now, see https://reviews.llvm.org/D59744#1550326)
@Simon
Ideally we need something like the X86VZeroUpper pass which can recognise when MMX/X87 instructions have been used, insert EMMS/FEMMS instructions where appropriate and ensure that MMX/X87 ops don't cross the barrier
It would be interesting for Rust to be able to use such a pass to prevent errors related to missing EMMS/FEMMS instructions. When using the x86_mmx type on x86_64 targets with SSE enabled, MMX registers and intrinsics are still used, so EMMS/FEMMS still needs to be inserted appropriately. It would be helpful if such a pass also worked there, and only inserted EMMS/FEMMS instructions if MMX registers are actually used, so that one only pays the cost of the EMMS/FEMMS instructions when explicitly using the MMX intrinsics (which most people don't do when targeting SSE targets).
Codegen showing the need to insert _mm_empty() https://godbolt.org/z/zh3R1b
#include <x86intrin.h>

// BAD: mixes MMX + x87 states
float add(__v2si x, float y) {
    return (float)x[0] + y;
}

// GOOD: EMMS separates MMX + x87 states
float add_safe(__v2si x, float y) {
    int i = x[0];
    _mm_empty();
    return (float)i + y;
}
add:
    pushl %eax
    movd %mm0, %eax
    movl %eax, (%esp)
    fildl (%esp) <-- can't use x87 instruction until MMX state is cleared
    fadds 8(%esp)
    popl %eax
    retl

add_safe:
    pushl %eax
    movd %mm0, %eax
    emms
    movl %eax, (%esp)
    fildl (%esp)
    fadds 8(%esp)
    popl %eax
    retl
Rust has long since removed its support for MMX intrinsics. My understanding is that the plan to torch anything but assembly support in LLVM for MMX has taken effect, so I believe this issue and possibly the associated MMX issues can be closed.
Extended Description
As discussed on https://reviews.llvm.org/D59744, we currently have no way to automatically separate MMX and x87 instructions - we rely on manual insertion of _mm_empty() (EMMS) intrinsics.
Ideally we need something like the X86VZeroUpper pass which can recognise when MMX/X87 instructions have been used, insert EMMS/FEMMS instructions where appropriate and ensure that MMX/X87 ops don't cross the barrier (see Bug #35982).
Given the high cost of EMMS, we may want to make this pass opt-in - for example only enable it by default for the i386 ABI change (see Bug #41029).