unexpected failure for -mno-sse2

nickdesaulniers commented 5 years ago


Bugzilla Link	43334
Version	trunk
OS	Linux
Blocks	llvm/llvm-project#4440
CC	@arndb,@topperc,@RKSimon,@rotateright,@stephenhines

Extended Description

Consider:

$ cat foo.c
// double __floatunsidf (unsigned int i)
double foo(unsigned int i) {
  return (double) i;
}

$ clang -O2 -mno-sse2 foo.c
error: SSE2 register return with SSE2 disabled
return (double) i;

$ gcc -O2 -mno-sse2 foo.c && llvm-objdump -d foo.o
foo:
  movl %edi, %edi
  movq %rdi, -8(%rsp)
  fildq -8(%rsp)
  fstpl -8(%rsp)
  movlps -8(%rsp), %xmm0
  ret

Adding -mno-80387 to the clang invocation complicates things further; clang then generates:

foo: # @foo
  pushq %rax
  callq __floatunsidf
  popq %rcx
  retq

This is important because:

the kernel builds with -mno-sse2 -mno-80387 (and a few other related but less relevant flags: https://github.com/ClangBuiltLinux/linux/blob/52a5525214d0d612160154d902956eca0558b7c0/arch/x86/Makefile#L61-L62), since saving/restoring fp register state gets complicated/expensive for all registers added via ISA extensions.
the kernel does not define soft float routines like __floatunsidf. -mhard-float and -ffreestanding don't prevent clang from generating references to these soft float routines.
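
For readers unfamiliar with the soft float routines mentioned above, here is a rough sketch of the kind of work a helper like __floatunsidf has to do: build the IEEE-754 double bit pattern for an unsigned 32-bit value using only integer operations. The names soft_floatunsidf_bits and bits_to_double are hypothetical and exist only for illustration; this is not the libgcc or compiler-rt implementation, and it assumes double is IEEE-754 binary64.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a soft-float unsigned-int-to-double helper.
 * Returns the IEEE-754 binary64 bit pattern for i, using integer ops only. */
uint64_t soft_floatunsidf_bits(uint32_t i)
{
    if (i == 0)
        return 0; /* +0.0 is all zero bits */

    /* Index of the highest set bit (0..31). */
    int msb = 31;
    while (!(i & (UINT32_C(1) << msb)))
        msb--;

    /* Biased exponent: the value lies in [2^msb, 2^(msb+1)). */
    uint64_t exponent = (uint64_t)(1023 + msb);

    /* Move the leading 1 to bit 52, then mask it off (it is implicit).
     * A 32-bit value always fits in the 52-bit fraction, so no rounding. */
    uint64_t fraction = ((uint64_t)i << (52 - msb)) & ((UINT64_C(1) << 52) - 1);

    return (exponent << 52) | fraction;
}

/* Illustration only: reinterpret the bits as a double so the result can be
 * compared against a hardware conversion in a userspace test. */
double bits_to_double(uint64_t bits)
{
    double d;
    assert(sizeof d == sizeof bits); /* assumes a 64-bit double */
    memcpy(&d, &bits, sizeof d);
    return d;
}
```

In userspace, bits_to_double(soft_floatunsidf_bits(i)) should compare equal to (double) i for every 32-bit i. In a kernel build nothing provides such a routine, which is why a compiler-generated call to __floatunsidf ends up as an unresolved reference.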

9816046d-d175-4de2-95d1-ce389e5190e8 commented 5 years ago

Why does the kernel specifically disable sse2 but not sse?

The kernel disables both in most cases; some drivers may use floating point if they explicitly call kernel_fpu_begin()/kernel_fpu_end() and are compiled with -msse re-added. For sse2, I guess the kernel is not guaranteed to be running on an x86_64 with sse2 (i.e. it could run on x86_64 with no sse2, so don't assume it's there for the purposes of code generation). Sounds like the x86_64 kernel requires sse but not sse2?

I can't think of any x86-64 hardware that did not at least have sse2; it was present in all the early 64-bit microarchitectures (AMD Hammer/Opteron, Intel Prescott/Pentium 4, Intel Bonnell/Atom, VIA Isaiah). There are also several crypto algorithms in arch/x86/crypto/ that use sse2 (or later).

Generally speaking though, there should be no floating point operations in the kernel; the drivers that use MMX/SSE/SSE2 tend to do so only for vector integer math. The one exception that I know of is a part of the amdgpu driver, which should probably be changed not to do so.

nickdesaulniers commented 5 years ago

What does gcc do with -mno-sse2 -mno-80387?

foo:
  subq $8, %rsp
  call __floatunsidf
  addq $8, %rsp
  ret

Makes sense: assume no hardware support for xmm*, and call into soft-fp routines.

Why does the kernel specifically disable sse2 but not sse?

The kernel disables both in most cases; some drivers may use floating point if they explicitly call kernel_fpu_begin()/kernel_fpu_end() and are compiled with -msse re-added. For sse2, I guess the kernel is not guaranteed to be running on an x86_64 with sse2 (i.e. it could run on x86_64 with no sse2, so don't assume it's there for the purposes of code generation). Sounds like the x86_64 kernel requires sse but not sse2?

topperc commented 5 years ago

Why does the kernel specifically disable sse2 but not sse?

topperc commented 5 years ago

What does gcc do with -mno-sse2 -mno-80387?

unexpected failure for -mno-sse2 #42679