Open nickdesaulniers opened 5 years ago
Why does the kernel specifically disable sse2 but not sse?
The kernel disables both in most cases; some drivers may use floating point if they explicitly call kernel_fpu_begin()/kernel_fpu_end() and are compiled with -msse re-added. For sse2, I guess the kernel is not guaranteed to be running on an x86_64 with sse2 (i.e., it could run on an x86_64 CPU without sse2, so don't assume it's there for the purposes of code generation). Sounds like the x86_64 kernel requires sse but not sse2?
I can't think of any x86-64 hardware that did not at least have sse2; it was present in all the early 64-bit microarchitectures (AMD Hammer/Opteron, Intel Prescott/Pentium 4, Intel Bonnell/Atom, VIA Isaiah). There are also several crypto algorithms in arch/x86/crypto/ that use sse2 (or later).
Generally speaking, though, there should be no floating point operations in the kernel; the drivers that use MMX/SSE/SSE2 tend to do so only for vector integer math. The one exception that I know of is a part of the amdgpu driver, which should probably be changed not to do so.
What does gcc do with -mno-sse2 -mno-80387?

    foo:
        subq $8, %rsp
        call __floatunsidf
        addq $8, %rsp
        ret
Makes sense; assume no hardware support for xmm*, call into soft-fp routines.
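To illustrate what a soft-fp routine such as __floatunsidf has to do when neither SSE2 nor x87 can be used, here is a minimal sketch that builds the double's bit pattern with integer operations only. It assumes IEEE-754 binary64 doubles; soft_floatunsidf_bits and soft_floatunsidf are hypothetical names for illustration, not the actual libgcc or compiler-rt implementation.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not the real libgcc/compiler-rt routine):
 * returns the IEEE-754 binary64 bit pattern for an unsigned 32-bit
 * integer, using integer operations only. */
static uint64_t soft_floatunsidf_bits(uint32_t i)
{
    if (i == 0)
        return 0; /* all-zero bits encode +0.0 */

    /* Find the index of the most significant set bit (0..31). */
    int msb = 31;
    while ((i >> msb) == 0)
        msb--;

    /* Biased exponent: binary64 uses a bias of 1023. */
    uint64_t exponent = (uint64_t)(1023 + msb);

    /* Shift the leading 1 up to bit 52, then mask it off (it is implicit).
     * A 32-bit value always fits in the 52-bit mantissa, so no rounding. */
    uint64_t mantissa = ((uint64_t)i << (52 - msb)) & ((UINT64_C(1) << 52) - 1);

    return (exponent << 52) | mantissa;
}

/* Convenience wrapper for hosted testing: reinterprets the bits as a
 * double. Returning a double by value is exactly what needs an FP
 * register in the x86-64 ABI, so this wrapper is only for checking. */
static double soft_floatunsidf(uint32_t i)
{
    uint64_t bits = soft_floatunsidf_bits(i);
    double d;
    assert(sizeof d == sizeof bits); /* assumption: 64-bit double */
    memcpy(&d, &bits, sizeof d);
    return d;
}
```

On a normal hosted build, soft_floatunsidf(i) should compare equal to (double)i for every 32-bit input, since the conversion is exact.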
Extended Description
Consider:
    $ cat foo.c
    // double __floatunsidf (unsigned int i)
    double foo(unsigned int i) {
      return (double) i;
    }

    $ clang -O2 -mno-sse2 foo.c
    error: SSE2 register return with SSE2 disabled
    return (double) i;

    $ gcc -O2 -mno-sse2 foo.c && llvm-objdump -d foo.o
    foo:
        movl %edi, %edi
        movq %rdi, -8(%rsp)
        fildq -8(%rsp)
        fstpl -8(%rsp)
        movlps -8(%rsp), %xmm0
        ret
Adding -mno-80387 to the clang invocation changes the result again; clang then generates:
    foo:                    # @foo
        pushq %rax
        callq __floatunsidf
        popq %rcx
        retq
This is important because: