Closed fltt closed 7 years ago
Can you quickly remind me what will happen on CPUs which don't support that feature, and/or point out since when all CPUs do support that feature.
This came out when I tried to follow the instructions of the "Rumprun unikernel video series" part 1 -- when running the mpg123
unikernel on my 32 bit FreeBSD box I got an endless stream of "trap: #UD" on the screen.
Actually the problem is not in the CPU supporting that feature but in the OS supporting it. I don't know why, but to use the SSE instructions the OS must tell the CPU that it (the OS) has support for the FXSAVE and FXRSTOR instructions. From Intel's "Intel® 64 and IA-32 Architectures Software Developer's Manual -- Volume 3A: System Programming Guide, Part 1", section 2.5 "Control Registers":
OSFXSR Operating System Support for FXSAVE and FXRSTOR instructions (bit 9 of CR4) -- When set, this flag: (1) indicates to software that the operating system supports the use of the FXSAVE and FXRSTOR instructions, (2) enables the FXSAVE and FXRSTOR instructions to save and restore the contents of the XMM and MXCSR registers along with the contents of the x87 FPU and MMX registers, and (3) enables the processor to execute SSE/SSE2/SSE3/SSSE3/SSE4 instructions, with the exception of the PAUSE, PREFETCHh, SFENCE, LFENCE, MFENCE, MOVNTI, CLFLUSH, CRC32, and POPCNT. If this flag is clear, the FXSAVE and FXRSTOR instructions will save and restore the contents of the x87 FPU and MMX instructions, but they may not save and restore the contents of the XMM and MXCSR registers. Also, the processor will generate an invalid opcode exception (#UD) if it attempts to execute any SSE/SSE2/SSE3and instruction, with the exception of PAUSE, PREFETCHh, SFENCE, LFENCE, MFENCE, MOVNTI, CLFLUSH, CRC32, and POPCNT. The operating system or executive must explicitly set this flag.
I understand that the OS needs to enable the instructions -- I did explicitly change that on x86_64 precisely for mpg123 (*).
However, while we can assume that all x86_64 CPUs support MMX/SSE, the question was what happens on 32bit x86 for CPUs which do not support those instructions. If it crashes on boot for all applications, that's a regression in the OS. If it crashes when an application tries to use instructions not available on the current CPU, that's an application problem.
What if I check for SSE support before setting those flags?
/* check if the cpuid instruction is available */
pushfl
popl %eax
movl %eax, %ecx
xorl $0x200000, %eax
pushl %eax
popfl
pushfl
popl %eax
xorl %ecx, %eax
je no_sse
/* cpuid available, now test the sse feature flag */
movl $1, %eax
cpuid
test $0x2000000, %edx
jz no_sse
/* enable sse */
movl %cr4, %eax
orl $0x600, %eax
movl %eax, %cr4
no_sse:
That's what a canonical OS does. However, do we need to do it for either of the reasons I mentioned earlier?
Also, note that x86_cpuid() already uses cpuid unconditionally, so you might as well skip the "is cpuid supported" part. According to wikipedia, cpuid is supported even in later 486's, and there are probably other assumptions in the 32bit code which prevent from running on pre-Pentium.
But, since the code checking for SSE is so short, we could just do that and not think about it.
Bonus points if you want to define some of the magic numbers in headers ;-)
I put the CPUID check on purpose, to tease you. :-) Infact, if you were asking me if we should care about supporting CPUs older than the Pentium III (which was the first one to include the SSE instructions in 1999) I would have said -- is this proper english? -- that the chances to see the Rumprun kernel run on such relics is so near to zero that nobody could tell the difference.
However, for I don't known if all the models that followed the Pentium III do support the SSE instructions, I agree with you to add the SSE check (not the CPUID's) and stop thinking about it. And consider giving those magic numbers proper names done.
There might be some embedded x86 CPUs where the lack of SSE is relevant. Anyway, good plan.
I cherry-picked the awk fix. Seems like a useful thing to have regardless of when you get around to revising the SSE patch.
I've put the CPUID and CRx values and flags in a new file: arch/x86/flags.h Tell me if you prefer those defines to be put somewhere else.
reg.h
Also, I'd name x86_64 long mode enable something else than IA32_ENABLE
Finally, comment is out of date: s/cpuid available, now //
Thanks for the 0x20 -> ' ' ;-)
It's not IA32_ENABLE but IA32E_ENABLE: IA32E is Intel's name for long mode. I admit that some comment here and there would be handy. What if I add something like:
/* Extended Feature Enable Register MSR */
#define IA32_EFER 0xc0000080
/* Long Mode Enable */
#define IA32_EFER_IA32E_ENABLE_MASK 0x00000100
to the EFER defines and something similar to the other defines too?
OK for the other points.
Intel didn't invent long mode ;-)
But, anyway, I did ask you to name the constants, so maybe it's easier if we just go with the names you chose.
Generally, you seem to know which way to hold an editor (hope it's vi!). If you have future plans for Rumprun, I could give you push access, so we can avoid more naming discussions...
I didn't say Intel invented it -- they just have their own nomenclature.
At the moment I don't have future plan. I think the unikernel (whether Rumprun or others) has some place in the future of the cloud computing and as a software engineer worth his salt I cannot ignore it.
(sorry, I'm mostly an Emacs user but I don't dislike vi)
Ok, adjust your branch for the non-naming parts, and I'll merge it.
I changed my mind -- after all, I don't like Intel's nomenclature. I've also shortened the names.
Ok. Send another pull req if you change your mind again ;-)
Thank you!
This pull request fixes two unrelated issues. The first is a problem with awk: if the current locale defines as the decimal dot something different from a dot (e.g., a comma),
build-rr.sh
fails to parse gcc and ld's version numbers.The second issue is that at boot CR4's OSFXSR flag is left cleared, preventing the Rumprun unikernel from using the floating point routines contained in the compiler_rt library, when compiled for the 32 bit i386 architecture. I've also enabled the OSXMMEXCPT flag (to keep the boot code in sync with the 64 bit version).