Open b-jonas0 opened 4 years ago
I'm happy to accept a PR adding all of these. A good starting point would be here, where all the other _MM_* constants and helper functions are defined. You'll also need to update the documentation on _mm_setcsr.
Doesn't LLVM assume that it works with the default floating point environment?
What is the current status? Couldn't find DAZ flag in the code.
On x86_64, the SSE floating point control register (MXCSR) has two bits concerned with denormal floating point values. The purpose of these bits is to enable a mode that avoids slowdowns from calculations with denormal numbers at the cost of getting incorrect results from underflows.
The more important one is the flush to zero bit (bit 15 in the register). When that bit is set and a floating-point arithmetic instruction would output a denormal number (and the underflow exception is masked, so no unmasked exception would be raised), the instruction outputs zero instead, with the sign of the true result; the underflow and precision exception flags are still set. The consts for this bit in the core::arch::x86_64 module are _MM_FLUSH_ZERO_ON, _MM_FLUSH_ZERO_OFF, _MM_FLUSH_ZERO_MASK.
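For illustration, here is a minimal sketch of how those three consts can be used to test or replace the flush to zero bit in an MXCSR value. The helper names `flush_to_zero_enabled` and `with_flush_zero_mode` are hypothetical, and the consts are redefined locally (with the same values they have in core::arch::x86_64) only so that the snippet compiles on any target.

```rust
// Local copies of the existing consts, redefined so the sketch stands alone.
const _MM_FLUSH_ZERO_MASK: u32 = 0x8000; // bit 15 of MXCSR
const _MM_FLUSH_ZERO_ON: u32 = 0x8000;
const _MM_FLUSH_ZERO_OFF: u32 = 0x0000;

/// Hypothetical helper: true if the flush to zero bit is set in `csr`.
fn flush_to_zero_enabled(csr: u32) -> bool {
    (csr & _MM_FLUSH_ZERO_MASK) == _MM_FLUSH_ZERO_ON
}

/// Hypothetical helper: returns `csr` with the flush to zero bit replaced by
/// `mode`, which should be _MM_FLUSH_ZERO_ON or _MM_FLUSH_ZERO_OFF.
fn with_flush_zero_mode(csr: u32, mode: u32) -> u32 {
    (csr & !_MM_FLUSH_ZERO_MASK) | (mode & _MM_FLUSH_ZERO_MASK)
}
```

Both helpers only do bit arithmetic on a plain `u32`; reading or writing the real register would still go through `_mm_getcsr` and `_mm_setcsr`.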
The less important bit is the denormals are zero bit (bit 6 in the register). That bit affects the input arguments of floating-point arithmetic instructions, rather than the outputs. When the bit is set and a floating-point arithmetic instruction has a denormal number in a source operand, the instruction behaves as if that number were zero. On x86_32, setting this bit is only conditionally supported, because old CPUs didn't have this mode. Testing and clearing the bit is always supported if the MXCSR register exists.
The crate does not have consts for the denormals are zero bit. This is probably an oversight, and this ticket asks to correct it. I suggest the following names, based on Intel's C interface, but I don't insist on them.
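A sketch of what the consts could look like, assuming the names mirror Intel's `_MM_DENORMALS_ZERO_*` macros, the `u32` type used by the existing `_MM_FLUSH_ZERO_*` consts, and the value 0x0040 for bit 6:

```rust
// Proposed consts, not existing items in core::arch.
// The values assume the denormals are zero bit is bit 6 of MXCSR.
pub const _MM_DENORMALS_ZERO_MASK: u32 = 0x0040;
pub const _MM_DENORMALS_ZERO_ON: u32 = 0x0040;
pub const _MM_DENORMALS_ZERO_OFF: u32 = 0x0000;
```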
(Please double-check the above values before committing.)
The C interface also has convenience macros for getting and setting the bit, so you may add those too. Personally I think they're superfluous, because functions that access this bit will most likely set or clear it together with the flush to zero bit, e.g.
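A rough sketch of such a function follows. The names `with_flush_modes` and `set_flush_modes` are hypothetical, and `_MM_DENORMALS_ZERO_MASK` is the proposed const with an assumed value of 0x0040, defined locally because it does not exist in the crate. Note that `_mm_getcsr` and `_mm_setcsr` are marked deprecated in recent Rust versions, and that changing MXCSR may conflict with the compiler's assumption of a default floating point environment, so treat this as an illustration rather than a recommended pattern.

```rust
// Existing flush to zero bit (bit 15) and the proposed denormals are zero bit
// (bit 6), defined locally so the sketch stands alone.
const _MM_FLUSH_ZERO_MASK: u32 = 0x8000;
const _MM_DENORMALS_ZERO_MASK: u32 = 0x0040; // assumed value, not an existing const

/// Hypothetical pure helper: set or clear both bits of an MXCSR value at once.
fn with_flush_modes(csr: u32, enable: bool) -> u32 {
    let bits = _MM_FLUSH_ZERO_MASK | _MM_DENORMALS_ZERO_MASK;
    if enable { csr | bits } else { csr & !bits }
}

/// Hypothetical wrapper that applies the change to the live register.
/// Unsafe because it alters the floating point environment of the thread.
#[cfg(target_arch = "x86_64")]
#[allow(deprecated, dead_code)]
unsafe fn set_flush_modes(enable: bool) {
    use core::arch::x86_64::{_mm_getcsr, _mm_setcsr};
    unsafe { _mm_setcsr(with_flush_modes(_mm_getcsr(), enable)) }
}
```

Keeping the bit arithmetic in a separate pure function makes it testable without touching the register, and it makes clear that a single read-modify-write covers both bits.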