rust-lang / stdarch

Rust's standard library vendor-specific APIs and run-time feature detection
https://doc.rust-lang.org/stable/core/arch/
Apache License 2.0

x86_64 MXCSR denormals are zero bit: add constant #852

Open b-jonas0 opened 4 years ago

b-jonas0 commented 4 years ago

On x86_64, the SSE floating point control register (MXCSR) has two bits concerned with denormal floating point values. The purpose of these bits is to enable a mode that avoids slowdowns from calculations with denormal numbers at the cost of getting incorrect results from underflows.

The more important one is the flush-to-zero bit (bit 15 in the register). When that bit is set and a floating-point arithmetic instruction would output a denormal number (and would not raise an unmasked exception), the instruction outputs zero instead (and the denormal exception that it would normally flag is suppressed). The consts for this bit in the core::arch::x86_64 module are _MM_FLUSH_ZERO_ON, _MM_FLUSH_ZERO_OFF, and _MM_FLUSH_ZERO_MASK.

The less important bit is the denormals-are-zero bit (bit 6 in the register). That bit affects the input arguments of floating-point arithmetic instructions, rather than the outputs. When the bit is set and a floating-point arithmetic instruction has a denormal number in a source argument, the instruction behaves as if that number were zero. On x86_32, setting this bit is only conditionally supported, because old CPUs didn't have this mode. Testing and clearing the bit is always supported if the MXCSR register exists.
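To make the input-versus-output distinction concrete, here is a minimal software sketch of the two modes for a single f32 multiplication. It is only a model: the function names (`daz_input`, `ftz_output`, `mul_model`) are made up for illustration, it never touches MXCSR, and it ignores exception flags and the exact point at which the hardware detects a tiny result.

```rust
/// Model of the denormals-are-zero rule: a denormal *operand* is read as a
/// zero of the same sign. (Hypothetical helper, not part of core::arch.)
fn daz_input(x: f32) -> f32 {
    if x.is_subnormal() { 0.0_f32.copysign(x) } else { x }
}

/// Model of the flush-to-zero rule: a denormal *result* is replaced by a
/// zero of the same sign. (Hypothetical helper, not part of core::arch.)
fn ftz_output(r: f32) -> f32 {
    if r.is_subnormal() { 0.0_f32.copysign(r) } else { r }
}

/// Multiply two values under the modelled modes. `daz` and `ftz` stand in
/// for MXCSR bits 6 and 15 respectively.
fn mul_model(a: f32, b: f32, daz: bool, ftz: bool) -> f32 {
    let (a, b) = if daz { (daz_input(a), daz_input(b)) } else { (a, b) };
    let r = a * b;
    if ftz { ftz_output(r) } else { r }
}
```

In this model, multiplying the smallest denormal (`f32::from_bits(1)`) by 2^30 gives a normal result, so only the `daz` flag changes the answer (to zero). Multiplying `f32::MIN_POSITIVE` by 0.5 has normal inputs and a denormal result, so only the `ftz` flag changes the answer.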

The crate does not have consts for the denormals are zero bit. This is probably an oversight, and this ticket asks to correct it. I suggest the following names, based on Intel's C interface, but I don't insist on them.

pub const _MM_DENORMALS_ZERO_MASK: u32 = 0x0040;
pub const _MM_DENORMALS_ZERO_ON: u32 = 0x0040;
pub const _MM_DENORMALS_ZERO_OFF: u32 = 0x0000;

(Please double-check the above values before committing.)
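As a way of double-checking, the sketch below ties the proposed values to the bit positions given above (bit 6 and bit 15) with compile-time assertions. The constants are redeclared locally so the snippet compiles on any target. The two helper functions are hypothetical, operate on a plain u32 rather than on the real register, and are not an existing core::arch API.

```rust
// Values proposed in this issue, redeclared locally for the sketch.
pub const _MM_DENORMALS_ZERO_MASK: u32 = 0x0040;
pub const _MM_DENORMALS_ZERO_ON: u32 = 0x0040;
pub const _MM_DENORMALS_ZERO_OFF: u32 = 0x0000;

// Local copies of the flush-to-zero values, for comparison.
pub const _MM_FLUSH_ZERO_MASK: u32 = 0x8000;
pub const _MM_FLUSH_ZERO_ON: u32 = 0x8000;

// Compile-time checks against the bit positions stated in the issue text.
const _: () = assert!(_MM_DENORMALS_ZERO_MASK == 1 << 6);
const _: () = assert!(_MM_DENORMALS_ZERO_ON == _MM_DENORMALS_ZERO_MASK);
const _: () = assert!(_MM_FLUSH_ZERO_MASK == 1 << 15);

/// Hypothetical helper: extract the denormals-are-zero mode from an MXCSR value.
pub fn get_denormals_zero_mode(csr: u32) -> u32 {
    csr & _MM_DENORMALS_ZERO_MASK
}

/// Hypothetical helper: return `csr` with the denormals-are-zero mode replaced
/// by `mode` (either _MM_DENORMALS_ZERO_ON or _MM_DENORMALS_ZERO_OFF).
pub fn set_denormals_zero_mode(csr: u32, mode: u32) -> u32 {
    (csr & !_MM_DENORMALS_ZERO_MASK) | (mode & _MM_DENORMALS_ZERO_MASK)
}
```

Starting from the power-on default MXCSR value 0x1F80, turning the mode on with this helper yields 0x1FC0, and turning it off again restores 0x1F80.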

The C interface also has convenience macros for getting and setting the bit, so you may add those too. Personally I think they're superfluous, because functions that access this bit will most likely set or clear it together with the flush-to-zero bit, e.g.

_mm_setcsr(_mm_getcsr() | _MM_FLUSH_ZERO_ON | _MM_DENORMALS_ZERO_ON);
// XMM floating-point arithmetic computations here
_mm_setcsr(_mm_getcsr() & !_MM_FLUSH_ZERO_MASK & !_MM_DENORMALS_ZERO_MASK);

Amanieu commented 4 years ago

I'm happy to accept a PR adding all of these. A good starting point would be here where all the other _MM_* constants and helper functions are defined. You'll also need to update the documentation on _mm_setcsr.

bjorn3 commented 4 years ago

Doesn't LLVM assume that it works with the default floating point environment?

matthiascy commented 2 years ago

What is the current status? I couldn't find the DAZ flag in the code.

Amanieu commented 2 years ago

The current status is still:

I'm happy to accept a PR adding all of these. A good starting point would be here where all the other _MM_* constants and helper functions are defined. You'll also need to update the documentation on _mm_setcsr.