Open Quuxplusone opened 12 years ago
Attached jacobi_1d.DenormalsAreZero.c
(709 bytes, text/x-csrc): Test case
This is a neat trick. GCC links in crtfastmath.o (part of libgcc) which sets the necessary bits. We should do the same in the clang driver and provide crtfastmath.o with compiler-rt.
r165240 makes clang link crtfastmath.o if it's available (only on Linux for now).
The pmmintrin.h header for SSE3 (included in clang) has the macro _MM_SET_DENORMALS_ZERO_MODE that sets DAZ. It doesn't require any SSE3 functionality to do so, only _mm_getcsr and _mm_setcsr, which are part of basic SSE. This might be an alternative when crtfastmath.o is not available (e.g. on Mac OS X).
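For illustration, here is a minimal sketch of setting DAZ with nothing but the basic SSE intrinsics mentioned above. The helper names and the MXCSR_DAZ_BIT macro are made up for this example. It assumes an x86 target with SSE and a CPU that supports the DAZ bit (bit 6 of MXCSR), which should hold for x86-64.

```c
#include <xmmintrin.h>  /* _mm_getcsr, _mm_setcsr (basic SSE, no SSE3 needed) */

/* Hypothetical name for MXCSR bit 6, the denormals-are-zero (DAZ) flag. */
#define MXCSR_DAZ_BIT 0x0040u

/* Turn DAZ on for the calling thread and return the previous MXCSR value,
   so the caller can restore it later with _mm_setcsr(). */
static unsigned int enable_daz(void) {
    unsigned int old = _mm_getcsr();
    _mm_setcsr(old | MXCSR_DAZ_BIT);
    return old;
}

/* Return nonzero if DAZ is currently set in MXCSR. */
static int daz_enabled(void) {
    return (_mm_getcsr() & MXCSR_DAZ_BIT) != 0;
}
```

MXCSR is per-thread state, so a call like enable_daz() only affects the thread that makes it. Threads created afterwards typically inherit the creating thread's value, but that depends on the platform.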
Should FTZ also be set if, e.g., -fno-signed-zeros is enabled (e.g. via -ffast-math)?
It depends on the target, but crtfastmath.o should set both DAZ and FTZ, e.g. on x86-64.
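To make the startup-object idea concrete, here is a rough sketch of what such an object could do on x86-64. It is not the actual crtfastmath source. The function and macro names are hypothetical, and it assumes a compiler that supports __attribute__((constructor)), as GCC and clang do.

```c
#include <xmmintrin.h>  /* _mm_getcsr, _mm_setcsr */

/* Hypothetical names for the two MXCSR control bits. */
#define MXCSR_FTZ 0x8000u  /* bit 15: flush-to-zero (denormal results become 0) */
#define MXCSR_DAZ 0x0040u  /* bit 6: denormals-are-zero (denormal inputs read as 0) */

/* Runs before main(), the way code in a linked-in startup object would. */
__attribute__((constructor))
static void set_fast_math_sketch(void) {
    _mm_setcsr(_mm_getcsr() | MXCSR_FTZ | MXCSR_DAZ);
}

/* Return nonzero if both FTZ and DAZ are set in the current thread's MXCSR. */
static int ftz_and_daz_set(void) {
    unsigned int want = MXCSR_FTZ | MXCSR_DAZ;
    return (_mm_getcsr() & want) == want;
}
```

Because the constructor runs at program startup, the flags are already in place by the time main() starts, without any change to user code.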
NoSignedZeros is orthogonal to DAZ and FTZ, except maybe for AMDGPU (I think, not sure) and other targets that have special flushing modes.
Also, Andy Kaylor just suggested an LLVM-specific way to set these flags at runtime through compiler-rt.
I believe this should set the right ftz/daz flags since fa7cd549d604bfd8f9dce5d649a19720cbc39cca