With the -freciprocal-math (and -funsafe-math-optimizations) flags, the compiler can try harder to avoid dependent FSQRT and FDIV operations. For example:
double res, res2, tmp;

void foo (double a, double b, int c, int d) {
  tmp = 1.0 / __builtin_sqrt (a);
  res = tmp * tmp;
  if (d)
    res2 = a * tmp;
}
Compare what AArch64 LLVM generates at -Ofast with what GCC can do at -Ofast: https://godbolt.org/z/717f54Teo
Notice how the expensive FSQRT and FDIV are now independent and can execute in parallel. A write-up of the transformation can be found in the GCC commit: http://gcc.gnu.org/g:24c49431499bcb462aeee41e027a3dac25e934b3