Closed LazyKernel closed 2 years ago
Fast math is way too dangerous, and probably not how people usually compile their applications.
Mandatory reminder that in general `-ffast-math` is bad:
In principle you should be able to achieve the same effect with the less blunt `-fassociative-math`, but GCC doesn't seem to like this flag for me:
% gcc leibniz.c -o leibniz -O3 -s -static -flto -fassociative-math -march=native -mtune=native -fomit-frame-pointer
cc1: warning: ‘-fassociative-math’ disabled; other options take precedence
lto1: warning: ‘-fassociative-math’ disabled; other options take precedence
lto1: warning: ‘-fassociative-math’ disabled; other options take precedence
Clang, on the other hand, accepts the flag, but doesn't seem to actually vectorise this code. The directive `#pragma clang fp reassociate(on)` would work without any additional command-line flag in Clang, but I didn't find an equivalent pragma directive for GCC.
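For reference, a minimal sketch of where that pragma could go. The function name `leibniz_pi` and the loop shape are my own assumptions, not the repository's code, and the `__clang__` guard is only there so that GCC does not warn about an unknown pragma. Whether Clang then actually vectorises the loop will depend on the compiler version and the other flags in use.

```c
/* Hypothetical helper for illustration; not the benchmark's actual source. */
double leibniz_pi(unsigned rounds)
{
#if defined(__clang__)
    /* Clang only: allow floating-point operations in this block to be
     * reassociated. The pragma must come first in the compound statement. */
    #pragma clang fp reassociate(on)
#endif
    double sum = 0.0;
    for (unsigned i = 0; i < rounds; i++) {
        /* The sign is derived from the index, so the only value carried
         * from one iteration to the next is the running sum. */
        double sign = (i & 1u) ? -1.0 : 1.0;
        sum += sign / (2.0 * i + 1.0);
    }
    return 4.0 * sum;
}
```

Calling `leibniz_pi(1000000u)` from a `main` function and printing the result should give a value close to pi; the exact trailing digits may change once reassociation is allowed.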
Good point. I was not aware of the multitude of issues `-ffast-math` brings with it. Using just `-fassociative-math` seems like a much better option. For GCC, though, we also need to enable `-fno-signed-zeros` and `-fno-trapping-math` (as per the GCC manual). The combination `-fassociative-math -fno-signed-zeros -fno-trapping-math` produces practically the same result as above, with fewer downsides as far as I can see.
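To illustrate why the compiler needs explicit permission here: a vectorised reduction adds the terms in a different grouping than the source code spells out, and floating-point addition is not associative in general. The sketch below is only an illustration with made-up function names (`sum_in_order`, `sum_four_lanes`); it writes out by hand the kind of regrouping a 4-wide SIMD reduction would perform, which the compiler may only do on its own when reassociation is allowed.

```c
#include <stddef.h>

/* Strict left-to-right summation: every addition depends on the previous
 * one, which is the order the C source prescribes. */
double sum_in_order(const double *v, size_t n)
{
    double s = 0.0;
    for (size_t i = 0; i < n; i++)
        s += v[i];
    return s;
}

/* Four independent partial sums that are combined at the end. This is a
 * different grouping of the same additions, so for general inputs the
 * result can differ from sum_in_order in the last bits. */
double sum_four_lanes(const double *v, size_t n)
{
    double lane[4] = {0.0, 0.0, 0.0, 0.0};
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        lane[0] += v[i];
        lane[1] += v[i + 1];
        lane[2] += v[i + 2];
        lane[3] += v[i + 3];
    }
    double s = (lane[0] + lane[1]) + (lane[2] + lane[3]);
    /* Handle the leftover elements when n is not a multiple of 4. */
    for (; i < n; i++)
        s += v[i];
    return s;
}
```

For inputs that are exactly representable and small, such as the integers 1 to 9, both functions return the same value; for sums like the Leibniz series, only the trailing digits would be expected to differ.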
@LazyKernel I had to update it due to #66
Thank you very much for your contribution @LazyKernel 👍
Adding the `-ffast-math` compilation option and moving the calculation of variable `x` inside the scope of the for-loop allows gcc to vectorize the calculation for C and C++. This results in a significant speedup on my machine, going from around 100ms to 50ms. The change does not significantly affect the accuracy of the result. Analyses below.

C
#### Original

```json
{
  "Started": "18-10-2022 19:33:20",
  "Ended": "18-10-2022 19:33:20",
  "Language": "C (gcc)",
  "Version": "11.2.1",
  "Command": "./leibniz",
  "Accuracy": 0.7222222222222222,
  "CalculatedPi": "3.1415926635893259",
  "Iterations": 3,
  "Average": "93.971662ms"
}
```

#### Proposed

```json
{
  "Started": "18-10-2022 19:56:48",
  "Ended": "18-10-2022 19:56:48",
  "Language": "C (gcc)",
  "Version": "11.2.1",
  "Command": "./leibniz",
  "Accuracy": 0.7222222222222222,
  "CalculatedPi": "3.1415926635857296",
  "Iterations": 3,
  "Average": "48.815929ms"
}
```

C++

#### Original

```json
{
  "Started": "18-10-2022 19:57:40",
  "Ended": "18-10-2022 19:57:41",
  "Language": "C++ (g++)",
  "Version": "11.2.1",
  "Command": "./leibniz",
  "Accuracy": 0.7222222222222222,
  "CalculatedPi": "3.1415926635893259",
  "Iterations": 3,
  "Average": "96.320308ms"
}
```

#### Proposed

```json
{
  "Started": "18-10-2022 19:59:43",
  "Ended": "18-10-2022 19:59:43",
  "Language": "C++ (g++)",
  "Version": "11.2.1",
  "Command": "./leibniz",
  "Accuracy": 0.7222222222222222,
  "CalculatedPi": "3.1415926635857296",
  "Iterations": 3,
  "Average": "49.448502ms"
}
```
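The loop change mentioned in the description (computing `x` inside the for-loop instead of carrying it across iterations) can be sketched roughly as follows. This is a guess at the shape of the change, not the code from the pull request: the function names, the loop bounds, and the way the sign is derived from the index are all assumptions made for illustration.

```c
/* Variant 1 (assumed shape of the original): x is flipped on every
 * iteration, so each iteration depends on the x left by the previous one. */
double leibniz_carried(unsigned rounds)
{
    double x = -1.0;
    double sum = 0.0;
    for (unsigned i = 0; i < rounds; i++) {
        x *= -1.0;                    /* loop-carried dependency on x */
        sum += x / (2.0 * i + 1.0);
    }
    return 4.0 * sum;
}

/* Variant 2 (assumed shape of the proposal): x is local to the loop body
 * and computed from i alone, which leaves the sum as the only value
 * carried between iterations. */
double leibniz_local(unsigned rounds)
{
    double sum = 0.0;
    for (unsigned i = 0; i < rounds; i++) {
        double x = (i & 1u) ? -1.0 : 1.0;   /* +1 for even i, -1 for odd i */
        sum += x / (2.0 * i + 1.0);
    }
    return 4.0 * sum;
}
```

Both variants add the same terms in the same order, so without reassociation they should agree; the second one simply removes the dependency on `x` that would otherwise stand in the way of vectorising the loop.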