Vectorize c and c++ code

LazyKernel commented 2 years ago

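Adding the -ffast-math compilation option and moving the calculation of the variable x inside the scope of the for loop allows GCC to vectorize the calculation for both C and C++. On my machine this roughly halves the run time, from about 95 ms to about 49 ms. The change does not significantly affect the accuracy of the result: the reported accuracy is unchanged, and the calculated value of pi differs only from the 12th decimal place onward. The benchmark results for each language are listed under the C and C++ headings below.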
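As a rough illustration of the loop change described above (not the PR's actual diff, which is not shown in this thread), the sketch below contrasts a sign that is carried from one iteration to the next with a sign recomputed from the loop index. The function names `leibniz_carried_sign` and `leibniz_local_sign` are hypothetical, and the assumption is that removing the loop-carried dependency on x is what makes the loop easier for the compiler to vectorize.

```c
#include <stdint.h>

/* Hypothetical names; a sketch only, not the code from the PR. */

/* The sign x is updated from its previous value, so every iteration
 * depends on the one before it. */
double leibniz_carried_sign(uint32_t rounds)
{
    double pi = 1.0;
    double x = 1.0;
    for (uint32_t i = 2u; i < rounds + 2u; ++i) {
        x = -x;                         /* loop-carried dependency on x */
        pi += x / (2.0 * i - 1.0);
    }
    return pi * 4.0;
}

/* The sign x is computed inside the loop from the parity of i, so it
 * no longer depends on the previous iteration. */
double leibniz_local_sign(uint32_t rounds)
{
    double pi = 1.0;
    for (uint32_t i = 2u; i < rounds + 2u; ++i) {
        const double x = (i & 1u) ? 1.0 : -1.0;   /* even i: -1, odd i: +1 */
        pi += x / (2.0 * i - 1.0);
    }
    return pi * 4.0;
}
```

Both functions add the same terms in the same order, so without any floating-point relaxation flags they should agree. The remaining dependency is the accumulation into `pi`, which is where the reassociation flags discussed in this thread come in.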

C

#### Original

```json
{
  "Started": "18-10-2022 19:33:20",
  "Ended": "18-10-2022 19:33:20",
  "Language": "C (gcc)",
  "Version": "11.2.1",
  "Command": "./leibniz",
  "Accuracy": 0.7222222222222222,
  "CalculatedPi": "3.1415926635893259",
  "Iterations": 3,
  "Average": "93.971662ms"
}
```

#### Proposed

```json
{
  "Started": "18-10-2022 19:56:48",
  "Ended": "18-10-2022 19:56:48",
  "Language": "C (gcc)",
  "Version": "11.2.1",
  "Command": "./leibniz",
  "Accuracy": 0.7222222222222222,
  "CalculatedPi": "3.1415926635857296",
  "Iterations": 3,
  "Average": "48.815929ms"
}
```

C++

#### Original

```json
{
  "Started": "18-10-2022 19:57:40",
  "Ended": "18-10-2022 19:57:41",
  "Language": "C++ (g++)",
  "Version": "11.2.1",
  "Command": "./leibniz",
  "Accuracy": 0.7222222222222222,
  "CalculatedPi": "3.1415926635893259",
  "Iterations": 3,
  "Average": "96.320308ms"
}
```

#### Proposed

```json
{
  "Started": "18-10-2022 19:59:43",
  "Ended": "18-10-2022 19:59:43",
  "Language": "C++ (g++)",
  "Version": "11.2.1",
  "Command": "./leibniz",
  "Accuracy": 0.7222222222222222,
  "CalculatedPi": "3.1415926635857296",
  "Iterations": 3,
  "Average": "49.448502ms"
}
```

Moelf commented 2 years ago

Fast math is way too dangerous, and it's probably not how people usually compile their applications.

giordano commented 2 years ago

Mandatory reminder that in general -ffast-math is bad:

In principle you should be able to achieve the same effect with the less blunt -fassociative-math, but GCC doesn't seem to like this flag for me:

% gcc leibniz.c -o leibniz -O3 -s -static -flto -fassociative-math -march=native -mtune=native -fomit-frame-pointer
cc1: warning: ‘-fassociative-math’ disabled; other options take precedence
lto1: warning: ‘-fassociative-math’ disabled; other options take precedence
lto1: warning: ‘-fassociative-math’ disabled; other options take precedence

Clang, on the other hand, accepts the flag, but doesn't seem to actually vectorise this code. The directive

#pragma clang fp reassociate(on)

would work without any additional command line flag in Clang, but I didn't find an equivalent pragma directive for GCC.
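For readers wondering why reassociation has to be opted into at all, here is a minimal sketch showing that floating-point addition is not associative. The helper names `sum_left_to_right` and `sum_right_to_left` are made up for this example, and the input values are chosen only to make the effect obvious.

```c
/* Hypothetical helpers illustrating non-associative floating-point addition. */

/* Computes (a + b) + c. */
double sum_left_to_right(double a, double b, double c)
{
    /* volatile forces the intermediate sum to be stored and reloaded,
     * so the compiler cannot fold or reorder the two additions */
    volatile double t = a + b;
    return t + c;
}

/* Computes a + (b + c). */
double sum_right_to_left(double a, double b, double c)
{
    volatile double t = b + c;
    return a + t;
}
```

With `a = 1e20`, `b = -1e20` and `c = 1.0`, the first function returns `1.0` while the second returns `0.0`, because adding `1.0` to `-1e20` is lost to rounding. A flag that permits reassociation lets the compiler choose either grouping, which is what allows vectorised summation and also why the result can change.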

LazyKernel commented 2 years ago

Good point. I was not aware of the multitude of issues -ffast-math brings with it. Using just -fassociative-math seems like a much better option. For GCC, though, -fassociative-math only takes effect when -fno-signed-zeros and -fno-trapping-math are also enabled (as per the GCC manual). The combination -fassociative-math -fno-signed-zeros -fno-trapping-math produces practically the same result as above, with fewer downsides as far as I can see.
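To make "practically the same result" a bit more concrete, the sketch below mimics by hand what a reassociated, 4-wide vectorised sum might look like and compares it with the strict in-order sum. This is an assumption about the shape of the generated code, not something taken from the compiler output; `leibniz_in_order` and `leibniz_four_lanes` are hypothetical names.

```c
#include <stdint.h>

/* Hypothetical names; a hand-written approximation of reassociated summation. */

/* Strict left-to-right summation, in the order the loop is written. */
double leibniz_in_order(uint32_t rounds)
{
    double pi = 1.0;
    for (uint32_t i = 2u; i < rounds + 2u; ++i) {
        const double x = (i & 1u) ? 1.0 : -1.0;
        pi += x / (2.0 * i - 1.0);
    }
    return pi * 4.0;
}

/* Four independent partial sums that are combined at the end. */
double leibniz_four_lanes(uint32_t rounds)
{
    double lane[4] = {0.0, 0.0, 0.0, 0.0};
    const uint32_t end = rounds + 2u;
    uint32_t i = 2u;

    /* Main loop: each lane accumulates every fourth term. */
    for (; i + 4u <= end; i += 4u) {
        for (uint32_t k = 0u; k < 4u; ++k) {
            const uint32_t n = i + k;
            const double x = (n & 1u) ? 1.0 : -1.0;
            lane[k] += x / (2.0 * n - 1.0);
        }
    }

    /* Combine the lanes, then add the leading term 1.0. */
    double pi = 1.0 + ((lane[0] + lane[1]) + (lane[2] + lane[3]));

    /* Remainder loop for the terms that did not fill a group of four. */
    for (; i < end; ++i) {
        const double x = (i & 1u) ? 1.0 : -1.0;
        pi += x / (2.0 * i - 1.0);
    }
    return pi * 4.0;
}
```

The two functions add the same terms in a different order, so they may differ in the last few digits, much like the CalculatedPi values in the benchmark output above, while both stay equally close to pi for a given number of rounds.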

niklas-heer commented 2 years ago

@LazyKernel I had to update it due to #66

niklas-heer commented 2 years ago

Thank you very much for your contribution @LazyKernel 👍

niklas-heer / speed-comparison

Vectorize c and c++ code #61