Closed jhonnold closed 1 year ago
Bench: 5369258
Chunking the loop code lets GCC auto-vectorize it properly. Taking advantage of this, I can remove a significant amount of code from Berserk with no functional change.
https://godbolt.org/z/Gz9sqzfqh
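For readers who don't open the link, here is a rough sketch of what "chunking" a loop can look like. It is not the code from the Godbolt link or from Berserk: `CHUNK`, `add_scalar`, and `add_chunked` are hypothetical names, and the element type and chunk width are assumptions. The idea is that an inner loop with a compile-time trip count gives the compiler a simpler target. Whether GCC actually vectorizes it depends on the version and on flags such as `-O3` and `-march`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical chunk width: 16 x int16_t = 256 bits (an assumption). */
#define CHUNK 16

/* Plain element-wise add, kept as a reference for the chunked version. */
void add_scalar(int16_t *restrict dst, const int16_t *restrict src, size_t n) {
  for (size_t i = 0; i < n; i++)
    dst[i] = (int16_t)(dst[i] + src[i]);
}

/* Same arithmetic, but the inner loop always runs exactly CHUNK times.
 * Requires n to be a multiple of CHUNK. */
void add_chunked(int16_t *restrict dst, const int16_t *restrict src, size_t n) {
  assert(n % CHUNK == 0);
  for (size_t c = 0; c < n; c += CHUNK)
    for (size_t i = 0; i < CHUNK; i++)
      dst[c + i] = (int16_t)(dst[c + i] + src[c + i]);
}
```

Both functions compute the same result, so the chunked form can replace the scalar one without a functional change, as long as the length is a multiple of `CHUNK`.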
Not SPRT tested.
PS C:\Programming\EngineTests> python .\speedup.py
            berserk.exe             |             master.exe             |
       mu              sigma        |       mu              sigma        |   Sp(1)/Sp(2)      3*sigma
------------------------------------+------------------------------------+------------------------------------
    2886697.000            0.000    |    2839374.000            0.000    |    1.667 %  +/-  0.000 %
    2898431.500        16595.089    |    2850725.500        16053.445    |    1.673 %  +/-  0.029 %
    2902343.000        13549.833    |    2862199.000        22886.244    |    1.405 %  +/-  1.397 %
    2904298.750        11734.500    |    2867935.750        21927.790    |    1.270 %  +/-  1.397 %
    2905472.200        10495.656    |    2871377.800        20490.474    |    1.190 %  +/-  1.325 %
    2902084.500        12529.409    |    2873672.500        19169.807    |    0.991 %  +/-  1.878 %
    2903239.000        11838.574    |    2875311.571        18028.879    |    0.974 %  +/-  1.720 %
    2900977.375        12690.537    |    2873657.250        17334.946    |    0.953 %  +/-  1.603 %
    2901998.333        12259.679    |    2875106.111        16787.811    |    0.937 %  +/-  1.506 %
    2900468.200        12530.476    |    2873803.200        16355.178    |    0.929 %  +/-  1.421 %
    2899216.273        12591.747    |    2874975.364        15995.509    |    0.845 %  +/-  1.589 %
    2898173.000        12537.914    |    2873900.500        15699.065    |    0.846 %  +/-  1.515 %