aras-p / test_math_vec_debug_perf

3 stars 0 forks source link

Code for this blog post: https://aras-p.info/blog/2024/09/14/Vector-math-library-codegen-in-Debug/

Results

All of this on a PC with Ryzen 5950X CPU, Windows 10 and Linux via WSL. Times in milliseconds, lower is better.

"Debug" build

Codegen Unroll+Asserts Loop+Asserts Xyzw+Asserts Raw C SIMD
Clang Linux 14 O0 6476 4858 1353 274 2298
Clang Linux 15 O0 7939 5013 1441 286 2392
Clang Win 17 O0 4965 4195 788 262 1821
Gcc Linux 11 O0 5942 4056 1467 265 1418
Gcc Linux 12 O0 5610 4068 1511 261 1409
VS2022 (v17.11.0) /Od 7646 4339 1281 285 1603
VS2022 (v17.11.0) /Od /JMC 17560 8446 3662 316 4657
Clang Linux 14 Og 74 875 75 68 53
Clang Linux 15 Og 73 884 76 67 53
Clang Win 17 Og 74 352 68 68 53
Gcc Linux 11 Og 964 1009 181 112 104
Gcc Linux 12 Og 3528 2141 354 112 130

"Developer" build (optimizations on, asserts on)

Codegen Unroll+Asserts Loop+Asserts Xyzw+Asserts Raw C SIMD
Clang Linux 14 O2 45 45 45 20 20
Clang Linux 15 O2 45 45 45 20 20
Clang Win 17 O2 42 41 42 21 20
Gcc Linux 11 O2 71 704 72 69 54
Gcc Linux 12 O2 71 235 69 72 52
VS2022 (v17.11.0) /O2 591 82 80 82 52

"Release" build

Codegen Unroll Loop Xyzw Raw C SIMD
Clang Linux 14 O2 45 45 45 20 20
Clang Linux 15 O2 45 45 45 20 20
Clang Win 17 O2 41 41 41 21 20
Gcc Linux 11 O2 70 703 71 69 54
Gcc Linux 12 O2 70 230 69 72 52
VS2022 (v17.11.0) /O2 67 65 81 82 52
Clang Linux 14 O3 23 23 23 20 20
Clang Linux 15 O3 23 23 23 20 20
Clang Win 17 O3 45 45 44 20 20
Gcc Linux 11 O3 69 68 68 69 51
Gcc Linux 12 O3 71 69 69 69 51