linbox-team / fflas-ffpack

FFLAS-FFPACK - Finite Field Linear Algebra Subroutines / Package
http://linbox-team.github.io/fflas-ffpack/
GNU Lesser General Public License v2.1

Performance regression with Modular<double> field #361

Closed ClementPernet closed 2 years ago

ClementPernet commented 2 years ago

benchmark-fgemm shows a slowdown of approximately 20% when using Modular<double> instead of ModularBalanced<double>.

With ModularBalanced<double>

pernet@nooksack:~/Logiciels/fflas-ffpack/benchmarks$ for ((n=1000;n<=3000;n+=1000)); do  ./benchmark-fgemm -w 0 -m $n -n $n -k $n; done
Time: 0.0438662 Gfops: 45.5932 -q 131071 -m 1000 -k 1000 -n 1000 -w 0 -i 3 -p 0 -t 1 -b 1
Time: 0.315978 Gfops: 50.6364 -q 131071 -m 2000 -k 2000 -n 2000 -w 0 -i 3 -p 0 -t 1 -b 1
Time: 1.03322 Gfops: 52.264 -q 131071 -m 3000 -k 3000 -n 3000 -w 0 -i 3 -p 0 -t 1 -b 1

With Modular<double>

pernet@nooksack:~/Logiciels/fflas-ffpack/benchmarks$ for ((n=1000;n<=3000;n+=1000)); do  ./benchmark-fgemm -w 0 -m $n -n $n -k $n; done
Time: 0.0601728 Gfops: 33.2376 -q 131071 -m 1000 -k 1000 -n 1000 -w 0 -i 3 -p 0 -t 1 -b 1
Time: 0.39317 Gfops: 40.6949 -q 131071 -m 2000 -k 2000 -n 2000 -w 0 -i 3 -p 0 -t 1 -b 1
Time: 1.21845 Gfops: 44.3187 -q 131071 -m 3000 -k 3000 -n 3000 -w 0 -i 3 -p 0 -t 1 -b 1

Surprisingly, there is no such discrepancy between Modular<float> and ModularBalanced<float>.
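
To put a number on the gap for each size, the Gfops figures from the two double runs above can be compared directly. This is only a quick sketch: the values are copied by hand from the output above, and the helper name `throughput_drop_percent` is made up for illustration, not part of fflas-ffpack. On these figures the drop works out to somewhere between about 15% and 27%, shrinking as n grows.

```python
# Gfops reported by benchmark-fgemm above (q = 131071, square n x n x n products).
# Keys are the matrix dimension n; values are copied from the two runs.
gfops_balanced = {1000: 45.5932, 2000: 50.6364, 3000: 52.264}   # ModularBalanced<double>
gfops_modular = {1000: 33.2376, 2000: 40.6949, 3000: 44.3187}   # Modular<double>


def throughput_drop_percent(slow, fast):
    """Throughput lost by `slow` relative to `fast`, in percent (hypothetical helper)."""
    return 100.0 * (fast - slow) / fast


drops = {
    n: throughput_drop_percent(gfops_modular[n], gfops_balanced[n])
    for n in gfops_balanced
}

for n, pct in sorted(drops.items()):
    print(f"n={n}: Modular<double> is {pct:.1f}% slower than ModularBalanced<double>")
```

Running the same comparison on the float timings would be the natural way to confirm that the gap is specific to double.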