btracey closed this 9 years ago
Pretty significant speedups. Thanks, @fhs!
benchmark old ns/op new ns/op delta
BenchmarkDgemmSmSmSm 2347 1870 -20.32%
BenchmarkDgemmMedMedMed 1420777 601110 -57.69%
BenchmarkDgemmMedLgMed 14150749 5703823 -59.69%
BenchmarkDgemmLgLgLg 1413082561 588834401 -58.33%
BenchmarkDgemmLgSmLg 16212486 12189649 -24.81%
BenchmarkDgemmLgLgSm 16308323 12406908 -23.92%
BenchmarkDgemmHgHgSm 1690570016 1217346857 -27.99%
BenchmarkDgemmMedMedMedTNT 1400496 572611 -59.11%
BenchmarkDgemmMedMedMedNTT 1106778 654992 -40.82%
BenchmarkDgemmMedMedMedNTNT 1406017 592313 -57.87%
BenchmarkDgemvSmSmNoTransInc1 157 130 -17.20%
BenchmarkDgemvSmSmNoTransIncN 209 204 -2.39%
BenchmarkDgemvSmSmTransInc1 220 141 -35.91%
BenchmarkDgemvSmSmTransIncN 271 219 -19.19%
BenchmarkDgemvMedMedNoTransInc1 10003 5276 -47.26%
BenchmarkDgemvMedMedNoTransIncN 13430 11311 -15.78%
BenchmarkDgemvMedMedTransInc1 14554 5294 -63.63%
BenchmarkDgemvMedMedTransIncN 16948 12260 -27.66%
BenchmarkDgemvLgLgNoTransInc1 1044625 604480 -42.13%
BenchmarkDgemvLgLgNoTransIncN 1326452 1159622 -12.58%
BenchmarkDgemvLgLgTransInc1 1351233 611742 -54.73%
BenchmarkDgemvLgLgTransIncN 1693624 1197201 -29.31%
BenchmarkDgemvLgSmNoTransInc1 13017 10620 -18.41%
BenchmarkDgemvLgSmNoTransIncN 17490 18077 +3.36%
BenchmarkDgemvLgSmTransInc1 18241 10296 -43.56%
BenchmarkDgemvLgSmTransIncN 22960 18761 -18.29%
BenchmarkDgemvSmLgNoTransInc1 9231 4594 -50.23%
BenchmarkDgemvSmLgNoTransIncN 12685 10905 -14.03%
BenchmarkDgemvSmLgTransInc1 13172 4631 -64.84%
BenchmarkDgemvSmLgTransIncN 16498 11630 -29.51%
BenchmarkDgerSmSmInc1 174 125 -28.16%
BenchmarkDgerSmSmIncN 232 202 -12.93%
BenchmarkDgerMedMedInc1 13183 5443 -58.71%
BenchmarkDgerMedMedIncN 16141 12183 -24.52%
BenchmarkDgerLgLgInc1 1372862 783534 -42.93%
BenchmarkDgerLgLgIncN 1565488 1243930 -20.54%
BenchmarkDgerLgSmInc1 15179 10845 -28.55%
BenchmarkDgerLgSmIncN 21187 19274 -9.03%
BenchmarkDgerSmLgInc1 12449 3595 -71.12%
BenchmarkDgerSmLgIncN 14455 10906 -24.55%
Still a way to go to match cgo, but we're getting there. The cgo level 2 routines are probably parallel. In the comparison below, "old" is the new native code and "new" is cgo.
benchmark old ns/op new ns/op delta
BenchmarkDgemvSmSmNoTransInc1 130 301 +131.54%
BenchmarkDgemvSmSmNoTransIncN 204 309 +51.47%
BenchmarkDgemvSmSmTransInc1 141 298 +111.35%
BenchmarkDgemvSmSmTransIncN 219 316 +44.29%
BenchmarkDgemvMedMedNoTransInc1 5276 1844 -65.05%
BenchmarkDgemvMedMedNoTransIncN 11311 2536 -77.58%
BenchmarkDgemvMedMedTransInc1 5294 2515 -52.49%
BenchmarkDgemvMedMedTransIncN 12260 2844 -76.80%
BenchmarkDgemvLgLgNoTransInc1 604480 258565 -57.23%
BenchmarkDgemvLgLgNoTransIncN 1159622 264518 -77.19%
BenchmarkDgemvLgLgTransInc1 611742 280277 -54.18%
BenchmarkDgemvLgLgTransIncN 1197201 289379 -75.83%
BenchmarkDgemvLgSmNoTransInc1 10620 3304 -68.89%
BenchmarkDgemvLgSmNoTransIncN 18077 5370 -70.29%
BenchmarkDgemvLgSmTransInc1 10296 3358 -67.39%
BenchmarkDgemvLgSmTransIncN 18761 2728 -85.46%
BenchmarkDgemvSmLgNoTransInc1 4594 2110 -54.07%
BenchmarkDgemvSmLgNoTransIncN 10905 2281 -79.08%
BenchmarkDgemvSmLgTransInc1 4631 3666 -20.84%
BenchmarkDgemvSmLgTransIncN 11630 6218 -46.53%
BenchmarkDgerSmSmInc1 125 791 +532.80%
BenchmarkDgerSmSmIncN 202 798 +295.05%
BenchmarkDgerMedMedInc1 5443 2382 -56.24%
BenchmarkDgerMedMedIncN 12183 2570 -78.91%
BenchmarkDgerLgLgInc1 783534 615112 -21.50%
BenchmarkDgerLgLgIncN 1243930 623505 -49.88%
BenchmarkDgerLgSmInc1 10845 4477 -58.72%
BenchmarkDgerLgSmIncN 19274 4390 -77.22%
BenchmarkDgerSmLgInc1 3595 2278 -36.63%
BenchmarkDgerSmLgIncN 10906 3055 -71.99%
Ddot's getting really close though:
brendan:~/Documents/mygo/src/github.com/gonum/blas/native$ benchcmp new.bench cgo.bench
benchmark old ns/op new ns/op delta
BenchmarkDdotSmallBothUnitary 15.8 207 +1210.13%
BenchmarkDdotSmallIncUni 18.5 214 +1056.76%
BenchmarkDdotSmallUniInc 18.5 214 +1056.76%
BenchmarkDdotSmallBothInc 18.6 216 +1061.29%
BenchmarkDdotMediumBothUnitary 460 435 -5.43%
BenchmarkDdotMediumIncUni 1078 1054 -2.23%
BenchmarkDdotMediumUniInc 1096 866 -20.99%
BenchmarkDdotMediumBothInc 1084 1166 +7.56%
BenchmarkDdotLargeBothUnitary 49681 39232 -21.03%
BenchmarkDdotLargeIncUni 198573 170580 -14.10%
BenchmarkDdotLargeUniInc 118322 101916 -13.87%
BenchmarkDdotLargeBothInc 347034 274692 -20.85%
BenchmarkDdotHugeBothUnitary 11061677 10652330 -3.70%
BenchmarkDdotHugeIncUni 37096808 32448696 -12.53%
BenchmarkDdotHugeUniInc 24956236 20180868 -19.13%
BenchmarkDdotHugeBothInc 45096756 41569145 -7.82%
PTAL
Updated benchmarks:
brendan:~/Documents/mygo/src/github.com/gonum/blas/native$ benchcmp old.txt new.txt
benchmark old ns/op new ns/op delta
BenchmarkDgemmSmSmSm 2729 1901 -30.34%
BenchmarkDgemmMedMedMed 1454813 613739 -57.81%
BenchmarkDgemmMedLgMed 14976720 5682133 -62.06%
BenchmarkDgemmLgLgLg 1437288447 583485583 -59.40%
BenchmarkDgemmLgSmLg 23259519 12217995 -47.47%
BenchmarkDgemmLgLgSm 23417498 12636051 -46.04%
BenchmarkDgemmHgHgSm 2353436677 1205930033 -48.76%
BenchmarkDgemmMedMedMedTNT 1513161 599095 -60.41%
BenchmarkDgemmMedMedMedNTT 1284047 669217 -47.88%
BenchmarkDgemmMedMedMedNTNT 1445991 615434 -57.44%
BenchmarkDgemvSmSmNoTransInc1 206 141 -31.55%
BenchmarkDgemvSmSmNoTransIncN 247 206 -16.60%
BenchmarkDgemvSmSmTransInc1 226 143 -36.73%
BenchmarkDgemvSmSmTransIncN 279 230 -17.56%
BenchmarkDgemvMedMedNoTransInc1 10785 5446 -49.50%
BenchmarkDgemvMedMedNoTransIncN 13994 12008 -14.19%
BenchmarkDgemvMedMedTransInc1 14127 5444 -61.46%
BenchmarkDgemvMedMedTransIncN 15957 11829 -25.87%
BenchmarkDgemvLgLgNoTransInc1 1008614 557777 -44.70%
BenchmarkDgemvLgLgNoTransIncN 1292387 1097074 -15.11%
BenchmarkDgemvLgLgTransInc1 1257284 561283 -55.36%
BenchmarkDgemvLgLgTransIncN 1457361 1193900 -18.08%
BenchmarkDgemvLgSmNoTransInc1 16966 10974 -35.32%
BenchmarkDgemvLgSmNoTransIncN 22019 18493 -16.01%
BenchmarkDgemvLgSmTransInc1 18401 10811 -41.25%
BenchmarkDgemvLgSmTransIncN 23416 18270 -21.98%
BenchmarkDgemvSmLgNoTransInc1 9830 4777 -51.40%
BenchmarkDgemvSmLgNoTransIncN 12141 10670 -12.12%
BenchmarkDgemvSmLgTransInc1 13488 4620 -65.75%
BenchmarkDgemvSmLgTransIncN 14623 11574 -20.85%
BenchmarkDgerSmSmInc1 216 131 -39.35%
BenchmarkDgerSmSmIncN 257 206 -19.84%
BenchmarkDgerMedMedInc1 13255 5293 -60.07%
BenchmarkDgerMedMedIncN 15942 12235 -23.25%
BenchmarkDgerLgLgInc1 1276512 639479 -49.90%
BenchmarkDgerLgLgIncN 1464653 1194753 -18.43%
BenchmarkDgerLgSmInc1 18638 12299 -34.01%
BenchmarkDgerLgSmIncN 22120 18782 -15.09%
BenchmarkDgerSmLgInc1 11955 3835 -67.92%
BenchmarkDgerSmLgIncN 14437 10814 -25.10%
@jonlawlor this should improve the covmat benchmarks.
LGTM
These are the functions that are most likely to be used or are already in use. The others can be updated as needed.