Open ev-br opened 5 months ago
Good question. Maybe matrix size for now to keep it simple, although the number of bands would also be interesting. GBMV is a curious case: its implementation is unchanged from the original GotoBLAS and may never have been evaluated for (multithreaded) performance at all.
OK, gbmv benchmarks now run for a standard range of sizes from 100 to 1000 and kl=1, 2, 3 (kl=ku, so it's 3, 5, and 7 bands). Web display is at http://www.openmathlib.org/BLAS-Benchmarks/#benchmarks.gbmv.time_gbmv and raw data is in the usual places from GH actions, https://github.com/OpenMathLib/BLAS-Benchmarks/actions/runs/9693385959/job/26748720100 for the graviton run etc.
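For anyone reading along who has not used the banded routines, here is a minimal pure-Python sketch of what gbmv computes and how `kl`/`ku` map to the band counts quoted above. It is a reference illustration only, not the code used in the benchmark suite; the helper names (`to_band_storage`, `gbmv_reference`, `band_count`) are made up for this example, and it assumes a square matrix with `alpha=1`, `beta=0`.

```python
def band_count(kl, ku):
    """Number of stored diagonals: kl sub-diagonals, the main one, ku super-diagonals."""
    return kl + ku + 1


def to_band_storage(dense, kl, ku):
    """Pack a square dense matrix into BLAS-style band storage.

    The result has (kl + ku + 1) rows and n columns; entry A[i][j]
    lands in row (ku + i - j) of column j. Out-of-band entries are dropped.
    """
    n = len(dense)
    ab = [[0.0] * n for _ in range(band_count(kl, ku))]
    for j in range(n):
        for i in range(max(0, j - ku), min(n, j + kl + 1)):
            ab[ku + i - j][j] = dense[i][j]
    return ab


def gbmv_reference(n, kl, ku, ab, x):
    """Compute y = A @ x for a square banded A held in band storage.

    Only in-band entries are touched, so the work grows roughly like
    n * (kl + ku + 1) rather than n * n.
    """
    y = [0.0] * n
    for j in range(n):
        for i in range(max(0, j - ku), min(n, j + kl + 1)):
            y[i] += ab[ku + i - j][j] * x[j]
    return y


# kl = ku in {1, 2, 3}, as in the benchmark parameters above
bands = [band_count(kl, kl) for kl in (1, 2, 3)]  # [3, 5, 7]
```

Since the work per call is roughly proportional to `n * (kl + ku + 1)`, scaling either the matrix size or the number of bands changes the cost linearly, which is part of why both axes are interesting to benchmark.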
Currently we benchmark:
BLAS level 1
BLAS level 3
LAPACK:
linalg.solve
linalg.svd
linalg.eigh
@martin-frbg suggested it'd be useful to add
For the banded matrix-vector multiply (gbmv), which is more interesting to scale: the matrix size or the number of bands?