sarah-ek / faer-rs

Linear algebra foundation for the Rust programming language
https://faer-rs.github.io
MIT License
1.82k stars 61 forks

Some Improvements for Benchmark Page #127

Open mert-kurttutan opened 5 months ago

mert-kurttutan commented 5 months ago

To make the benchmarks clearer and easier to reproduce (modulo hardware specs), I think a few more details are needed. For instance, the page says the benchmarks are run with 12 threads, but it is not fully clear how many threads are actually used. Several factors determine this, depending on what kind of wrappers are used around the BLAS implementation.

For example, the number of threads is controlled by $OMP_NUM_THREADS when OpenMP parallelization is enabled (or by $MKL_NUM_THREADS and $OMP_NUM_THREADS for Intel MKL). My point is that it can be difficult to tell how many threads are actually used.

It would be better to state the number of threads explicitly, together with the environment variables used to set it.
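As a rough sketch of what "stating it explicitly" could look like, a benchmark binary could print the thread-related environment variables it sees before running. `OMP_NUM_THREADS` and `MKL_NUM_THREADS` are the ones mentioned above; `OPENBLAS_NUM_THREADS` and `RAYON_NUM_THREADS` are my assumption about what else might matter, and the `describe` helper is hypothetical, not part of faer or its benchmark code.

```rust
use std::env;

/// Environment variables that commonly control the number of threads.
/// The last two are assumptions, not taken from the benchmark page.
const THREAD_VARS: [&str; 4] = [
    "OMP_NUM_THREADS",
    "MKL_NUM_THREADS",
    "OPENBLAS_NUM_THREADS",
    "RAYON_NUM_THREADS",
];

/// Format one variable for the benchmark log (hypothetical helper).
fn describe(name: &str, value: Option<&str>) -> String {
    match value {
        Some(v) => format!("{name}={v}"),
        None => format!("{name} is unset (library default applies)"),
    }
}

fn main() {
    // Print every variable so the report records the exact configuration.
    for name in THREAD_VARS {
        let value = env::var(name).ok();
        println!("{}", describe(name, value.as_deref()));
    }
}
```

Including this output next to each table would let readers see which settings were in effect, even if the libraries interpret them differently.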

Single-threaded results for the other libraries would also be worth including on the benchmark page.

The theoretical limit (in GFLOPS) should also be included.
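For reference, the usual back-of-the-envelope estimate for the theoretical limit is cores × clock frequency × floating-point operations per cycle per core. Here is a minimal sketch; the hardware numbers are placeholders I made up (12 cores, 3.0 GHz, 16 double-precision FLOPs per cycle, which would correspond to two 4-wide FMA units), not the specs of the actual benchmark machine.

```rust
/// Theoretical peak in GFLOPS: cores * clock (GHz) * FLOPs per cycle per core.
fn peak_gflops(cores: u32, clock_ghz: f64, flops_per_cycle: u32) -> f64 {
    f64::from(cores) * clock_ghz * f64::from(flops_per_cycle)
}

fn main() {
    // Placeholder hardware values, chosen only for illustration.
    let peak = peak_gflops(12, 3.0, 16);
    println!("theoretical peak: {peak:.1} GFLOPS"); // prints 576.0 GFLOPS
}
```

Real clocks often drop under sustained vector load, so this is an upper bound rather than a target.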

oscardssmith commented 5 months ago

also, I think it would be nice if the benchmarks were line graphs rather than tables.

mert-kurttutan commented 5 months ago

Another detail:

mert-kurttutan commented 5 months ago

> also, I think it would be nice if the benchmarks were line graphs rather than tables.

I think this would be especially useful for representing GFLOPS, which the benchmark page should report. Otherwise, the time taken spans wildly different scales, making the results a bit more difficult to interpret.
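To illustrate the conversion, here is a small sketch that turns a measured time into GFLOPS for a square matrix multiplication, assuming the conventional count of 2n³ floating-point operations for an n×n product. The function name is hypothetical, and other operations (decompositions, for instance) would need their own operation counts.

```rust
use std::time::Duration;

/// GFLOPS for an n x n matrix product, assuming 2 * n^3 operations.
fn matmul_gflops(n: u64, elapsed: Duration) -> f64 {
    let flops = 2.0 * (n as f64).powi(3);
    flops / elapsed.as_secs_f64() / 1e9
}

fn main() {
    // Made-up timings, only to show that the resulting values share one scale.
    for (n, millis) in [(256u64, 1u64), (1024, 40), (4096, 2500)] {
        let g = matmul_gflops(n, Duration::from_millis(millis));
        println!("n = {n:>5}: {g:.1} GFLOPS");
    }
}
```

With this normalization, every matrix size lands on the same axis, which is what makes a line graph readable.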