Closed lavakin closed 1 year ago
Thank you so much, @lavakin! Even a 30% increase in speed is wonderful. We can also look into setting up the progress bar on the Rcpp
side and focus on reimplementing the parallelisation, which should further speed up the sampling procedures.
Thanks! Actually, Eigen should do the parallelization for us, but we should definitely check whether it actually does in the current setting (Eigen::nbThreads()).
We should also make sure that on Intel CPUs it uses the MKL libraries (https://stackoverflow.com/questions/51656818/benchmarking-matrix-multiplication-performance-c-eigen-is-much-slower-than), and possibly find alternatives for ARM.
For the permutations, we should definitely look into parallelization.
Yes, that would be brilliant! Is there a way to swiftly add an MKL check etc. to become more hardware-aware?
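One possible starting point for such a check is a small compile-time report, sketched below. It is only an illustration, not code from the package: `BuildInfo`, `build_info()` and the `FLT_HAVE_EIGEN` macro are hypothetical names. It relies on the facts that `EIGEN_USE_MKL_ALL` is the macro a build defines to ask Eigen for the MKL backend, that `_OPENMP` is defined when OpenMP is enabled, and that `Eigen::nbThreads()` reports the thread count Eigen will use. It only tells us what the build requested, not whether MKL is actually installed or faster.

```cpp
#include <string>

// Include Eigen only if the header can be found, so the sketch also
// compiles on machines without Eigen (FLT_HAVE_EIGEN is a hypothetical name).
#if defined(__has_include)
#  if __has_include(<Eigen/Core>)
#    include <Eigen/Core>
#    define FLT_HAVE_EIGEN 1
#  endif
#endif

// Hypothetical summary of what this translation unit was built with.
struct BuildInfo {
    std::string arch;    // "x86", "arm" or "other"
    bool openmp;         // true if compiled with OpenMP enabled
    bool mkl_requested;  // true if EIGEN_USE_MKL_ALL was defined for this build
    int eigen_threads;   // Eigen::nbThreads(), or -1 if Eigen was not found
};

inline BuildInfo build_info() {
    BuildInfo info;

    // Architecture, from the usual predefined compiler macros.
#if defined(__x86_64__) || defined(_M_X64) || defined(__i386__) || defined(_M_IX86)
    info.arch = "x86";
#elif defined(__aarch64__) || defined(_M_ARM64) || defined(__arm__)
    info.arch = "arm";
#else
    info.arch = "other";
#endif

    // Eigen only multithreads (some operations) when OpenMP is enabled.
#if defined(_OPENMP)
    info.openmp = true;
#else
    info.openmp = false;
#endif

    // Was the MKL backend requested at compile time?
#if defined(EIGEN_USE_MKL_ALL)
    info.mkl_requested = true;
#else
    info.mkl_requested = false;
#endif

#if defined(FLT_HAVE_EIGEN)
    info.eigen_threads = Eigen::nbThreads();
#else
    info.eigen_threads = -1;
#endif

    return info;
}
```

Something like this could be exposed through Rcpp and printed once, so we can see on each machine whether OpenMP and MKL were picked up. On ARM the `mkl_requested` flag would simply stay false, and an alternative BLAS would have to be checked separately.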
Implemented the FlatLineTest cpp functions using the Eigen library. The code now looks much cleaner and more elegant. I haven't benchmarked the speed yet, but the speed-up might not be very significant, since most of the time is probably spent generating the permutations.
Will rewrite the other cpp functions to use Eigen soon.
Looked into the generators included in std and made the permutation functions use a linear congruential generator (LCG) instead of the Mersenne Twister. Now the FlatLineTest is >30% faster.
Also replaced the Mersenne Twister with the 64-bit Mersenne Twister, which is way faster (though still slower than the LCG), in case we decide on using Mersenne in the end.
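For reference, a minimal sketch of how the generator can be swapped without touching the permutation code. This is not the actual package code: `random_permutation` is a hypothetical helper, and `std::minstd_rand` is only assumed here as the LCG, since the comment above does not name the exact engine. `std::mt19937` and `std::mt19937_64` are the 32-bit and 64-bit Mersenne Twister engines from `<random>`.

```cpp
#include <algorithm>
#include <numeric>
#include <random>
#include <vector>

// Hypothetical helper: returns a random permutation of 0..n-1,
// drawn with whatever standard engine the caller passes in.
template <class Engine>
std::vector<int> random_permutation(int n, Engine& engine) {
    std::vector<int> perm(n);
    std::iota(perm.begin(), perm.end(), 0);          // fill with 0, 1, ..., n-1
    std::shuffle(perm.begin(), perm.end(), engine);  // shuffle in place
    return perm;
}

// Hypothetical usage: the same call works with each engine, so the
// generators can be compared just by changing the engine type.
inline std::vector<int> permutation_with_lcg(int n, unsigned seed) {
    std::minstd_rand engine(seed);   // linear congruential generator (assumed choice)
    return random_permutation(n, engine);
}

inline std::vector<int> permutation_with_mt64(int n, unsigned seed) {
    std::mt19937_64 engine(seed);    // 64-bit Mersenne Twister
    return random_permutation(n, engine);
}
```

Keeping the engine as a template parameter makes it cheap to benchmark the options against each other later. One thing to keep in mind is that an LCG has weaker statistical properties than the Mersenne Twister, so it may be worth checking that the test results do not change noticeably before settling on it.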