Open t4c1 opened 5 years ago
Nice!! Can you post some speed and precision tests? I did a little one here that may be helpful tos how what I would like to see
https://gist.github.com/SteveBronder/912009ac53b676c89be030cf5dd62bd2#file-gpu_sensitivity_test-cpp
Sean also has some stuff here that could be useful
I made something similar to your test: https://gist.github.com/t4c1/b8112f4bb9b3d6261f29cb578fa20090
Once I open pull request for GPU part I will also add it to the test.
Could u please plot that table? I am bad in mentally plotting it...
...this looks quite nice... ok... looking closer there seems to be 2x speedup here. wonderful.
Here are the graphs for times and errors:
Nice! bstatcomp:cpu_eigendecomposition
is the branch I should be on to test the above?
Yep: https://github.com/bstatcomp/math/tree/cpu_eigendecomposition
There is also a variant for OpenCL where I think Tadej got around a 5-6x speedup compared to this branch on a GTX1070.
Description
Symmetric eigendecomposition from library Eigen is implemented in a suboptimal way. It consists of two steps: tridiagonalization and tridiagonal eigensolver.
In Eigen, basic Householder tridiagonalization is implemented. This can be improved on using blocked Householder algorithm, which uses cache more efficiently and is therefore faster.
For symmetric eigensolver Eigen uses QR iteration. While very accurate, it is quite slow. For square symmetric matrix of size n it does O(n^3) operations. It can be replaced by, for example, the MRRR algorithm that is significantly faster - O(n^2) and usually accurate enough. However, for cases where multiple eigenvalues are very close to each other - with relative difference (l1-l2)/l1 between them close to machine precision - MRRR might produce more significant errors or in extreme cases even fail to converge.
Both steps can be further improved using GPUs.
Example
Function is added that computes symmetric eigendecomposition.
Expected Output
Symmetric eigendecomposition is faster than one implemented in Eigen.
Current Version:
v2.18.1