KoslickiLab / L2-UniFrac

Expansion upon Median Unifrac to compute average phylogenetic distances between samples according to their UniFrac distance without producing negative abundance vectors.

Investigate more efficient SVD algorithms #18

Open dkoslicki opened 3 years ago

dkoslicki commented 3 years ago

The SVD step will be one of the most computationally demanding parts of the DPCoA algorithm. Try more efficient approximations. Candidates include:

- sklearn.utils.extmath.randomized_svd
- the GPU flavor: https://scikit-cuda.readthedocs.io/en/latest/generated/skcuda.rlinalg.rsvd.html
- sklearn.decomposition.PCA with the 'arpack' svd_solver
- sklearn.decomposition.TruncatedSVD
- scipy.sparse.linalg.svds, if the matrix is sparse
- sklearn.decomposition.IncrementalPCA on the pushed-up vectors

Benchmark these on a large dataset.
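
As a possible starting point for that benchmark, here is a minimal sketch that times two of the CPU candidates against a full NumPy SVD and checks that they agree on the leading singular values. The helper names `make_low_rank` and `top_k_singular_values`, the matrix sizes, and the synthetic low-rank-plus-noise input are all assumptions for illustration; they are not taken from the L2-UniFrac code, and real DPCoA matrices may behave differently.

```python
import time

import numpy as np
from scipy.sparse.linalg import svds
from sklearn.utils.extmath import randomized_svd


def make_low_rank(n_rows, n_cols, rank, noise=1e-3, seed=0):
    """Synthetic stand-in for the real matrix (assumption): low rank plus small noise."""
    rng = np.random.default_rng(seed)
    left = rng.standard_normal((n_rows, rank))
    right = rng.standard_normal((rank, n_cols))
    return left @ right + noise * rng.standard_normal((n_rows, n_cols))


def top_k_singular_values(matrix, k):
    """Return {method name: (top-k singular values in descending order, seconds)}."""
    results = {}

    # Baseline: full dense SVD, singular values only.
    start = time.perf_counter()
    s_full = np.linalg.svd(matrix, compute_uv=False)[:k]
    results["numpy full"] = (s_full, time.perf_counter() - start)

    # Randomized truncated SVD from scikit-learn.
    start = time.perf_counter()
    _, s_rand, _ = randomized_svd(matrix, n_components=k, random_state=0)
    results["randomized_svd"] = (s_rand, time.perf_counter() - start)

    # Truncated SVD from SciPy; it also accepts sparse matrices.
    start = time.perf_counter()
    _, s_svds, _ = svds(matrix, k=k)
    # Sort explicitly rather than rely on the order svds returns.
    results["svds"] = (np.sort(s_svds)[::-1], time.perf_counter() - start)

    return results


if __name__ == "__main__":
    matrix = make_low_rank(2000, 500, rank=10)
    results = top_k_singular_values(matrix, k=10)
    reference = results["numpy full"][0]
    for name, (values, seconds) in results.items():
        rel_err = np.max(np.abs(values - reference) / reference)
        print(f"{name:15s} {seconds:8.4f} s  max rel. error {rel_err:.2e}")
```

Timings on a small synthetic matrix like this say little about the real workload; the sketch is only meant to show the shape of a comparison that could then be rerun on an actual large dataset, with the GPU and incremental variants added in the same way.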