Extend Median UniFrac to compute average phylogenetic distances between samples, based on their UniFrac distances, without producing negative abundance vectors.
The SVD step in the DPCoA algorithm will be one of the most computationally demanding steps. Try using more efficient approximations such as:
sklearn.utils.extmath.randomized_svd,
the GPU flavor: https://scikit-cuda.readthedocs.io/en/latest/generated/skcuda.rlinalg.rsvd.html
sklearn.decomposition.PCA with the 'arpack' svd_solver
sklearn.decomposition.TruncatedSVD
scipy.sparse.linalg.svds if the matrix is sparse
sklearn.decomposition.IncrementalPCA on the pushed up vectors, etc.
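
As a rough illustration of two of the options listed above, the sketch below compares sklearn.utils.extmath.randomized_svd and scipy.sparse.linalg.svds against an exact SVD. The matrix X is a synthetic low-rank stand-in, not the actual DPCoA matrix, and the sizes and the number of components k are arbitrary assumptions.

```python
import numpy as np
from scipy.sparse.linalg import svds
from sklearn.utils.extmath import randomized_svd

rng = np.random.default_rng(0)

# Hypothetical stand-in for the matrix decomposed in DPCoA:
# a rank-5 signal plus a small amount of noise.
n_samples, n_features, rank = 200, 500, 5
X = rng.standard_normal((n_samples, rank)) @ rng.standard_normal((rank, n_features))
X += 1e-3 * rng.standard_normal(X.shape)

k = 5  # number of leading components to keep (assumed)

# Exact reference: full SVD, keeping only the k largest singular values.
s_exact = np.linalg.svd(X, compute_uv=False)[:k]

# Randomized approximation of the top-k singular triplets.
U_r, s_rand, Vt_r = randomized_svd(X, n_components=k, n_iter=5, random_state=0)

# ARPACK-based truncated SVD; sort the singular values into descending
# order before comparing, since svds does not return them that way.
U_a, s_arpack, Vt_a = svds(X, k=k)
s_arpack = np.sort(s_arpack)[::-1]

print("max relative error (randomized):", np.max(np.abs(s_rand - s_exact) / s_exact))
print("max relative error (svds):", np.max(np.abs(s_arpack - s_exact) / s_exact))
```

On a matrix with a less clear spectral gap than this synthetic one, the randomized method may need more power iterations (n_iter) to reach comparable accuracy.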
Benchmark using a large dataset
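
A minimal timing harness for such a benchmark might look like the sketch below. The helper time_call, the matrix size, and k are hypothetical placeholders; a real benchmark would load a large dataset and time the actual DPCoA matrix rather than random data.

```python
import time

import numpy as np
from sklearn.utils.extmath import randomized_svd


def time_call(fn, repeats=3):
    """Return the best wall-clock time in seconds over `repeats` calls of fn."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        best = min(best, time.perf_counter() - start)
    return best


rng = np.random.default_rng(0)
X = rng.standard_normal((600, 400))  # placeholder size; replace with real data
k = 10  # number of components to keep (assumed)

timings = {
    "numpy.linalg.svd (full)": time_call(
        lambda: np.linalg.svd(X, full_matrices=False)
    ),
    "randomized_svd": time_call(
        lambda: randomized_svd(X, n_components=k, random_state=0)
    ),
}

# Report the candidates from fastest to slowest.
for name, seconds in sorted(timings.items(), key=lambda kv: kv[1]):
    print(f"{name}: {seconds:.4f} s")
```

Other candidates from the list can be added to the timings dictionary in the same way, so that all of them are measured on the same matrix.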