KoslickiLab / L2-UniFrac

Expansion upon Median Unifrac to compute average phylogenetic distances between samples according to their UniFrac distance without producing negative abundance vectors.

Implement DPCoA #12

Open dkoslicki opened 3 years ago

dkoslicki commented 3 years ago

Implement Algorithm 2.3.1 from page 98 of Jason's thesis. Note that Jason uses a binary tree, so the following adjustment will be necessary: wherever you see an expression like $\lfloor i/2 \rfloor$, interpret it as the ancestor of node $i$. Also, check whether the matrix in line 13 of Algorithm 2.3.1 is sparse, since sparse SVD approaches might help there. Lastly, the three nested for loops (lines 3-7, 8-12, and 14-18) can be parallelized over the depths of the tree: given an ancestor node, the update step depends only on its descendants. We will need to check whether that leads to a speedup. Otherwise, vectorization with numpy might make things faster.
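To make the two structural points above concrete, here is a minimal sketch in Python with numpy. It is not Algorithm 2.3.1 itself (the thesis steps are not reproduced here). It only illustrates (a) replacing the heap-style index $\lfloor i/2 \rfloor$ with a lookup in a general ancestor array, and (b) processing nodes one depth at a time, deepest first, with a vectorized update. The names `heap_parents`, `depths_from_parents`, and `push_up` are hypothetical, and the subtree sum stands in for whatever update the real algorithm performs.

```python
import numpy as np


def heap_parents(n):
    """Ancestor array for a complete binary tree with heap labels 1..n.

    Nodes are stored 0-indexed, so the node with heap label i sits at index i - 1
    and its ancestor, floor(i / 2), sits at index floor(i / 2) - 1. The root gets -1.
    """
    return np.arange(1, n + 1) // 2 - 1


def depths_from_parents(parent):
    """Depth of every node, given parent[i] = ancestor of node i (root: -1)."""
    parent = np.asarray(parent)
    depth = np.full(parent.shape[0], -1, dtype=int)
    for i in range(parent.shape[0]):
        # Walk upward until we hit the root or a node whose depth is already known.
        path = []
        j = i
        while j != -1 and depth[j] == -1:
            path.append(j)
            j = parent[j]
        base = -1 if j == -1 else depth[j]
        # Assign depths back down the path that was just walked.
        for k, node in enumerate(reversed(path)):
            depth[node] = base + 1 + k
    return depth


def push_up(values, parent):
    """Placeholder bottom-up update: each node ends up with the sum over its subtree.

    Works for any rooted tree, not only binary ones, because it uses the
    ancestor array instead of index arithmetic.
    """
    parent = np.asarray(parent)
    total = np.array(values, dtype=float)
    depth = depths_from_parents(parent)
    for d in range(depth.max(), 0, -1):
        level = np.flatnonzero(depth == d)
        # All nodes at depth d are handled in one vectorized call; np.add.at
        # accumulates correctly when several siblings share the same ancestor.
        np.add.at(total, parent[level], total[level])
    return total


if __name__ == "__main__":
    parent = heap_parents(6)
    print(parent)
    print(push_up([0, 0, 0, 1, 2, 3], parent))
```

Because each depth level only reads values from deeper levels that are already final, the work within a level is independent, which is the property that would allow parallelizing over depths. Whether that beats the plain vectorized loop would still need to be measured.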