cellarium-ai / cellarium-ml

Distributed single-cell data analysis.
BSD 3-Clause "New" or "Revised" License

NMF - implement consensus NMF #211

Open sjfleming opened 2 months ago

sjfleming commented 2 months ago

Paper here: https://elifesciences.org/articles/43803

The idea is to re-run NMF many times, each with a different random seed.

We can do this "all at once" by giving the D, A, and B matrices a leading replicate dimension: instead of (k, k) etc., they become one big tensor of shape (replicate, k, k) etc., so that all replicates are updated at once. At the end, we have the results of as many NMF runs as there are replicates. This comes with very little computational overhead.
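A rough sketch of what the batched update could look like, written in plain NumPy to keep it self-contained. This is not cellarium-ml code: the function name `batched_online_nmf`, its parameters, and the tensor layout (D and B as (replicate, k, g), A as (replicate, k, k)) are all assumptions for illustration, and the update rules are a simplified online dictionary-learning scheme with a non-negativity clip, not necessarily what the final implementation will use.

```python
import numpy as np


def batched_online_nmf(x, k, n_replicates, n_epochs=5, batch_size=64, seed=0, eps=1e-8):
    """Hypothetical sketch: fit n_replicates NMF models at once.

    x is a non-negative (n_cells, g) array. Returns D with shape
    (n_replicates, k, g), one set of k factors per replicate.
    """
    rng = np.random.default_rng(seed)
    n, g = x.shape
    # An independent random init per replicate stands in for a different seed.
    D = rng.random((n_replicates, k, g)) + eps
    # Sufficient statistics, each with a leading replicate dimension.
    A = np.zeros((n_replicates, k, k))
    B = np.zeros((n_replicates, k, g))

    for _ in range(n_epochs):
        for start in range(0, n, batch_size):
            xb = x[start:start + batch_size]           # (b, g)
            Dt = D.transpose(0, 2, 1)                  # (r, g, k)
            num = xb @ Dt                              # broadcasts to (r, b, k)
            DDt = D @ Dt                               # (r, k, k)

            # Loadings for this minibatch, D held fixed: a few
            # multiplicative updates, done for all replicates at once.
            alpha = rng.random((n_replicates, xb.shape[0], k)) + eps
            for _ in range(20):
                alpha *= num / (alpha @ DDt + eps)

            # Accumulate statistics for every replicate in one shot.
            alpha_t = alpha.transpose(0, 2, 1)         # (r, k, b)
            A += alpha_t @ alpha                       # (r, k, k)
            B += alpha_t @ xb                          # (r, k, g)

            # Block coordinate descent on each factor j, vectorized over replicates.
            for j in range(k):
                resid = B[:, j] - np.einsum("rl,rlg->rg", A[:, j], D)
                u = D[:, j] + resid / (A[:, j, j][:, None] + eps)
                u = np.maximum(u, 0.0)                 # keep factors non-negative
                norm = np.linalg.norm(u, axis=1, keepdims=True)
                D[:, j] = u / np.maximum(norm, 1.0)    # project onto the unit ball
    return D
```

The only Python loops are over minibatches and over the k factors; the replicate dimension is handled entirely by broadcasting, which is where the low overhead would come from. Calling `batched_online_nmf(x, k=10, n_replicates=20)` would return a (20, 10, g) tensor of factors to feed into the consensus step.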

Then we can figure out what happens after that and how to implement the clustering part.