Closed Marius1311 closed 3 years ago
Are we really only concerned with the naming? General directions: (i) we can add "multi omics", but moscot(t) may be too similar to scot and muscat; (ii) adding GW.
This issue is about naming, yes. Re i, I agree, re ii, how would you add that?
Going through their preprint, I think they have a very nice summary of OT and GW-OT, potentially interesting for @michalk8 to get a quick overview. However, I think their actual code is a bit messy.
Methodologically, they include unbalanced GW-OT, which seems easy: they just change the Sinkhorn solver they use in the loop. Using unbalanced OT is important when you expect to see very different group (cluster) distributions across samples, see e.g. Fig. 10.1 in https://arxiv.org/abs/1803.00567
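To make the "just change the Sinkhorn solver" point concrete, here is a minimal NumPy sketch of entropic Sinkhorn with an optional KL relaxation of the marginals. It is not taken from SCOT, POT or OTT; the function name `sinkhorn` and the parameters `eps`, `rho` and `n_iter` are illustrative assumptions. The only difference between the balanced and unbalanced updates is the exponent `tau = rho / (rho + eps)`.

```python
import numpy as np


def sinkhorn(cost, a, b, eps=0.1, rho=None, n_iter=500):
    """Entropic OT plan between histograms a and b (hypothetical helper).

    rho=None -> balanced Sinkhorn (marginals enforced).
    rho>0    -> unbalanced Sinkhorn (marginals only penalized via KL).
    """
    K = np.exp(-cost / eps)  # Gibbs kernel
    # The unbalanced variant only damps the scaling updates by tau.
    tau = 1.0 if rho is None else rho / (rho + eps)
    u = np.ones_like(a)
    v = np.ones_like(b)
    for _ in range(n_iter):
        u = (a / (K @ v)) ** tau
        v = (b / (K.T @ u)) ** tau
    return u[:, None] * K * v[None, :]


# Toy example: two point clouds whose total masses differ (1 vs 2).
rng = np.random.default_rng(0)
x, y = rng.random((5, 2)), rng.random((7, 2))
cost = ((x[:, None, :] - y[None, :, :]) ** 2).sum(-1)
a = np.full(5, 1.0 / 5)
b = np.full(7, 2.0 / 7)
plan = sinkhorn(cost, a, b, eps=0.1, rho=1.0)
```

In a GW loop, this solver would be called once per outer iteration on the current linearized cost, so swapping balanced for unbalanced does not change the outer structure.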
Also, they make an argument about the scalability of their algorithm, see Fig. 1 below (taken from their preprint). It doesn't seem to scale particularly well to me: it takes ~half an hour on 5k cells (per metric space -> time point).
Their preprint doesn't actually consider the unbalanced case; they added this in a recent release (see Alg. 1 below from their preprint). The algorithm is very basic GW-OT with a trick to rewrite the 4th-order distance tensor for L2 distances from http://proceedings.mlr.press/v48/peyre16.pdf, which novoSpaRc uses as well, I think.
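For reference, a small NumPy sketch of that trick for the squared loss, checked against the naive 4th-order tensor contraction. This is written from the decomposition in the Peyré et al. paper, not copied from SCOT or novoSpaRc; the function names are made up for illustration. Expanding (C1[i,k] - C2[j,l])^2 and summing against T[k,l] leaves two marginal terms plus one matrix product, so the n x m x n x m tensor is never formed.

```python
import numpy as np


def gw_product_naive(C1, C2, T):
    """sum_{k,l} (C1[i,k] - C2[j,l])**2 * T[k,l], via the full 4th-order tensor."""
    diff = C1[:, None, :, None] - C2[None, :, None, :]  # shape (n, m, n, m)
    return np.einsum("ijkl,kl->ij", diff ** 2, T)


def gw_product_factored(C1, C2, T):
    """Same quantity using only matrix products."""
    p = T.sum(axis=1)  # row marginal of T, shape (n,)
    q = T.sum(axis=0)  # column marginal of T, shape (m,)
    const = (C1 ** 2 @ p)[:, None] + (C2 ** 2 @ q)[None, :]
    return const - 2.0 * C1 @ T @ C2.T


# Toy check on random symmetric "distance" matrices and a random coupling.
rng = np.random.default_rng(0)
n, m = 6, 4
C1 = rng.random((n, n)); C1 = C1 + C1.T
C2 = rng.random((m, m)); C2 = C2 + C2.T
T = rng.random((n, m)); T /= T.sum()
assert np.allclose(gw_product_naive(C1, C2, T), gw_product_factored(C1, C2, T))
```

The factored version costs O(n^2 m + n m^2) instead of O(n^2 m^2), which is what makes the inner loop feasible for a few thousand cells.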
Yes, novoSpaRc uses it as well, and it indeed looks pretty simple. (ii) -> the problem is that 'GW' is a tough one :( .. thinking ..
Actually, I think it would be good to treat this package similar to scVI-tools - it's the framework which defines the basic class structure, how we interact with AnnData objects and OTT (the backend). So we should give one name to the framework (something like XXX-tools) and then name individual models i.e. the lineage tracing model which we're working on etc.
So the framework shouldn't have GW in the name, our specific model can (but doesn't need to).
So the game is with: single cell (sc), multi omics (mo), optimal transport (ot) and ..?
tools, framework, toolkit, python
@zoepiran suggested moscot (= multi-omic single-cell optimal transport tools), which is a glasses brand, and I really like it! We could then use the glasses in our logo, I'm thinking of two piles of cells, one on each side of the glasses. What do you think @michalk8 ?
Like moscot the best, here's what I came up with:
- mascot - multimodal single-cell optimal transport (variant of moscot)
- moot - multi-omics optimal transport (mb too close to mooc)
- sicerot (the "t" is silent, pronounced as "cicero") - single-cell entropy-regularized optimal transport (mb too tricky)
- scrot - single-cell regularized optimal transport (although this is the name of a Unix utility)
- muscovatto - multi-omics single-cell optimal transport tools (play on muscovado, can't match "v", not obvious it's an ott framework)

Thanks! BTW, best way to search for already existing tools is https://www.scrna-tools.org/table
I like moscot the best because I can imagine a logo related to it
closed via #5
@michalk8 caught this: https://rsinghlab.github.io/SCOT/
It's called SCOT and it does Gromov-Wasserstein for data integration. We need to change the name of this package (let's discuss here) and see what we can take from their implementation. Theirs is POT based.