saezlab / liana-py

LIANA+: an all-in-one framework for cell-cell communication
http://liana-py.readthedocs.io/
GNU General Public License v3.0

Spatially informed bivariate metrics in tensor cell2cell and MOFA #123

Open Zach-Sten opened 1 week ago

Zach-Sten commented 1 week ago

Hi all,

I've been testing the use of cell2cell and MOFA within Liana and was wondering if there was a way to use the ligand-receptor information from the spatial bivariate vignette instead of rank_aggregate? Also, I noticed that there isn't an option to use the .by_sample for li.mt.bivariate.

dbdimitrov commented 1 week ago

Hi @Zach-Sten,

I've been testing the use of cell2cell and MOFA within Liana and was wondering if there was a way to use the ligand-receptor information from the spatial bivariate vignette instead of rank_aggregate?

Not sure I understand the question. The bivariate scores result in an M x D (obs x interactions) matrix, so I'm not sure what your 3rd/4th dimensions would be for MOFA/Tensor-cell2cell. Since it's just 2D, you can easily use it with NMF, archetypal analysis, etc. Maybe you could clarify?

Also, I noticed that there isn't an option to use the .by_sample for li.mt.bivariate.

Good point. TBH, I'm not sure how common it is to concatenate spatial AnnDatas, mainly due to challenges with .obsm / .obsp (i.e. where we define the spatial weights).

From a quick read, it seems like the easiest/suggested way (by Squidpy/AnnData devs) is to calculate spatial proximities for each slide and then concatenate all of them, filling the graph/proximities with 0 across slides (https://github.com/scverse/squidpy/issues/318). This makes sense, but it would again require a for loop to calculate those, and if one is already doing that, then it takes just one additional line to also calculate the bivariate scores. So, I'm not sure that such a utility function on the LIANA+ side is currently worthwhile. It's a good point though, and I need to think a bit more about it. If you have any tools in mind that provide such a utility, I'd be happy to take a look.
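To make the "fill with 0 across slides" idea concrete, here is a minimal NumPy sketch. It is not LIANA+ or Squidpy code: `gaussian_weights` and `block_diagonal` are hypothetical helpers, the Gaussian kernel is only a stand-in for whatever per-slide proximities one actually computes, and the coordinates are made up.

```python
import numpy as np


def gaussian_weights(coords, bandwidth):
    """Dense Gaussian-kernel proximities between all spots of ONE slide (illustrative)."""
    diff = coords[:, None, :] - coords[None, :, :]
    dist2 = (diff ** 2).sum(axis=-1)
    return np.exp(-dist2 / (2.0 * bandwidth ** 2))


def block_diagonal(blocks):
    """Place per-slide matrices on the diagonal; cross-slide entries stay 0."""
    n = sum(b.shape[0] for b in blocks)
    out = np.zeros((n, n))
    start = 0
    for b in blocks:
        stop = start + b.shape[0]
        out[start:stop, start:stop] = b
        start = stop
    return out


# Two toy "slides" with 3 and 2 spots (made-up coordinates).
slides = {
    "slide_a": np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]),
    "slide_b": np.array([[0.0, 0.0], [2.0, 0.0]]),
}

# The per-slide loop: proximities are computed within each slide only.
per_slide = [gaussian_weights(xy, bandwidth=1.0) for xy in slides.values()]

# Combined 5 x 5 matrix in which spots from different slides are never neighbours.
combined = block_diagonal(per_slide)
```

For real data the proximities would normally be sparse, in which case `scipy.sparse.block_diag` does the same stacking without materialising the zeros.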

Zach-Sten commented 6 days ago

Thanks for getting back to me so soon. We have used NMF for getting our factors but were interested in whether other methods such as Tensor-cell2cell and MOFA would give different results.

As for the spatial proximities, we scan tissue microarrays where each core is treated as an individual sample. Therefore, instead of using squidpy's suggested method we separate out each core by clustering the spatial coordinates (since there is empty space between the cores they separate out nicely) and run the spatial proximities and bivariate scores in a loop as you suggest. I haven't come across any tools that do this automatically (especially separating out each core from a slide).
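For anyone wanting to reproduce the core-splitting step, a rough standard-library sketch of the idea follows. It is not our actual pipeline: `split_cores` and `max_gap` are hypothetical names, and the approach (connected components over a distance threshold) assumes the empty space between cores is wider than `max_gap`, while spots within a core are closer than that.

```python
from math import dist


def split_cores(coords, max_gap):
    """Give the same label to points linked by a chain of hops of length <= max_gap.

    Single-linkage / connected components via union-find. The pairwise loop
    is O(n^2), so this is for illustration only.
    """
    parent = list(range(len(coords)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            if dist(coords[i], coords[j]) <= max_gap:
                parent[find(i)] = find(j)

    # Relabel roots as 0, 1, 2, ... in order of first appearance.
    roots = {}
    return [roots.setdefault(find(i), len(roots)) for i in range(len(coords))]


# Made-up coordinates: two well-separated "cores" on one slide.
coords = [(0, 0), (1, 0), (0, 1), (50, 50), (51, 50)]
labels = split_cores(coords, max_gap=2)  # -> [0, 0, 0, 1, 1]

# Group spot indices per core, ready for a per-core loop.
cores = {}
for idx, lab in enumerate(labels):
    cores.setdefault(lab, []).append(idx)
```

On real slides with many spots, a density-based clusterer such as scikit-learn's `DBSCAN` is a more practical way to get the same kind of labels.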

dbdimitrov commented 6 days ago

Hi @Zach-Sten,

We have used NMF for getting our factors but were interested in whether other methods such as Tensor-cell2cell and MOFA would give different results.

Tensor-cell2cell uses a dimensionality reduction which is very similar to NMF, except that it is extended to 4 dimensions (source cell, target cell, ligand-receptor interaction, and sample). MOFA+ is, in short, an extension of standard factor analysis to multiple views, plus some regularization. So, for the scores you used with NMF, unless you have some additional dimensions (besides observations and interactions), you can try other standard matrix decomposition approaches meant to work in 2D, e.g. factor analysis, PCA, or archetypal analysis (https://github.com/rockdeme/chrysalis).
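As a rough illustration of what a plain 2D factorization of an obs x interactions matrix looks like, here is a small NumPy sketch of NMF with the classic multiplicative updates. It is not part of LIANA+: the `nmf` helper and the toy matrix are made up, and it assumes the scores are non-negative, which is not true for every local metric.

```python
import numpy as np


def nmf(X, n_factors, n_iter=200, seed=0, eps=1e-9):
    """Approximate non-negative X (obs x interactions) as W @ H.

    Uses multiplicative updates for the Frobenius loss (illustrative sketch).
    """
    rng = np.random.default_rng(seed)
    n_obs, n_feat = X.shape
    W = rng.random((n_obs, n_factors)) + eps
    H = rng.random((n_factors, n_feat)) + eps
    for _ in range(n_iter):
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H


# Toy stand-in for an obs x interactions score matrix with 2 underlying factors.
rng = np.random.default_rng(42)
X = rng.random((30, 2)) @ rng.random((2, 8))

W, H = nmf(X, n_factors=2)  # W: obs x factors, H: factors x interactions
rel_err = np.linalg.norm(X - W @ H) / np.linalg.norm(X)
```

Here `W` holds the factor scores per observation and `H` the interaction loadings per factor; PCA or factor analysis on the same matrix would return analogous scores and loadings without the non-negativity constraint.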

As for the spatial proximities, we scan tissue microarrays where each core is treated as an individual sample. Therefore, instead of using squidpy's suggested method we separate out each core by clustering the spatial coordinates (since there is empty space between the cores they separate out nicely) and run the spatial proximities and bivariate scores in a loop as you suggest. I haven't come across any tools that do this automatically (especially separating out each core from a slide).

I see. Yeah, then I would assume NMF or any other standard factorization should also be able to capture the clusters.

Hope this helps.