theislab / ncem

Learning cell communication from spatial graphs of cells
https://ncem.readthedocs.io/
BSD 3-Clause "New" or "Revised" License

Input to the "interaction linear model" #142

Closed qsl734 closed 1 year ago

qsl734 commented 1 year ago

Thank you for making ncem's code public. My question is about the input to the linear model with interactions. In the paper, the input xl is a cells × unique cell types matrix over the real numbers. But in the tutorial data (MERFISH brain) I found that xl is one-hot encoded, which is not real-valued. I am wondering why xl is real-valued in the paper but not in the tutorial.

AnnaChristina commented 1 year ago

We formulated the methods section of our manuscript in the most general way. The interactions can indeed be $X^I \in \mathbb{R}^{N \times L}$, but let me be more precise:

  1. Generally, it holds that $\mathbb{N} \subset \mathbb{R}$, so a one-hot encoded matrix is still a valid real-valued matrix. In the ncem manuscript we tried to be as flexible as possible: in theory, the framework also allows replacing discrete cell types with cell states, transitions between cell types, or proportions/abundances of cell types, e.g. in Visium spots. In this case, it clearly holds that $X^L \in \mathbb{R}^{N \times L}$.

  2. For an un-scaled (binary) adjacency matrix one has $A \in \{0,1\}^{N \times N}$ and $X^L \in \{0,1\}^{N \times L}$, which results in an interaction matrix of the form $X^I \in \mathbb{N}^{N \times L}$, and again $\mathbb{N} \subset \mathbb{R}$. But ncem also offers the option to use a scaled adjacency matrix of the form $A \in \mathbb{R}^{N \times N}$. In this case, one obtains an interaction matrix of the form $X^I \in \mathbb{R}^{N \times L}$.
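To make the two cases concrete, here is a minimal NumPy sketch. It is not ncem's implementation: it assumes, purely for illustration, that the neighborhood matrix is computed as the product $A X^L$, and the toy adjacency, the cell-type labels, and all variable names are made up.

```python
import numpy as np

# Toy data (hypothetical): 4 cells, 3 cell types.
cell_types = np.array([0, 1, 1, 2])
n_types = 3

# One-hot cell-type matrix X^L with entries in {0, 1}, shape (N, L).
x_l = np.eye(n_types, dtype=int)[cell_types]

# Un-scaled binary adjacency A with entries in {0, 1}, shape (N, N),
# symmetric and without self-loops.
a = np.array([
    [0, 1, 1, 0],
    [1, 0, 1, 1],
    [1, 1, 0, 0],
    [0, 1, 0, 0],
])

# Case 1: binary A times one-hot X^L gives neighbor-type counts,
# i.e. a matrix of natural numbers.
x_i_counts = a @ x_l

# Case 2: a scaled adjacency (here row-normalized by node degree, one
# possible scaling chosen for this sketch) gives real-valued entries.
a_scaled = a / a.sum(axis=1, keepdims=True)
x_i_scaled = a_scaled @ x_l

print(x_i_counts)
print(x_i_scaled)
```

With the binary adjacency, each row holds the number of neighbors of each cell type; with the row-normalized adjacency, each row holds neighbor-type fractions that sum to one. Both are valid elements of $\mathbb{R}^{N \times L}$.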

I hope this helps to clarify the underlying concept and why the methods in the manuscript are phrased more generally.

AnnaChristina commented 1 year ago

I am closing this issue, but feel free to open it again in case of further questions on the underlying methods.