scverse / squidpy

Spatial Single Cell Analysis in Python
https://squidpy.readthedocs.io/en/stable/
BSD 3-Clause "New" or "Revised" License

Nhood composition comparison across conditions? #483

Open annamath opened 2 years ago

annamath commented 2 years ago

Hello!

I work with CODEX data across different conditions, and I was wondering whether there is a way to visualise/summarise the differences in neighbourhood composition across those conditions after running squidpy.gr.nhood_enrichment()?

Thanks!

giovp commented 2 years ago

hi @annamath , thanks for your interest in squidpy!

this is a very interesting question and we should provide at least an example of how that could be done. I'll keep this open as a reminder and hopefully we'll be able to address it at some point.

thank you again!

annamath commented 2 years ago

Hey @giovp , thanks for your answer!

My main concern is that across samples the range of the z score is not comparable. Quick update from my side, I tried 2 different approaches.

First, I extracted the raw (?) counts from the neighborhood analysis using the code below:

array = _get_data(obj, cluster_key='cell_type', func_name="nhood_enrichment")['count']
ad = AnnData(X=array, obs={str: pd.Categorical(obj.obs["cell_type"].cat.categories)})

And then, instead of using the zscore, I computed a ratio: each neighborhood count divided by the maximum neighborhood value of each sample (which happens to be the tumour cells nhood in all replicates), multiplied by 100. This is the output (x axis: the different nhoods; y axis: the log2 value of the ratio; the different colors are the different conditions).

[Screenshot 2022-03-24 at 16 33 43]
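A minimal sketch of the ratio described above, for anyone who wants to reproduce it. The toy count matrix and the helper name `count_ratio_log2` are hypothetical; in practice the counts would presumably come from the `count` entry that `nhood_enrichment` stores for the chosen cluster key. Zero counts are masked as NaN here because log2(0) is undefined, which is a choice of this sketch and not something stated in the thread.

```python
import numpy as np


def count_ratio_log2(counts):
    """Express each neighborhood count as a percentage of the sample's
    largest count, then take log2 (hypothetical helper)."""
    counts = np.asarray(counts, dtype=float)
    ratio = 100.0 * counts / counts.max()
    # log2(0) is -inf, so silence the warning and mask zero counts as NaN.
    with np.errstate(divide="ignore"):
        out = np.log2(ratio)
    out[counts == 0] = np.nan
    return out


# Toy 2x2 count matrix for a single sample (made-up numbers).
sample_counts = np.array([[400.0, 50.0],
                          [50.0, 100.0]])
log_ratio = count_ratio_log2(sample_counts)
```

Note that the largest entry always maps to log2(100), so this scaling anchors every sample to its own maximum and does not use the permutation null at all.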

On the other hand, I simply extracted the z scores instead and plotted them the same way (n11 is the neighborhood zscore of the tumour cells amongst themselves, it is missing from above since it would be 100).

[Screenshot 2022-03-24 at 16 33 32]

I have a feeling that the first approach might be the way to go, but I wanted to hear your thoughts :) The problem is that I don't see the same trends in the two approaches.

Thanks again!

giovp commented 2 years ago

hi @annamath sorry for the late reply.

My main concern is that across samples the range of the z score is not comparable. Quick update from my side, I tried 2 different approaches.

yeah I believe it makes sense, since the graph and thus the permutations will be different. I think one approach to look at them is to rank the results and then aggregate the ranks. Would need to think more about this though.
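One way the rank-then-aggregate idea could look, as a rough sketch only. It assumes every sample yields a z-score matrix over the same cell types in the same order; the helper names and the toy matrices are hypothetical, and ties are broken by position instead of being averaged.

```python
import numpy as np


def rank_matrix(z):
    """Rank all entries of one sample's z-score matrix (1 = lowest).
    Ties are broken by position, which is a simplification."""
    z = np.asarray(z, dtype=float)
    flat = z.ravel()
    order = np.argsort(flat, kind="stable")
    ranks = np.empty(flat.size, dtype=float)
    ranks[order] = np.arange(1, flat.size + 1)
    return ranks.reshape(z.shape)


def aggregate_ranks(z_by_sample):
    """Average the per-sample rank matrices (e.g. within one condition)."""
    return np.mean([rank_matrix(z) for z in z_by_sample], axis=0)


# Two toy samples whose z-scores live on very different scales.
sample_a = np.array([[10.0, -2.0], [-2.5, 4.0]])
sample_b = np.array([[80.0, -30.0], [-10.0, 15.0]])
mean_ranks = aggregate_ranks([sample_a, sample_b])
```

Because only the ordering within each sample is used, the different z-score ranges of the two samples no longer matter; the cost is that the magnitude of the enrichment is discarded.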

array = _get_data(obj, cluster_key='cell_type', func_name="nhood_enrichment")['count']
ad = AnnData(X=array, obs={str: pd.Categorical(obj.obs["cell_type"].cat.categories)})

sorry, what's the purpose of this code? here it seems like you are extracting the counts (n_interactions) of the nhood enrichment method and assigning them to an AnnData object.

And then, instead of using the zscore, I computed a ratio: each neighborhood count divided by the maximum neighborhood value of each sample (which happens to be the tumour cells nhood in all replicates), multiplied by 100. This is the output (x axis: the different nhoods; y axis: the log2 value of the ratio; the different colors are the different conditions).

I think just using the counts is not enough, as you want to test against permutation, so a z-score is a better score for this purpose.
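To make the "test against permutation" point concrete, here is a toy sketch of a permutation z-score on a small graph. It illustrates the general idea only and is not squidpy's implementation; the function names, the chain-shaped graph, and the number of permutations are all assumptions made for the example.

```python
import numpy as np


def pair_counts(adj, labels, n_clusters):
    """Count edges between every pair of clusters, given a symmetric
    0/1 adjacency matrix and integer cluster labels."""
    onehot = np.eye(n_clusters)[labels]       # shape (n_cells, n_clusters)
    return onehot.T @ adj @ onehot            # shape (n_clusters, n_clusters)


def permutation_zscore(adj, labels, n_clusters, n_perms=200, seed=0):
    """Compare observed pair counts with counts under shuffled labels."""
    rng = np.random.default_rng(seed)
    observed = pair_counts(adj, labels, n_clusters)
    perms = np.stack([
        pair_counts(adj, rng.permutation(labels), n_clusters)
        for _ in range(n_perms)
    ])
    return (observed - perms.mean(axis=0)) / perms.std(axis=0)


# Toy graph: 20 cells in a chain, first half cluster 0, second half cluster 1.
n_cells = 20
adj = np.zeros((n_cells, n_cells))
idx = np.arange(n_cells - 1)
adj[idx, idx + 1] = 1
adj[idx + 1, idx] = 1
labels = np.repeat([0, 1], n_cells // 2)

z = permutation_zscore(adj, labels, n_clusters=2)
```

In this toy case the within-cluster entries come out positive and the between-cluster entries negative. The null mean and standard deviation both depend on the graph and on the cluster sizes, which is one reason raw z-scores from different samples do not share a scale.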

On the other hand, I simply extracted the z scores instead and plotted them the same way (n11 is the neighborhood zscore of the tumour cells amongst themselves, it is missing from above since it would be 100).

I think looking at the z-score is better imho, as you have a consistent way to compare against the null distribution (based on permutations). I understand the values are not the same, but you could again either rank them or min-max scale them and they'd be on the same axis.

let me know what you think, curious to hear what you ended up doing. I'll think about this a bit more as well since it would be a useful modality to test. Just for me to understand, what are the different colors here?

YuanningEric commented 2 years ago

Hi @annamath, were you able to find a way to compare the z scores across conditions? I am also thinking about a similar question. Please let me know if you have any solutions.

annamath commented 2 years ago

Hey @YuanningEric,

for now what I did is a MinMax scaling of the nhood zscore per sample. I look at lymph node, where B cells should always have the highest neighborhood score (and they do), so 1 is always my reference point, let's say. It works well so far, but I am not sure whether this approach is generalisable.
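A minimal sketch of that per-sample MinMax scaling, assuming one z-score matrix per sample. The helper name, the sample names, and the numbers are made up for illustration.

```python
import numpy as np


def minmax_scale(z):
    """Rescale one sample's z-score matrix to the [0, 1] range."""
    z = np.asarray(z, dtype=float)
    lo, hi = np.nanmin(z), np.nanmax(z)
    if hi == lo:
        # Degenerate case: all values are identical, nothing to scale.
        return np.zeros_like(z)
    return (z - lo) / (hi - lo)


# Hypothetical z-score matrices, one per sample.
samples = {
    "sample_1": np.array([[10.0, -2.0], [-2.5, 4.0]]),
    "sample_2": np.array([[80.0, -30.0], [-10.0, 15.0]]),
}
scaled = {name: minmax_scale(z) for name, z in samples.items()}
```

After scaling, the strongest pair in each sample sits at 1 and the weakest at 0, so the result is only comparable across samples if the same pair is expected to be the maximum everywhere, as with the B cells in this case.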