Teichlab / cellphonedb

MIT License
Some details about the CellphoneDB #17

Closed. fenghuijian closed this issue 4 years ago

fenghuijian commented 5 years ago

Hi all, I have used CellPhoneDB and I think it is a great tool! I have some questions about it:

  1. How do you use the permutation test to identify significant intercellular interactions?
  2. In pvalue.txt, why is the interaction clusterA_clusterB different from the interaction clusterB_clusterA?
  3. In the interacting_pair column of pvalue.txt, how can I tell which partner is the receptor and which is the ligand?
  4. Could you build a database of intercellular interactions for mouse?

Thanks!

mief commented 5 years ago

Hi, Thanks a lot for your interest in our method.

  1. For each interaction pair and each pairwise comparison between two cell types, we permute the cluster labels of all cells 1000 times. Each time we calculate the mean of the two cluster means, mean(mean(molecule 1 in cluster X), mean(molecule 2 in cluster Y)). This gives us a null distribution of that mean under random cluster assignment. We then check where our observed mean falls on this null distribution.
  2. The p-value is different because, for an interaction M1-M2, the clusterA_clusterB comparison checks the mean expression of M1 in clusterA and the mean expression of M2 in clusterB, whereas the clusterB_clusterA comparison does the opposite: it checks the mean expression of M1 in clusterB and of M2 in clusterA.
  3. We didn't mark the interaction pairs as receptor and ligand because, in addition to interactions of secreted ligands with membrane receptors, we also have interactions between two membrane receptors. However, in the next release we will make it more consistent, at least for the secreted ones, and always put the ligand first.
  4. For now we use ortholog genes for our mouse datasets; perhaps in the future we will expand CellPhoneDB to mouse data as well.

Best, Mirjana
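
For readers who want to see points 1 and 2 above in code, here is a minimal sketch of the permutation idea. It is not CellPhoneDB's actual implementation: the function name `permutation_pvalue`, its parameters, the one-sided comparison with `>=`, and the toy data are all assumptions made for illustration.

```python
import numpy as np


def permutation_pvalue(expr_m1, expr_m2, labels, cluster_x, cluster_y,
                       n_permutations=1000, seed=0):
    """Hypothetical helper: one-sided permutation p-value for the pair
    (molecule 1 in cluster_x, molecule 2 in cluster_y)."""
    rng = np.random.default_rng(seed)
    expr_m1 = np.asarray(expr_m1, dtype=float)
    expr_m2 = np.asarray(expr_m2, dtype=float)
    labels = np.asarray(labels)

    def pair_mean(lab):
        # mean(mean(molecule 1 in cluster X), mean(molecule 2 in cluster Y))
        return 0.5 * (expr_m1[lab == cluster_x].mean()
                      + expr_m2[lab == cluster_y].mean())

    observed = pair_mean(labels)
    # Shuffling the labels keeps cluster sizes fixed but randomizes membership.
    null = np.array([pair_mean(rng.permutation(labels))
                     for _ in range(n_permutations)])
    # Assumed convention: fraction of shuffled means >= the observed mean.
    p_value = float((null >= observed).mean())
    return observed, p_value


# Toy data (made up): M1 is high in cluster A, M2 is high in cluster B.
toy_rng = np.random.default_rng(1)
labels = np.array(["A"] * 50 + ["B"] * 50)
m1 = np.where(labels == "A", 5.0, 0.0) + toy_rng.random(100)
m2 = np.where(labels == "B", 5.0, 0.0) + toy_rng.random(100)

obs_ab, p_ab = permutation_pvalue(m1, m2, labels, "A", "B")  # clusterA_clusterB
obs_ba, p_ba = permutation_pvalue(m1, m2, labels, "B", "A")  # clusterB_clusterA
print(p_ab, p_ba)
```

In this toy case the A_B comparison looks enriched while B_A does not, which illustrates why the two directions can give different p-values for the same M1-M2 pair.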

fenghuijian commented 5 years ago

Hi, Thank you very much for your reply!

I now understand the basics of CellPhoneDB. I have another question. We used 10x Genomics single-cell RNA data with dimensions genes: 15382, cells: 22825. When we uploaded the data to CellPhoneDB, it filtered out a lot of genes (15382 genes -> ~1753 genes). Adjusting the "Ratio of cells in a cluster expressing a gene" value makes no difference. How can we solve this problem? Thanks!

gmstanle commented 5 years ago

Are these permutations done on all cells, to create a background distribution of even expression in all cells? In other words, the significant p-value implies that both genes are coexpressed more than average, right? So an interaction with a nonsignificant p-value could still exist, but it is just not an "enriched" interaction?

mief commented 5 years ago

Yes, exactly, the permutations are done on all cells, to get the "enriched" interactions. An interaction with a nonsignificant p-value could still exist, but we are only checking for interactions that are significantly specific between the two cell types. The idea is to infer a potential function of distinct subtypes by looking into their enriched interactions.

aditisk commented 5 years ago

Hello everyone, firstly thank you to the authors for developing and sharing this method.

I am wondering if anyone has suggestions on the easiest way to reproduce the dot plot in fig 5b? I really love that format and would like to use it for my dataset.

Thanks.

mvento commented 4 years ago

Hi @aditisk,

Sorry, we only support dot_plot and heatmap plots: https://github.com/Teichlab/cellphonedb#plotting-statistical-method-results

Best

rsggsr commented 4 years ago

Hi @mief,

This thread matches my question, so I am posting here. CellPhoneDB has done an awesome job for our ongoing human disease project. We would now like to use it on mouse samples, but it does not seem to provide a mouse database.

You mentioned above that using mouse orthologs works pretty well (also in Issue #18). Does that just mean replacing the gene names in counts.txt? Could you describe in more detail how to do it? Much appreciated!

mief commented 4 years ago

Hi, Yes, what we have done with mouse data is to find the orthologs and replace the mouse genes with their human orthologs. We used BioMart to obtain the orthologs. You can either use one-to-one orthologs and remove the remaining genes that have no ortholog or are duplicated, or use one-to-many orthologs and pick one of the duplicated ones (for example the one with max expression). Best, Mirjana
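
The gene replacement described above could be sketched roughly as follows with pandas. This is only an illustration under stated assumptions: the ortholog table is assumed to be already exported (for example from BioMart) with hypothetical column names `mouse_gene` and `human_gene`, the helper `to_human_orthologs` is not part of CellPhoneDB, "max expression" is interpreted here as the highest total count across cells, and `MouseA`, `MouseB` and `HUMANX` are made-up labels.

```python
import pandas as pd


def to_human_orthologs(counts, orthologs, one_to_one_only=True):
    """Hypothetical helper. counts: genes x cells table indexed by unique
    mouse gene names. orthologs: columns 'mouse_gene' and 'human_gene'."""
    pairs = orthologs[["mouse_gene", "human_gene"]].dropna().drop_duplicates()
    if one_to_one_only:
        # Drop every gene that appears in more than one ortholog pair.
        unique_mouse = ~pairs["mouse_gene"].duplicated(keep=False)
        unique_human = ~pairs["human_gene"].duplicated(keep=False)
        pairs = pairs[unique_mouse & unique_human]
    # Genes without an ortholog (or absent from the counts) are removed.
    pairs = pairs[pairs["mouse_gene"].isin(counts.index)]

    mapped = counts.loc[pairs["mouse_gene"].to_numpy()]
    mapped.index = pairs["human_gene"].to_numpy()

    # If several rows now share a human gene, keep the one with the highest
    # total expression (an assumed reading of "max expression").
    mapped = mapped.assign(_total=mapped.sum(axis=1).to_numpy())
    mapped = mapped.sort_values("_total", ascending=False, kind="stable")
    mapped = mapped[~mapped.index.duplicated(keep="first")]
    return mapped.drop(columns="_total").sort_index()


# Toy input: MouseA, MouseB and HUMANX are invented placeholder names.
counts = pd.DataFrame(
    {"cell1": [1, 0, 5, 1], "cell2": [2, 3, 5, 0]},
    index=["Cd4", "Kitl", "MouseA", "MouseB"],
)
orthologs = pd.DataFrame({
    "mouse_gene": ["Cd4", "Kitl", "MouseA", "MouseB"],
    "human_gene": ["CD4", "KITLG", "HUMANX", "HUMANX"],
})

strict = to_human_orthologs(counts, orthologs, one_to_one_only=True)
relaxed = to_human_orthologs(counts, orthologs, one_to_one_only=False)
print(strict)
print(relaxed)
```

The strict mode drops both mouse genes that share a human ortholog, while the relaxed mode keeps the more highly expressed one. Either result would still need to be written out in whatever counts format your CellPhoneDB version expects.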