Teichlab / cellphonedb

MIT License

Percentage of Expression #244

Open angelussong opened 3 years ago

angelussong commented 3 years ago

Hi,

Thanks for making this wonderful package! However, I have a question regarding the interpretation of the results.

So I followed the instructions and made a dot plot showing all the significant interactions between the two cell populations that I care about. But I am not sure if the significance is biologically meaningful. For example, I have a tumor population and a CD4 T-cell population, and one of the significant interactions is PVR_TIGIT, which, based on my understanding, indicates an exhausted niche in the T-cell population. However, in my dataset, only 10% of the tumor cells are PVR+ and about 15%-20% of the T cells are TIGIT+. I am not entirely sure how generalizable this interaction is for my dataset, as only a low percentage of cells express these two gene markers.

I'm wondering if you could enlighten me on the best way of interpreting this, and whether this low-percentage but significant interaction could be interpreted as biologically relevant.

Thanks very much for your time and your help!

Pedramto89 commented 3 years ago

I have, to some extent, the same issue. My concern goes back to the legends. What exactly do those legends refer to? For example, does the log2 mean correspond to up- and downregulation, and what levels of significance do -log10 values of 0-3 correspond to? Thank you

luzgaral commented 3 years ago

Hi both,

We're glad you find CellphoneDB useful.

There are two measurements: the % of cells and the degree/average of expression. CellphoneDB p-values (and the significance) refer to the degree of expression, not the % of cells. Note that single-cell transcriptomics data is sparse and characterized by drop-outs. To achieve a good balance between false negatives and false positives, we require that at least 10% of cells express all the interactors. From a biological point of view, having 10% of cells significantly overexpressing a ligand should be meaningful. You can modify this % threshold to be more stringent.
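
As a rough illustration of the % filter described above (this is a sketch, not CellphoneDB's actual code), the check can be thought of as follows. The function names, the toy counts, and the assumption that a cell "expresses" a gene when its count is greater than zero are all hypothetical and only meant to show the idea.

```python
# Sketch only: hypothetical helper names, not part of CellphoneDB.
# Assumption: a cell "expresses" a gene if its count is greater than zero.

def fraction_expressing(counts):
    """Fraction of cells in one cluster with a non-zero count for one gene."""
    return sum(1 for c in counts if c > 0) / len(counts)


def passes_threshold(counts_a, counts_b, threshold=0.1):
    """True if both interactors are expressed in at least `threshold` of cells."""
    return (fraction_expressing(counts_a) >= threshold
            and fraction_expressing(counts_b) >= threshold)


# Toy data: 1 of 10 tumor cells is PVR+, 3 of 20 CD4 T cells are TIGIT+.
pvr_in_tumor = [0] * 9 + [3]
tigit_in_cd4 = [0] * 17 + [2, 1, 4]

print(fraction_expressing(pvr_in_tumor))
print(passes_threshold(pvr_in_tumor, tigit_in_cd4))
print(passes_threshold(pvr_in_tumor, tigit_in_cd4, threshold=0.2))
```

With the default 10% cutoff, the toy PVR_TIGIT pair is kept and then tested for significance on its expression level; with a stricter cutoff such as 20%, it would be filtered out before any p-value is computed.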

In the legends, P values are indicated by circle size; scale is shown below the plot. The means of the average expression level of interacting molecule 1 in cluster 1 and interacting molecule 2 in cluster 2 are indicated by color.
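
To make the two legend scales concrete, here is a small sketch of the arithmetic, assuming (as in the plots discussed in this thread) that color shows log2 of the interaction mean and circle size shows -log10 of the p-value. The function names and numbers are hypothetical, and this is not the plotting code itself.

```python
import math

# Sketch only: hypothetical helper names, not part of CellphoneDB.

def interaction_mean(expr_a_cluster1, expr_b_cluster2):
    """Mean of the two average expression levels (molecule 1 in cluster 1,
    molecule 2 in cluster 2)."""
    mean_a = sum(expr_a_cluster1) / len(expr_a_cluster1)
    mean_b = sum(expr_b_cluster2) / len(expr_b_cluster2)
    return (mean_a + mean_b) / 2


def legend_values(mean, p_value):
    """Return (log2 mean, -log10 p-value); both inputs must be > 0."""
    return math.log2(mean), -math.log10(p_value)


log2_mean, neg_log10_p = legend_values(mean=0.25, p_value=0.01)
print(log2_mean, neg_log10_p)
```

Read this way, a -log10 value of 2 corresponds to p = 0.01 and a value of 3 to p = 0.001, so larger circles are more significant. A log2 mean of -2 corresponds to a mean of 0.25 and -6 to a mean of about 0.016, so values closer to 0 indicate higher average expression of the pair.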

Hope this helps

Pedramto89 commented 3 years ago

Hello @luzgaral, thank you for the info. I still cannot figure out the legends in the dot plot. I attached a copy of what I got from CFDB. Which p-values are significant? For example, what does a p-value with -log10 equal to 2 mean? I have a guess, but I want to make sure. Also, for the log2 mean, what do the colors mean? The warmer and darker colors range from -2 to -6. What does that mean?