martinjzhang / scDRS

Single-cell disease relevance score (scDRS)
https://martinjzhang.github.io/scDRS/
MIT License

Interpreting scDRS score when comparing cell-type subclusters #86

Open HyeonbinJoHCho opened 6 months ago

HyeonbinJoHCho commented 6 months ago

Hi,

Thank you for providing such a fantastic tool! It’s exactly what I needed for integrating GWAS summary data with scRNA-seq data.

While using this tool, I encountered a question regarding the interpretation of the result.

I have scRNA-seq data of normal cells from healthy controls and disease cells from patients. My focus is on epithelial cell clusters, so I have filtered them from the dataset.

To increase power and reduce bias from tissue origin, I used cells from both normal and disease tissues and added "tissue" as a covariate. Only one epithelial cluster reached statistical significance in the scDRS results.

But contrary to our previous expectations, the cluster that was enriched turned out to be composed of normal cells rather than disease cells.

I understand that the scDRS score is very useful for prioritizing disease-related cell types among various types such as T cells, Myeloid cells, Neurons, and Hepatocytes, due to their distinct expression patterns. However, I am uncertain about the interpretation when comparing clusters within a specific cell type because they have similar gene expression and only a subset of genes will be differentially expressed. Moreover, it is possible that the expression of disease-related genes identified by GWAS could be down-regulated in the disease state.

I think variants found in GWAS will disturb gene expression and high-ranked genes from MAGMA will show lower expression in disease than expression in normal.

Could you help me understand how to interpret the results in this context? Any insights or recommendations would be greatly appreciated.

Thanks a lot for your help!

Best, Hyeonbin Jo

martinjzhang commented 6 months ago

Hi,

> But contrary to our previous expectations, the cluster that was enriched turned out to be composed of normal cells rather than disease cells.
>
> I understand that the scDRS score is very useful for prioritizing disease-related cell types among various types such as T cells, Myeloid cells, Neurons, and Hepatocytes, due to their distinct expression patterns. However, I am uncertain about the interpretation when comparing clusters within a specific cell type because they have similar gene expression and only a subset of genes will be differentially expressed. Moreover, it is possible that the expression of disease-related genes identified by GWAS could be down-regulated in the disease state.
>
> I think variants found in GWAS will disturb gene expression and high-ranked genes from MAGMA will show lower expression in disease than expression in normal.

scDRS effectively detects subclusters of disease-associated cells within a given cell type, making your analysis appropriate.

Here's my interpretation of the results: The disease gene set includes genes that are crucial for the relevant pathways and gene functions of the disease. Maintaining normal expression levels of these genes is important for keeping an individual healthy. This is why the disease genes are specifically and highly expressed in a healthy epithelial cell population. The epithelial cells from the disease sample may be dysfunctional due to low expression of these genes, which is why scDRS does not detect them.
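One way to check this interpretation on your own data is to summarize the per-cell scores by cluster and by sample condition. The following is only a minimal sketch with pandas, not part of scDRS itself. It assumes a per-cell table with a normalized score column (called `norm_score` here, an assumption about your score file) plus hypothetical `cluster` and `condition` annotation columns. The toy values are invented purely for illustration.

```python
import pandas as pd


def summarize_by_cluster_and_condition(
    scores,
    score_col="norm_score",      # assumed name of the per-cell score column
    cluster_col="cluster",       # hypothetical cluster annotation column
    condition_col="condition",   # hypothetical healthy/disease annotation column
):
    """Return the mean score and cell count for each (cluster, condition) pair."""
    return (
        scores.groupby([cluster_col, condition_col])[score_col]
        .agg(mean_score="mean", n_cells="count")
        .reset_index()
    )


# Toy per-cell table (made-up values, for illustration only).
toy = pd.DataFrame(
    {
        "cluster": ["epi_1"] * 4 + ["epi_2"] * 4,
        "condition": ["healthy", "healthy", "disease", "disease"] * 2,
        "norm_score": [2.0, 3.0, 0.0, 1.0, 0.5, -0.5, 0.0, 0.0],
    }
)

summary = summarize_by_cluster_and_condition(toy)
print(summary)
```

If the high scores in a significant cluster come mostly from healthy cells, as in the toy `epi_1` cluster, that would be consistent with the interpretation above, although it does not establish it.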

In our analysis of case-control datasets, scDRS may identify cells from both healthy and disease samples as relevant. Here are a few examples: