martinjzhang / scDRS

Single-cell disease relevance score (scDRS)
https://martinjzhang.github.io/scDRS/
MIT License
105 stars 13 forks

concerning beta or OR when choosing putative disease genes #42

Open parkjooyoung99 opened 1 year ago

parkjooyoung99 commented 1 year ago

Hello, while customizing my putative disease genes, I wondered whether we need to take the 'beta value' or 'odds ratio' into account.

As your manuscript suggests, putative disease genes are expected to have higher expression levels in the diseased-cell population. However, as described in issue #2 (https://github.com/martinjzhang/scDRS/issues/2), only the P-value is used to select the top 1000 MAGMA genes. If a low P-value implied a positive beta value, it would be fine to assume that the 1000 MAGMA genes are highly expressed in diseased cells. However, I assume the P-value and the beta value are two independent quantities, so a low P-value does not imply a positive beta. An odds ratio > 1 could serve as the equivalent criterion.
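To make the point about P-values and signs concrete, here is a minimal sketch. It assumes a plain two-sided Wald test on SNP-level summary statistics, which is not MAGMA's gene-level test; the SNP names, betas, and standard errors are made up for illustration, and `two_sided_p` and `top_n_by_p` are hypothetical helpers.

```python
import math


def two_sided_p(beta, se):
    """Two-sided normal P-value for the Wald statistic z = beta / se."""
    z = beta / se
    # erfc(|z| / sqrt(2)) equals 2 * (1 - Phi(|z|)), so the sign of beta drops out.
    return math.erfc(abs(z) / math.sqrt(2))


def top_n_by_p(records, n):
    """Return the n records with the smallest P-value, ignoring the sign of beta."""
    return sorted(records, key=lambda r: r["p"])[:n]


# Hypothetical summary statistics (illustrative values only).
snps = [
    {"snp": "rsA", "beta": 0.30, "se": 0.05},   # strong, risk-increasing
    {"snp": "rsB", "beta": -0.30, "se": 0.05},  # equally strong, risk-decreasing
    {"snp": "rsC", "beta": 0.02, "se": 0.05},   # weak
]
for s in snps:
    s["p"] = two_sided_p(s["beta"], s["se"])

# Ranking by P-value alone keeps both rsA and rsB, whatever their direction.
top2 = top_n_by_p(snps, 2)
print([(r["snp"], r["beta"]) for r in top2])
```

In this toy setup, rsA and rsB receive the same P-value even though their betas have opposite signs, so a P-value ranking by itself says nothing about direction.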

Therefore, I think it would be better to use only SNPs with positive beta value or OR > 1 to get 1000 MAGMA genes.

It would be so nice to hear your thoughts. Thank you!

martinjzhang commented 1 year ago

Hi,

Thank you for the thoughtful comments. scDRS doesn't assume the MAGMA genes are highly expressed in disease cells. Instead, it assumes that the MAGMA genes are relevant to the disease, and that disease-relevant genes are highly expressed in the disease-relevant cell population (which can consist of either healthy or diseased cells), regardless of the direction of these genes' effects on the disease. Take autoimmune disease (AID) and T cells as an example. scDRS will link AID to T cells because autoimmune genes are highly expressed in T cells. However, these genes form a complicated network; increasing expression levels may exacerbate the disease condition for some genes but alleviate it for others. So it's not clear whether the disease-relevant genes will all have higher expression levels in the disease cells. How to incorporate the directionality of genes to provide deeper biological insights remains an open question, and we currently don't have the bandwidth to investigate it further.

> Therefore, I think it would be better to use only SNPs with positive beta value or OR > 1 to get 1000 MAGMA genes.

Thank you for the suggestion. However, I don't think it will work, because GWAS data alone cannot give the direction of gene-to-trait effects. A positive SNP-to-trait beta does not imply a positive gene-to-trait effect, since we don't know whether the SNP also has a positive effect on the gene's expression (its eQTL effect). MAGMA tests only for relevance, not for directionality.
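The sign argument can be sketched in a few lines. This is an illustration only, under the strong simplifying assumption that a SNP affects the trait solely through the expression of one gene, so that the gene-to-trait effect is the ratio of the GWAS beta to the eQTL beta (a Wald-ratio style estimate). The function name and the numbers are hypothetical and are not part of scDRS or MAGMA.

```python
def implied_gene_direction(beta_gwas, beta_eqtl):
    """Direction of the gene-to-trait effect implied by one SNP.

    Assumes (for illustration) that the SNP acts on the trait only via this
    gene's expression. Returns None when the eQTL effect is unknown or zero,
    i.e. when GWAS data alone cannot determine the direction.
    """
    if beta_eqtl is None or beta_eqtl == 0:
        return None
    ratio = beta_gwas / beta_eqtl  # effect of expression on the trait
    if ratio > 0:
        return "higher expression increases risk"
    if ratio < 0:
        return "higher expression decreases risk"
    return "no effect"


# Same positive GWAS beta, three different eQTL scenarios (made-up numbers).
print(implied_gene_direction(0.2, 0.5))    # SNP raises expression
print(implied_gene_direction(0.2, -0.5))   # SNP lowers expression
print(implied_gene_direction(0.2, None))   # eQTL effect unknown
```

With an identical positive GWAS beta, the implied gene direction flips depending on the eQTL sign and is undetermined when no eQTL information is available, which is why filtering SNPs on beta > 0 would not by itself select risk-increasing genes.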

Please let us know if you have further questions!

Best, Martin