Closed (frucelee closed 1 year ago)
Hi @frucelee
Thank you for the insightful question. It is statistically appropriate to run scDRS on scRNA-seq data containing both case and control cells. We haven't systematically looked into what results to expect. Some preliminary results showed that disease-relevant cells from people with high BMI have slightly higher scDRS disease scores (P<0.05) than those from people with low BMI, so running scDRS on disease cells may produce different discoveries. We haven't investigated the power either.
Super. Thanks a lot. In this case, do I need to include the disease condition (such as normal vs. disease) from the scRNA-seq data as a covariate in the scDRS analysis? Thanks in advance.
One last thing: during the scDRS compute-score analysis, we found that different settings for --n_ctrl can produce distinct results. It seems that the lower the value of n_ctrl, the more "significant" results are obtained, for example when setting n_ctrl to 100 rather than 1000. How should we choose n_ctrl to balance this? For example, our dataset has 5992 cells. Any suggestions? Thanks in advance.
Hi @frucelee, if you believe that the disease condition tags (that is, acts as a proxy for) strong technical covariates, then it would be good to include it as a covariate. The results will not be drastically different with or without the covariate.
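For illustration, one way to encode the disease condition as a covariate is a binary (0/1) column in a per-cell covariate table. The sketch below is a minimal, hypothetical example using only the Python standard library: the cell IDs, the column names (`index`, `const`, `disease`), and the tab-separated layout are all assumptions made for this example, so check the scDRS documentation for the exact covariate file format it expects.

```python
import csv
import io

def build_cov_table(cell_conditions):
    """Return tab-separated text with one row per cell.

    Columns (hypothetical names): cell ID, a constant term, and a 0/1
    indicator that is 1 for disease cells and 0 otherwise.
    """
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter="\t", lineterminator="\n")
    writer.writerow(["index", "const", "disease"])
    for cell_id, condition in cell_conditions.items():
        writer.writerow([cell_id, 1, 1 if condition == "disease" else 0])
    return buf.getvalue()

# Hypothetical cells and condition labels, for illustration only.
cells = {"cell_1": "normal", "cell_2": "disease", "cell_3": "disease"}
cov_tsv = build_cov_table(cells)
print(cov_tsv)
```

The resulting text could be written to a file and supplied as the covariate input, assuming that layout matches what your scDRS version expects.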
I wouldn't specify an n_ctrl below 500. In our unpublished simulations, scDRS starts to have inflated false positives when n_ctrl is below 500.
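As a rough intuition for why the number of control sets matters, here is a minimal sketch of a generic Monte Carlo p-value computed against simulated null scores. It is not scDRS's actual procedure, which may normalize and pool control scores differently. The function names, the standard-normal null, and the observed score of 2.0 are assumptions chosen for this example. It shows that with fewer controls the smallest attainable p-value is coarser and the estimate varies more from run to run.

```python
import random

def mc_pvalue(observed, control_scores):
    """Empirical p-value: (1 + #controls >= observed) / (1 + n_ctrl)."""
    n_exceed = sum(1 for c in control_scores if c >= observed)
    return (1 + n_exceed) / (1 + len(control_scores))

def pvalue_spread(observed, n_ctrl, n_repeats=200, seed=0):
    """Min and max p-value across repeated draws of n_ctrl null scores."""
    rng = random.Random(seed)
    pvals = [
        mc_pvalue(observed, [rng.gauss(0.0, 1.0) for _ in range(n_ctrl)])
        for _ in range(n_repeats)
    ]
    return min(pvals), max(pvals)

for n_ctrl in (100, 1000):
    lo, hi = pvalue_spread(observed=2.0, n_ctrl=n_ctrl)
    print(f"n_ctrl={n_ctrl}: smallest possible p={1 / (1 + n_ctrl):.4f}, "
          f"range over repeats=({lo:.4f}, {hi:.4f})")
```

Under these assumptions, the same observed score can look noticeably more or less significant across runs when only 100 controls are used, which is one reason to be cautious about apparently stronger results at low n_ctrl.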
Super. Thanks a lot.
Hi, thank you so much for the very nice software. I have a fairly simple question about the scRNA-seq data input. We have scRNA-seq data from both disease and normal conditions, and we would like to identify disease-related cell subpopulations using GWAS data. Is it appropriate to use this software to infer these subpopulations from scRNA-seq data from the normal condition only? As we understand it, using scRNA-seq data from the disease condition makes it challenging to distinguish the effect of the tissue's original genetic background from the effect of the examined GWAS signals. Or does this software allow us to use scRNA-seq data from the normal and disease conditions together? How about the power? Thanks a lot. Best, Lee