privefl / bigsnpr

R package for the analysis of massive SNP arrays.
https://privefl.github.io/bigsnpr/

Can we analyse PRS from LDpred2 and PRS from (stacked) clumping and thresholding together? #351

Closed: jianvhuang closed this issue 2 years ago

jianvhuang commented 2 years ago

Hi Florian,

I wonder if we can analyse PRS from LDpred2 and PRS from clumping and thresholding (CT or SCT) together. More specifically:

1) I am constructing PRS for a cohort with three ethnic groups (Chinese, Malay, Indian). I constructed PRS for each ethnic group separately using both LDpred2 and CT. The correlation between the PRS and the target trait was higher with LDpred2 in the Chinese group, but higher with CT in the Malay and Indian groups. The sample size for each group is small (<200 Chinese, <100 Malay, <100 Indian). Therefore, in the downstream association analysis (PRS -> health outcomes), I would like to analyse the three ethnic groups together, adjusting for ethnicity in the linear model. Does it make sense in this case to use the LDpred2 PRS for the Chinese group and the CT PRS for the other two groups?

2) On your tutorial webpage for SCT (https://privefl.github.io/bigsnpr/articles/SCT.html), you mention: "Instead of stacking, an alternative is to choose the best C+T score based on the computed grid. This procedure is appealing when there are not enough individuals to learn the stacking weights." I don't think SCT is a good idea for a sample size of ~200. But if the sample size were large enough, would it be possible, and would it make sense, to "stack" the CT and LDpred2 PRS?

Thank you.

privefl commented 2 years ago
  1. Yes, you can just choose the best model. Just make sure to test the final model on individuals you have never used before.

  2. The sample size seems very small to stack more than just a few models.
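
A minimal sketch of what point 1 could look like in practice, assuming simulated toy data and plain Python rather than bigsnpr itself. The helper names `pearson` and `select_and_test`, the score labels, and the 50/50 split are all hypothetical choices made for illustration. The idea is to pick the best candidate score on one set of individuals and then report its performance only on individuals that played no part in that choice.

```python
import random


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def select_and_test(scores, y, val_idx, test_idx):
    """Choose the score with the highest correlation on val_idx,
    then evaluate that single chosen score on test_idx."""
    def r_on(name, idx):
        return pearson([scores[name][i] for i in idx], [y[i] for i in idx])

    best = max(scores, key=lambda name: r_on(name, val_idx))
    return best, r_on(best, test_idx)


# Toy data (simulated, not real PRS): two candidate scores for one phenotype.
rng = random.Random(1)
n = 200
y = [rng.gauss(0, 1) for _ in range(n)]
scores = {
    "ldpred2": [0.5 * v + rng.gauss(0, 1) for v in y],  # hypothetical label
    "ct": [0.2 * v + rng.gauss(0, 1) for v in y],       # hypothetical label
}

# Disjoint validation and test individuals.
idx = list(range(n))
rng.shuffle(idx)
val_idx, test_idx = idx[:100], idx[100:]

best, r_test = select_and_test(scores, y, val_idx, test_idx)
print(best, round(r_test, 3))
```

With groups as small as those described in the question, both the choice of model and the test estimate will be noisy, so the test correlation is best read as a rough check rather than a precise figure.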

jianvhuang commented 2 years ago

Thank you very much.