Help with PolyPred-S - GitHub issues

jumentib commented 1 year ago

Dear Dr Weissbrod,

I want to use PolyPred to compute trans-ethnic PRS. Since I don't have access to individual-level training data, I need to use PolyPred-S. As you explain in your paper, I have to use summary statistics to estimate the "causal effects" with PolyFun and the "tagging effects" with SBayesR. My question is: can we use the same summary statistics for PolyFun and SBayesR?

I see in your tutorial (for PolyPred) that the summary statistics used for PolyFun (bolt.sumstats.txt.gz, 80K SNPs) and the result of BOLT-LMM (bolt.betas.gz, 8K SNPs) contain different numbers of SNPs, hence my question.

Sorry for the basic question; I'm new to the field of genetics and still a bit lost.

Thanks in advance,

Basile Jumentier

omerwe commented 1 year ago

@jumentib, the BOLT-LMM sumstats and the causal effects differ in size because we excluded all effects that are zero.
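To illustrate why dropping zero effects shrinks the file, here is a toy sketch. It is not the PolyFun or BOLT-LMM code; the column name `BETA`, the SNP IDs, and the values are all made up for the example.

```python
# Toy per-SNP effect table. The column name "BETA" and all values are
# hypothetical, chosen only to illustrate the row-count difference.
snp_effects = [
    {"SNP": "rs1", "BETA": 0.012},
    {"SNP": "rs2", "BETA": 0.0},
    {"SNP": "rs3", "BETA": -0.004},
    {"SNP": "rs4", "BETA": 0.0},
    {"SNP": "rs5", "BETA": 0.0},
]


def drop_zero_effects(rows, effect_key="BETA"):
    """Keep only SNPs whose estimated effect is not exactly zero."""
    return [row for row in rows if row[effect_key] != 0.0]


nonzero_effects = drop_zero_effects(snp_effects)
print(f"{len(snp_effects)} SNPs in, {len(nonzero_effects)} SNPs with nonzero effects out")
```

The output file only lists the SNPs that survive this kind of filter, so it can be much shorter than the input summary statistics even though both describe the same data.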

The short answer is that you can use any set of summary statistics you want for any method. As a rule of thumb, including more SNPs will lead to better accuracy. The main gist of PolyFun is that you need to include all SNPs with MAF>0.1% to estimate true causal effects. For other methods (e.g. SBayesR), people usually use a smaller set of SNPs (around 1 million). I believe all of this information is provided in the Methods section of either the PolyFun paper or the PolyPred paper (or both).
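A minimal sketch of how one summary-statistics table could feed both methods with different SNP sets. This is only an illustration: the column names (`SNP`, `MAF`, `Z`), the values, and the `smaller_panel` set are hypothetical, and the real tools expect their own input formats.

```python
# Toy summary statistics. Column names ("SNP", "MAF", "Z") and values are
# hypothetical; real files have more columns and far more rows.
sumstats = [
    {"SNP": "rs1", "MAF": 0.3000, "Z": 1.2},
    {"SNP": "rs2", "MAF": 0.0005, "Z": -0.4},
    {"SNP": "rs3", "MAF": 0.0200, "Z": 2.1},
    {"SNP": "rs4", "MAF": 0.0040, "Z": 0.3},
]

MAF_THRESHOLD = 0.001  # MAF > 0.1%, the cutoff mentioned above for PolyFun

# Hypothetical smaller SNP panel, standing in for the ~1 million SNPs
# that people typically use with methods such as SBayesR.
smaller_panel = {"rs1", "rs3"}


def filter_by_maf(rows, threshold=MAF_THRESHOLD):
    """Keep SNPs whose minor allele frequency exceeds the threshold."""
    return [row for row in rows if row["MAF"] > threshold]


def restrict_to_panel(rows, snp_ids):
    """Keep SNPs that belong to a predefined panel of SNP IDs."""
    return [row for row in rows if row["SNP"] in snp_ids]


polyfun_input = filter_by_maf(sumstats)
sbayesr_input = restrict_to_panel(sumstats, smaller_panel)

print("PolyFun set:", [row["SNP"] for row in polyfun_input])
print("SBayesR set:", [row["SNP"] for row in sbayesr_input])
```

Both subsets come from the same source table; only the inclusion rule differs between the two methods.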

omerwe commented 1 year ago

@jumentib can I close the issue?

jumentib commented 1 year ago

Hello,

Yes, thank you very much for your answer!

omerwe / polyfun

Help with PolyPred-S #168