getian107 / PRScsx

Cross-population polygenic prediction
MIT License

GWAS Sample Size #27

jacklin9703 closed this issue 1 year ago

jacklin9703 commented 1 year ago

Hi Tian,

Thank you for developing the PRS-CS/PRS-CSx tools for PRS derivation! I'm applying both tools to several sets of GWAS summary statistics, but I've run into a question about how to specify the GWAS sample size.

PRS-CS/PRS-CSx require the parameter --n_gwas=GWAS_SAMPLE_SIZE. For continuous traits, I understand that the effective sample size is simply the total sample size. For binary traits, however, I'm not sure how to calculate the effective sample size for my PRS derivation. (I've found different equations in several sources...)

For example, suppose my GWAS summary statistics come from 2000 cases and 4000 controls. What sample size should I pass to the --n_gwas parameter?

The LDpred2 tutorial (https://privefl.github.io/bigsnpr/articles/LDpred2.html) gives a formula for the effective sample size of a binary-trait GWAS: $N_{eff} = 4 / (1 / N_{case} + 1 / N_{control})$. Would this also be applicable to PRS-CS/PRS-CSx?

Thanks!

getian107 commented 1 year ago

Hi, yes, you can calculate the effective sample size for a binary trait and use that as the input for --n_gwas.
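
For readers who want to check the arithmetic, here is a minimal sketch of the formula from the LDpred2 tutorial, $N_{eff} = 4 / (1 / N_{case} + 1 / N_{control})$, applied to the example in the question (2000 cases, 4000 controls). The function name `effective_sample_size` is hypothetical and not part of PRS-CS/PRS-CSx. Rounding to a whole number before passing the value on is an assumption made here for convenience.

```python
def effective_sample_size(n_case: int, n_control: int) -> float:
    """Effective sample size for a binary-trait GWAS.

    Implements N_eff = 4 / (1/N_case + 1/N_control), the formula
    quoted from the LDpred2 tutorial in this thread.
    """
    if n_case <= 0 or n_control <= 0:
        raise ValueError("case and control counts must be positive")
    return 4.0 / (1.0 / n_case + 1.0 / n_control)


# Example from the question: 2000 cases and 4000 controls.
n_eff = effective_sample_size(2000, 4000)

# Rounding to an integer is an assumption of this sketch.
print(round(n_eff))  # 5333
```

For this example the result is about 5333, somewhat below the total of 6000 because the cases and controls are unbalanced. That value would then be supplied as --n_gwas=5333.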

jacklin9703 commented 1 year ago

I got it! Thanks a lot for your prompt and helpful response!