Closed jianvhuang closed 2 years ago
Can you show me the plot?
See below three examples
The first one looks good; you probably just forgot the sd(y) in the estimate of sd_ss.
You can estimate sd(y) with, e.g., with(df_beta, sqrt(quantile(0.5 * (n_eff * beta_se^2 + beta^2), 0.01))).
I used sd_ss = with(df_beta, 2 / sqrt(n_eff * beta_se^2)) to estimate sd_ss. Is this only for binary traits?
Since my GWAS traits are quantitative, I should use the below, right?
sd_y = with(df_beta, sqrt(quantile(0.5 * (n_eff * beta_se^2 + beta^2), 0.01)))
sd_ss = with(df_beta, sd_y / sqrt(n_eff * beta_se^2))
Yes, you should try that.
(Don't forget the additional + beta^2, in case you have some large effects.)
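The two formulas for a quantitative trait can be checked on toy data. This is only a sketch under stated assumptions: the summary statistics are simulated, the names m, n, af, sd_g and sd_y_true are hypothetical, and the standard errors are constructed so that sd_g = sd(y) / sqrt(n_eff * beta_se^2 + beta^2) holds exactly. The factor 0.5 appears because the genotype variance 2 * af * (1 - af) is at most 0.5 (reached at af = 0.5).

```r
# Toy summary statistics (simulated; not real GWAS data)
set.seed(1)
m <- 5000                          # number of variants (hypothetical)
n <- 20000                         # effective sample size (hypothetical)
sd_y_true <- 3                     # true SD of the quantitative trait
af <- runif(m, 0.01, 0.5)          # allele frequencies
sd_g <- sqrt(2 * af * (1 - af))    # genotype SDs under Hardy-Weinberg
beta <- rnorm(m, sd = 0.02)        # small effects

# Standard errors built so that sd_g = sd_y / sqrt(n * se^2 + beta^2) exactly
beta_se <- sqrt((sd_y_true^2 / sd_g^2 - beta^2) / n)
df_beta <- data.frame(beta = beta, beta_se = beta_se, n_eff = n)

# Estimate sd(y) from the variants with the largest genotype variance,
# using the 1% quantile as a robust stand-in for the minimum
sd_y <- unname(with(df_beta,
  sqrt(quantile(0.5 * (n_eff * beta_se^2 + beta^2), 0.01))))

# Genotype SDs implied by the summary statistics
sd_ss <- with(df_beta, sd_y / sqrt(n_eff * beta_se^2 + beta^2))

print(sd_y)                        # should be close to sd_y_true
print(summary(sd_ss / sd_g))       # ratios should be close to 1
```

In this idealized setting the estimate of sd(y) comes back close to the simulated value; with real data, noise and imperfect n_eff will make the agreement looser.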
Thank you very much! I will try that.
So to be clear, in case of large effects, I should add + beta^2 in the calculation for both sd_y and sd_ss, right?
sd_y = with(df_beta, sqrt(quantile(0.5 * (n_eff * beta_se^2 + beta^2), 0.01)))
sd_ss = with(df_beta, sd_y / sqrt(n_eff * beta_se^2 + beta^2))
And if I consider the effect is not large, I should remove beta^2 from both equations, right?
sd_y = with(df_beta, sqrt(quantile(0.5 * (n_eff * beta_se^2), 0.01)))
sd_ss = with(df_beta, sd_y / sqrt(n_eff * beta_se^2))
You should always use the beta^2 term: it makes one fewer approximation, so it should give a better fit. It only makes a difference when there are large effects; otherwise it is negligible.
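To see why the term only matters for large effects, here is a small numeric sketch. The values of n_eff, beta_se and beta are made up for illustration, and sd(y) is left out because it scales both versions equally.

```r
# Hypothetical values, chosen only to illustrate the size of the beta^2 term
n_eff <- 20000
beta_se <- 0.01
beta <- c(small = 0.02, large = 0.5)

# Denominator of sd_ss with and without the beta^2 term
with_b2 <- 1 / sqrt(n_eff * beta_se^2 + beta^2)
without_b2 <- 1 / sqrt(n_eff * beta_se^2)

# Relative change in sd_ss caused by dropping beta^2
rel_diff <- (without_b2 - with_b2) / without_b2
print(round(rel_diff, 4))
```

Here n_eff * beta_se^2 is 2, so a beta of 0.02 adds only 0.0004 to it, while a beta of 0.5 adds 0.25 and shifts sd_ss by several percent.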
Thank you for clarifying.
Hi Florian,
I tried LDpred2 on GWAS with small sample sizes (8000 to 35000), and the QC procedure based on the script below categorizes almost all of the SNPs as "bad". My target population also has a small sample size (<1000).
is_bad <- sd_ss < (0.5 * sd_val) | sd_ss > (sd_val + 0.1) | sd_ss < 0.1 | sd_val < 0.05
I wonder if you have any advice on this.
Thank you.
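The QC rule quoted above can be explored on toy data to see how it reacts to a scale error in sd_ss, which is one thing worth ruling out before looking at sample size. This is a sketch, not a diagnosis of the case above: the data are simulated, flag_bad is a hypothetical helper wrapping the quoted rule, and sd_val is assumed to be the genotype SD in the validation data, computed here as sqrt(2 * af * (1 - af)).

```r
# Simulated data (not from the thread)
set.seed(42)
m <- 2000
af <- runif(m, 0.05, 0.5)                 # validation allele frequencies
sd_val <- sqrt(2 * af * (1 - af))         # assumed definition of sd_val

# Well-scaled sd_ss: matches sd_val up to about 5% multiplicative noise
sd_ss_ok <- sd_val * exp(rnorm(m, sd = 0.05))
# Mis-scaled sd_ss: too small by a factor of 3, as would happen if
# sd(y) were taken to be 1 when it is really 3
sd_ss_off <- sd_ss_ok / 3

# Hypothetical helper wrapping the QC rule quoted in the thread
flag_bad <- function(sd_ss, sd_val) {
  sd_ss < (0.5 * sd_val) | sd_ss > (sd_val + 0.1) |
    sd_ss < 0.1 | sd_val < 0.05
}

prop_ok <- mean(flag_bad(sd_ss_ok, sd_val))
prop_off <- mean(flag_bad(sd_ss_off, sd_val))
print(c(well_scaled = prop_ok, mis_scaled = prop_off))
```

If nearly every variant is flagged, plotting sd_ss against sd_val and checking whether the points follow a straight line with the wrong slope can help separate a scale problem from genuine noise.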