stephenslab / susieR

R package for "sum of single effects" regression.
https://stephenslab.github.io/susieR
Other
174 stars 44 forks source link

Compute posterior variance of effects #84

Closed omerwe closed 5 years ago

omerwe commented 5 years ago

Hi SuSiE,

I just started experimenting with SuSie and it looks very promising. I would like to ask about computing the posterior variance of effects. I couldn't find such functionality in the code. From my understanding of the math, the following code should do it. Can you please let me know if I got it right? (this assumes that the effects are conditionally independent given y and the posterior distribution parameters, hopefully this is right).

s = susie_bhat(bhat=bhat, shat=shat, R=R, n=n)
beta_mean = colSums(s$alpha * s$mu) /  s$X_column_scale_factors
beta2_l = s$alpha * s$mu2 / s$X_column_scale_factors**2
beta_l = s$alpha * s$mu / s$X_column_scale_factors
beta2_mean = colSums(beta2_l) + colSums(beta_l)**2 - colSums(beta_l * beta_l)
beta_var = beta2_mean - beta_mean**2

Thanks,

Omer

gaow commented 5 years ago

@omerwe what you did seems correct (computation of beta2_mean is analogous to equation B.37 of SuSiE paper), except for a minor issue in beta_mean scaling which edited directly on your post. I'll leave the ticket open for now, though, because this might be something we would like to add to our code. Thanks for raising the issue!

stephens999 commented 5 years ago

maybe we should add susie_get_posterior_mean and susie_get_posterior_sd

?

Currently we have coef to extract coefficients, which includes the intercept and behaves like glmnet (for example). And I guess this will be the main/encouraged interface for most users. But it seems for power users having posterior sd could well be useful, and then maybe it makes sense to have a matching function for posterior mean (and presumably no need to include intercept in either).

zouyuxin commented 5 years ago

One easier way to compute the variance is

beta_var = colSums(s$alpha * s$mu2 - (s$alpha*s$mu)^2)/(s$X_column_scale_factors^2)

This comes from where and are independent.

zouyuxin commented 5 years ago

Moreover, in the following simple example, the method @omerwe proposed gives negative variances, which may be caused by numerical errors.

example:

set.seed(1)
n = 100
p = 100
beta = rep(0,p)
beta[1:4] = 1
X = matrix(rnorm(n*p),nrow=n,ncol=p)
y = X %*% beta + rnorm(n)
s = susie(X,y,L=10)
omerwe commented 5 years ago

This is great, thanks!

stephens999 commented 5 years ago

@zouyuxin can you implement susie_get_posterior_mean and susie_get_posterior_sd based on these?

zouyuxin commented 5 years ago

sure, will do that.

mkanai commented 5 years ago

Hi, it seems there is a typo in susie_get_posterior_mean. Specifically, https://github.com/stephenslab/susieR/blob/7f7da47da40b5f15a02795c27a023414f9995e2a/R/susie_utils.R#L453

This line should be: (replace s$mu with res$mu)

colSums(res$alpha*res$mu)/res$X_column_scale_factors

Also, susie_get_posterior_mean and susie_get_posterior_sd are not currently exported. Do you have any plans to export them at some point?

Thanks!

gaow commented 5 years ago

@mkanai thank you for the feedback. We've fixed that typo and exported the functions. Please checkout the current master for version 0.8.1.0520.