stephenslab / susieR

R package for "sum of single effects" regression.
https://stephenslab.github.io/susieR

Implement susieR for prior with mixture of two normals #51

Open · pcarbo opened this issue 5 years ago

pcarbo commented 5 years ago

Currently, susieR assumes a spike-and-slab prior on the regression coefficients. @stephens999 suggested that susieR could be easily extended to the more general case in which the prior is a mixture of two normals. This would be equivalent to the BSLMM model.

It turns out that this is fairly easy to implement. The code would look something like this:

# Form the covariance of y implied by the background polygenic
# effects (scaled by the residual variance), H = I + sb*K.
K <- tcrossprod(X)
n <- nrow(K)
H <- diag(n) + sb*K

# Attempt the Cholesky factorization H = L %*% t(L); chol() returns
# the upper-triangular factor, hence the transpose.
R <- tryCatch(chol(H),error = function (e) FALSE)
if (is.matrix(R)) {
  L    <- t(R)
  Xhat <- solve(L,X)
  yhat <- solve(L,y)

  # Run susieR with X = Xhat and y = yhat.
  fit <- susie(Xhat,yhat)
}

Here, sb is the prior variance of the "small" or "background" polygenic effects, scaled by the residual variance (it may instead need to be sb/p, where p is the number of SNPs; the math is worth double-checking). The idea is that if y ~ N(Xb, sigma2*(I + sb*K)), and H = I + sb*K has Cholesky factorization H = LL', then solve(L,y) ~ N(solve(L,X) %*% b, sigma2*I), which is exactly the model susieR already fits. Your code may look slightly different depending on how you parameterize the prior on the regression coefficients.

This depends on having an estimate of sb, of course. You could either compute a point estimate (e.g., the MLE), or choose a range of values for sb and run susieR at each setting, as in the sketch below.
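
To make the grid idea concrete, here is a rough sketch of what that loop might look like; the helper name, the grid of sb values, and the use of susie() with its default settings are illustrative choices of mine, not part of susieR:

library(susieR)

# Refit the whitened regression at each candidate value of sb and keep
# all of the fits; how best to compare them (e.g., via the susie ELBO,
# cross-validation, or a point estimate of sb) is left open here.
fit_susie_whitened <- function (X, y, sb) {
  K <- tcrossprod(X)
  n <- nrow(K)
  H <- diag(n) + sb*K
  R <- tryCatch(chol(H),error = function (e) FALSE)
  if (!is.matrix(R))
    return(NULL)
  L <- t(R)
  susie(solve(L,X),solve(L,y))
}

sb_grid <- 10^seq(-2,1,length.out = 8)
fits    <- lapply(sb_grid,function (sb) fit_susie_whitened(X,y,sb))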

stephens999 commented 5 years ago

@FangBigSheep did you get a chance to try the code @pcarbo gave above?

FangBigSheep commented 5 years ago

Sorry, I had a very busy week during the break and haven't tried it yet. I will soon.

FangBigSheep commented 5 years ago

I ran the code on some simple simulated data sets and tried many different values of sb (I have not yet come up with a good way to estimate sb). SuSiE performs well in most cases, and there is no big difference between this result and the original SuSiE result.

For now, I am trying to construct a case where the mixture-of-normals prior has a substantial advantage, so that it may potentially "beat SuSiE".

pcarbo commented 5 years ago

@FangBigSheep For simulating data sets for your experiments, you might want to look at this R code, which essentially simulates data from the BSLMM model.
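
In case it is helpful, here is a rough sketch of that kind of simulation; the dimensions, the number of large effects, and the variance settings below are illustrative assumptions, and the linked code may differ in its details:

# Simulate from a BSLMM-like model: a few large ("sparse") effects plus
# a small polygenic effect at every SNP, plus residual noise.
set.seed(1)
n <- 500     # number of samples (assumed)
p <- 1000    # number of SNPs (assumed)
X <- scale(matrix(rnorm(n*p),n,p))

# Sparse component: a handful of large effects.
num_large <- 5
b_sparse  <- rep(0,p)
b_sparse[sample(p,num_large)] <- rnorm(num_large,sd = 0.5)

# Polygenic background: small effects at all SNPs, with total variance
# controlled by sb.
sb     <- 0.2
b_poly <- rnorm(p,sd = sqrt(sb/p))

# Generate the phenotype.
y <- drop(X %*% (b_sparse + b_poly) + rnorm(n))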

stephens999 commented 5 years ago

Great! Can you put a write-up in a workflowr report? (Post it to Slack.)

@pcarbo should be able to give a simple example where SuSiE on its own is less good.

pcarbo commented 5 years ago

@FangBigSheep And I can help you get set up with workflowr when we talk on Monday.

FangBigSheep commented 5 years ago

Great! I appreciate it.