gaow / mvarbvs

Multivariate Bayesian variable selection via variational inference
https://gaow.github.io/mvarbvs/
MIT License

mash, mvash, varbvs, BSLMM, and m2ash #2

Open gaow opened 8 years ago

gaow commented 8 years ago

What we have so far

gaow commented 8 years ago

One question is whether we need a single, most general implementation, or whether we should still have multiple versions of essentially the same thing. I think it is worth having both. For instance, mvash would become a special case of both varbvs and m2ash, and we could check the correctness of the implementations against each other.

pcarbo commented 8 years ago

Yes, I think it is useful to have both m2ash and mvash.

BTW, I think it is useful to have this summary to remind ourselves how the existing models all relate to each other.

pcarbo commented 8 years ago

By the way, it's not clear to me that this approach, as it is described, could capture the Flutre et al (2013) model as a special case, because of the introduction of a "common" effect ($\bar{b}$) shared across all tissues (see eqs. 8 & 9 in that paper), such that $b \sim N(\bar{b}, \phi^2)$. I don't think it would be difficult to extend the m2ash model (+ variational inference) to allow for this.
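
A sketch of that hierarchy, written out for reference (using $\omega^2$ for the prior variance of the common effect is an assumption here, chosen to match the notation in the next comment, not copied from the paper):

```latex
% Sketch of the common-effect hierarchy described above (eqs. 8 & 9 of
% Flutre et al 2013, as paraphrased in this thread; notation assumed):
\begin{aligned}
\bar{b} &\sim N(0, \omega^2)
  && \text{common effect shared by all tissues} \\
b_r \mid \bar{b} &\sim N(\bar{b}, \phi^2), \quad r = 1, \dots, R
  && \text{tissue-specific effects}
\end{aligned}
```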

gaow commented 8 years ago

If we let $\mathrm{diag}(U_k) = \phi^2 + \omega^2$ and the off-diagonal entries of $U_k$ be $\omega^2$, then we should have something similar to Flutre et al (2013), right? Though the weights are different, because we'll learn $\pi_{k,l}$ ($K \times L$ parameters) rather than $\pi_k$ and $\pi_l$ separately ($K + L$ parameters). But $K \times L$ parameters should make the model more flexible.
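
A minimal sketch of that covariance structure in R (the values of `phi2`, `omega2`, and the number of conditions are placeholders, not from any analysis):

```r
# Sketch only: a U_k whose diagonal entries are phi^2 + omega^2 and whose
# off-diagonal entries are omega^2, as suggested above.
R_cond <- 5      # number of tissues/conditions (placeholder)
phi2   <- 0.3    # tissue-specific effect variance (placeholder)
omega2 <- 0.7    # shared ("common") effect variance (placeholder)

U_k <- phi2 * diag(R_cond) + omega2 * matrix(1, R_cond, R_cond)
U_k  # diagonal = phi2 + omega2, off-diagonal = omega2
```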

pcarbo commented 8 years ago

Similar, but not the same; their model includes an additional regression coefficient that captures the "mean effect" across different tissues/conditions. Whether this is important or not is to be discussed I suppose (or perhaps you have already discussed this point?).

stephens999 commented 8 years ago

If you integrate out the shared mean effect, you get the multivariate normal that Gao describes. I think this may be briefly mentioned in Wen and Stephens. Matthew
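
Spelling out that integration step as a sketch (same $\phi^2$, $\omega^2$ notation as above): if $\bar{b} \sim N(0, \omega^2)$ and the $b_r$ are conditionally independent $N(\bar{b}, \phi^2)$ given $\bar{b}$, then marginally

```latex
% Sketch of the marginalization (notation as above, R = number of tissues):
% integrating out the shared effect \bar{b} gives
b = (b_1, \dots, b_R)^\top \sim N_R\!\big(0,\; \phi^2 I_R + \omega^2 \mathbf{1}\mathbf{1}^\top\big)
% i.e. diagonal entries \phi^2 + \omega^2 and off-diagonal entries \omega^2,
% which matches the U_k structure proposed above.
```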


stephens999 commented 8 years ago

There are multiple possible ways to implement variational inference for any given model, so more details are required to confirm that what Peter intends to implement is what Wei implemented. But I'm keen to see that happen.

I'm not sure whether it helps to say that what Wei implemented is what I think is referred to as a "variational EM" algorithm?

Matthew


pcarbo commented 8 years ago

Thanks, Matthew. That is helpful to know, especially as we consider the variational approximation. (Integrating over the shared mean effect would introduce correlations between all the betas, so it wouldn't be a good idea to assume they are conditionally independent in the variational approximation.)

I use "variational EM" to mean an EM algorithm in which the usual E-step is replaced by an approximate E-step. I've developed similar options for the varbvs package since it is sometimes useful to be able to fit the hyperparameters to the data instead of averaging over them.

stephens999 commented 8 years ago

Another connected piece of work: Michael is working on an R package (bmass) implementing my PLoS ONE 2012 methods for multivariate association analysis. I have talked with him about how this is essentially a special case of the more general mash framework, but he decided that he prefers to implement this special case rather than working it into the general framework, as he understands it well and he fears moving to the general case would slow things down too much for his graduation timeline. (My PLoS ONE paper used a prior on the residual variance-covariance matrix and integrated it out, whereas in mash we have switched to using Wakefield-like BF approximations; when n is large the two are very similar, and the Wakefield approach works more generally.)

To be clearer about what I mean by the general mash framework, I mean the model

$$
\begin{aligned}
\beta_j &\sim \sum_l \pi_l \, N_R(0, \Sigma_l) \\
\hat{\beta}_j \mid \beta_j &\sim N_R(\beta_j, V_j)
\end{aligned}
$$

where $N_R$ denotes the $R$-variate normal. That is, the $\beta_j$ come from a mixture of multivariate normals, and the $\hat{\beta}_j$ are noisy multivariate observations of the $\beta_j$ with known variance-covariance matrices $V_j$.
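
One practical consequence of this formulation (a sketch, not the actual mash code): integrating out $\beta_j$ gives $\hat{\beta}_j \sim \sum_l \pi_l N_R(0, \Sigma_l + V_j)$, so the marginal likelihood of each observation is a mixture of multivariate normal densities, e.g. in R:

```r
# Sketch only (not the mash implementation): marginal likelihood of one
# betahat_j under the mixture prior, using
# betahat_j ~ sum_l pi_l N_R(0, Sigma_l + V_j).
library(mvtnorm)

marginal_lik <- function(betahat, pi, Sigma_list, V) {
  dens <- vapply(seq_along(pi), function(l) {
    dmvnorm(betahat, mean = rep(0, length(betahat)),
            sigma = Sigma_list[[l]] + V)
  }, numeric(1))
  sum(pi * dens)
}
```
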
gaow commented 8 years ago

Sure. We should connect this work to the generalized implementation using this slightly different procedure. Now that we are building this multivariate, multiple-regression framework (I'm working on it now), the inference procedure will be VB. For the multivariate-only case (multiple Y, single X, as in mash / bmass) the same VB procedure should also work as a special case, but we will not be able to take advantage of convex optimization. Of course, we can make that a special implementation for this special case.

Sure. We should connect this work into the generalized implementation using this slightly different procedure. Now that we build this multivariate, multiple regression framework (working on it now) the inference procedure will be VB. For multivariate only (multiple Y, single X as in mash / bmass) the same VB procedure should also work as a special case but we will not be able to take advantage of convex optimization. Of course we can make that a special implementation for this special case.