Open jcrdubois opened 9 years ago
The issue with NaNs being produced has not yet been resolved. I will be on the lookout for this bug.
Questions #1 and #2 were discussed over email with Julien Dubois.
(1) The priors can be very important, but how important they are is highly problem-dependent. My strategy is usually to try a range of values for the prior and see which ones yield the largest (normalized) weights. Note that if you download the most recent version of varbvs.m from GitHub, I've added two new options to compute maximum-likelihood estimates of the residual variance and the prior variance of the regression coefficients. This should dramatically speed up the variational algorithm and give you an initial sense of what some of the priors should be.
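The prior-sweep strategy above can be sketched roughly as follows. This is a NumPy illustration, not the varbvs code itself: `normalized_weights` and the example `lnZ` values are hypothetical stand-ins for the log marginal likelihoods that a fit at each prior setting would return.

```python
import numpy as np

def normalized_weights(lnZ):
    # Convert log marginal likelihoods from fits at different prior
    # settings into normalized importance weights. Subtracting the max
    # before exponentiating keeps the computation numerically stable.
    w = np.exp(lnZ - np.max(lnZ))
    return w / w.sum()

# Hypothetical lnZ values from fits at a grid of prior log-odds settings.
lnZ = np.array([-1204.2, -1190.7, -1185.3, -1188.9, -1201.0])
w = normalized_weights(lnZ)

# The prior setting with the largest normalized weight is the one
# best supported by the data.
best = int(np.argmax(w))
```

The same comparison is what the "largest (normalized) weights" heuristic refers to: sweep the prior, refit, and see where the posterior mass over prior settings concentrates.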
(2) Julien found that the coordinate ascent updates for computing the variational approximation converged much faster once y and the columns of X were centered to have a mean of zero (which is equivalent to including an intercept in the linear regression).
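The centering step is a one-liner in either MATLAB or NumPy; here is a NumPy sketch with toy data standing in for the 42 x 300 design matrix (the sizes and random data are only for illustration):

```python
import numpy as np

# Toy stand-in for the 42 x 300 design matrix and response.
rng = np.random.default_rng(0)
X = rng.normal(5.0, 2.0, size=(42, 300))
y = rng.normal(10.0, 1.0, size=42)

# Center y and each column of X. This absorbs the intercept into the
# means, which is equivalent to fitting an intercept in the regression.
y_centered = y - y.mean()
X_centered = X - X.mean(axis=0)
```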
I have been trying out your software for feature selection on a neuroimaging dataset. The input data is of dimensions 42 x 300 (eventually I want to work with more predictors, but I am currently testing the pipeline).
I adapted the code you provided in example1.m for my problem. The code I use is encapsulated in a function, as follows:
I have been experiencing the following error when running the MCMC method on my data, intermittently and not reproducibly:
It looks like, in samplegamma's workspace, softmax(logr) is a vector of nan + nan*i, which seems wrong.
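One plausible mechanism for the NaNs (an assumption, not a diagnosis of samplegamma itself): a naive softmax overflows when the log-ratios are large in magnitude, since exp(800) is inf and inf/inf is nan. The standard fix is to subtract the maximum before exponentiating (the log-sum-exp trick). A NumPy sketch:

```python
import numpy as np

def softmax_naive(logr):
    # Overflows for large entries: exp(800) -> inf, and inf/inf -> nan.
    with np.errstate(over="ignore", invalid="ignore"):
        w = np.exp(logr)
        return w / w.sum()

def softmax_stable(logr):
    # Subtracting the max before exponentiating avoids overflow
    # without changing the result.
    w = np.exp(logr - np.max(logr))
    return w / w.sum()

logr = np.array([800.0, -800.0])
bad = softmax_naive(logr)     # contains NaN
good = softmax_stable(logr)
```

The complex nan + nan*i values in particular suggest a log or power of a negative quantity somewhere upstream of the softmax, which may be worth checking as well.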
Please advise on what may be going wrong, let me know if you need more detail.
PS: two unrelated questions
1) I have not yet tried to change the priors, which may not be well suited to my problem. Do the priors matter much in your experience? I am basically working on a problem very similar to Smith et al., "Functional Connectomics from Resting-State fMRI", TiCS (2013), Fig. 7 [http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4004765/].
2) It seems to me thus far that variational inference is slower than MCMC estimation of the posterior distribution. Of course I expect this to change as the input data grows. Do you have a sense of when variational inference becomes faster than MCMC?
Thanks a lot!