emgthomas / moretrees_pkg

Package to fit ssMOReTreeS with various likelihoods and spike and slab prior
Other
0 stars 0 forks source link

ELBO for normal model sometimes decreases #1

Closed emgthomas closed 4 years ago

emgthomas commented 4 years ago

I have started this package by trying to fit a Bayesian linear model using a spike and slab prior to select groups of variables in/out of the model. To move from here to the multi-outcome model I'm actually interested is fairly simple, so I'm trying to get this basic variable selection model to work first.

I'm using variational inference to fit the model: the model and algorithm are described here. In case you aren't familiar with variational inference (and if you are, it probably helps for me to define some terms anyway!), the basic principle is that we choose a family of distributions that is easy to work with, and then try to find the member of that family that gives the closest approximation to the true posterior. In practice this becomes an optimization problem where we try to minimize the "distance" (usually, the KL divergence) between the true posterior and our approximating distribution. This turns out to be equivalent to maximizing a function known as the evidence lower bound (ELBO). This optimization problem is typically solved using co-ordinate ascent, sometimes known as CAVI (co-ordinate ascent variational inference).

The difficulty I am having is that I have developed a co-ordinate ascent algorithm (see the link above), but the ELBO doesn't always increase... which obviously shouldn't happen!

To give you some context around the code, here's an overview of the most important files. I've tried to comment each of these files so that you have some idea of what various chunks of code are doing.

  1. R/update_vi_params_normal.R updates the parameters of the variational distribution (our approximation to the true posterior). This is the core of the CAVI algorithm.
  2. R/update_hyperparams_normal.R computes the current value of the ELBO (and updates the values of the hyperparameters using approximate empirical Bayes, but for now we can ignore this part as I've turned off hyperparameter selection in the test code and assumed the hyperparameter values are fixed)
  3. R/spike_and_slab_normal.R puts 1 & 2 from above together and runs the the algorithm until the ELBO appears to converge.
  4. tests/gaussian_ss_test.R contains code to reproduce the problem I'm having. This file simulates a simple dataset from the model and runs the algorithm. If you run lines 1 through 52, you should end up with a plot that shows that the ELBO sometimes decreases with additional iterations of the algorithm. Weirdly, if you run lines 56 through 59, you can see that the algorithm actually appears to do a good job of returning the true parameter values. This makes me think that the problem is not with the algorithm, but with how I am computing the objective function. However, I've checked the code and derivation for computing the ELBO many times and I can't find the problem!

I hope this is enough context to get you started! Please let me know if you have any questions, need me to generate more reproducible examples, or whatever works for you :)

Thanks so much!!

-Emma

emgthomas commented 4 years ago

Reposting from email for completeness:

I spent some time today fixing some errors in the VI algorithm on the master branch. After doing that I noticed a couple of other bugs and corrected those. The situation is now much improved — the ELBO nearly always increases and the estimates still match the true parameter values closely. I noticed that for some simulated datasets, we still occasionally get a decreasing ELBO, as in the example if you now run the gaussian_ss_test.R file. However, these appear to be very tiny decreases that only happen occasionally, so I strongly suspect this is some kind of numerical error. If you would like to take a look and let me know what you think, that would be greatly appreciated!