Ultimately to specify the variational model, we require a language that enables arbitrary stacking of distributions on (parameters of) other distributions. In other words, we need a language for specifying hierarchical models.
There were several suggestions from the meeting today. One is to represent each distribution as its own object with sample() and log_prob() methods, and then have wrappers that can collect the distributions in order to easily specify, e.g., a mean-field Gaussian of dimension d. Another is to write the variational model natively in TensorFlow. The former would be preferable, but in the end we want the variational model (specifically, its parameters) to be written in TensorFlow, so that we can do autodiff and also use TensorFlow's speed. So I think the latter is the right direction.
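For concreteness, here is a minimal sketch of how the two directions might be reconciled, assuming a TensorFlow 1.x-style API (the class name MFGaussian and its constructor are hypothetical): the distribution object exposes sample() and log_prob(), while its parameters are TensorFlow variables, so autodiff and TensorFlow's speed still apply.

```python
import numpy as np
import tensorflow as tf

class MFGaussian(object):
    """Hypothetical mean-field Gaussian of dimension d with sample()/log_prob()."""
    def __init__(self, d):
        self.d = d
        # Variational parameters live in TensorFlow so they can be autodiffed.
        self.mean = tf.Variable(tf.zeros([d]))
        self.log_std = tf.Variable(tf.zeros([d]))

    def sample(self, n=1):
        # Reparameterized draw: mean + std * eps, eps ~ N(0, I).
        eps = tf.random_normal([n, self.d])
        return self.mean + tf.exp(self.log_std) * eps

    def log_prob(self, z):
        # Log density of a fully factorized Gaussian, summed over dimensions.
        var = tf.exp(2.0 * self.log_std)
        return tf.reduce_sum(
            -0.5 * tf.log(2.0 * np.pi * var) - 0.5 * (z - self.mean) ** 2 / var,
            axis=-1)
```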
@mariru mentioned something about storing just one really large tf.Variable(). Then, when variational families use it for various things (sampling, log prob evaluation), the variational family will (1) extract the necessary parameters and (2) call the method with those parameters. For example, a Gaussian variational family will store the mean parameters as the first half and the std dev parameters as the second half.
This is useful for factorizations which have a combination of families (e.g., a mean-field family of a Gaussian mixture model has mean-field Gaussian, Dirichlet, and inverse Gamma). It's also useful for variational auto-encoders: the output of the neural network is a vector that we extract variational parameters from.
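A rough sketch of that layout (again assuming a TF 1.x-style API; the class and method names are made up): one flat tf.Variable holds all variational parameters, and each method first slices out what it needs. Here the second half is stored as log std devs to keep them positive. For a variational auto-encoder, the same unpacking logic could be applied to a network's output vector instead of a tf.Variable.

```python
import tensorflow as tf

class GaussianFamily(object):
    """Hypothetical Gaussian family backed by a single flat parameter vector:
    first half = means, second half = log std devs."""
    def __init__(self, d):
        self.d = d
        self.params = tf.Variable(tf.zeros([2 * d]))  # one big tf.Variable()

    def _unpack(self):
        # 1. extract the necessary parameters from the flat vector
        mean = self.params[:self.d]
        log_std = self.params[self.d:]
        return mean, log_std

    def sample(self, n=1):
        # 2. call the method with those parameters
        mean, log_std = self._unpack()
        eps = tf.random_normal([n, self.d])
        return mean + tf.exp(log_std) * eps
```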
One idea would be to have the user specify their variational family using a list of triplets: [("name1", size1, type1), ("name2", size2, type2), ...].
The variational class then creates the TensorFlow object for the variational parameters, z = tf.Variable(), of length flatten(size1) + flatten(size2) + ..., and uses the types to choose which transformations apply to each component of these parameters. The number of variational parameters will also depend on the types because, for example, a Gaussian latent variable has 2 variational parameters.
Sampling from the variational object should then return a dictionary of the latent variables in the correct shape:

    def sample(self, size, sess):
        z1 = ...
        z2 = ...
        return {"name1": z1.reshape(size1), "name2": z2.reshape(size2), ...}
Note that we want this returned dictionary to be created automatically depending on the list the user provided to instantiate the variational class.
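Putting the triplet proposal together, here is a sketch of how such a class could work end to end. The names, the restriction to a "normal" type, and the exact behavior of sample(self, size, sess) are all assumptions, not a settled API.

```python
import numpy as np
import tensorflow as tf

class Variational(object):
    """Hypothetical variational class built from [("name", size, type), ...].
    Only the "normal" type is sketched; each normal scalar uses 2
    variational parameters (mean and log std dev)."""
    def __init__(self, spec):
        self.spec = [(name, list(np.atleast_1d(size)), t) for name, size, t in spec]
        # Total number of variational parameters, summed over the blocks.
        num_params = sum(2 * int(np.prod(size)) for _, size, _ in self.spec)
        self.params = tf.Variable(tf.zeros([num_params]))

    def sample(self, size, sess):
        samples = {}
        offset = 0
        for name, shape, dist_type in self.spec:
            n = int(np.prod(shape))
            if dist_type != "normal":
                raise NotImplementedError("only 'normal' is sketched here")
            # Slice this block's parameters out of the flat vector.
            mean = self.params[offset:offset + n]
            log_std = self.params[offset + n:offset + 2 * n]
            offset += 2 * n
            # Reparameterized draws, reshaped to [size] + the user's shape.
            eps = tf.random_normal([size, n])
            z = mean + tf.exp(log_std) * eps
            samples[name] = sess.run(tf.reshape(z, [size] + shape))
        return samples
```

Usage would be roughly q = Variational([("z", 10, "normal")]); then, after initializing variables in a session, q.sample(5, sess) returns {"z": an array of shape (5, 10)}, with the dictionary built automatically from the user's list.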
This abstraction can help with the two use cases mentioned above: factorizations that combine families, and variational auto-encoders.
What's the right abstraction, and which base class methods and members should all variational model classes share? Further, how do we mix and match them so the interface isn't as blocky as "MFGaussian" but can, e.g., allow a choice of variational family for each dimension, or the specification of a joint distribution.
This choice will be particularly relevant for designing classes for hierarchical variational models, in which there is some arbitrary stacking of these distributions.
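One possible direction, sketched under the same assumptions as above: a container family that holds one variational family per latent variable, each exposing the same sample()/log_prob() interface, so "mix and match" and hierarchical stacking reduce to composing objects with a shared interface. The names here (Product, the dict-of-families layout, Dirichlet) are illustrative only.

```python
class Product(object):
    """Hypothetical mean-field container over heterogeneous families,
    e.g. {"pi": Dirichlet(K), "mu": MFGaussian(d)} (both hypothetical)."""
    def __init__(self, families):
        self.families = families

    def sample(self, n=1):
        # Draw from each component family independently.
        return {name: q.sample(n) for name, q in self.families.items()}

    def log_prob(self, zs):
        # Mean-field factorization: log densities add across components.
        return sum(q.log_prob(zs[name]) for name, q in self.families.items())
```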