blei-lab / edward

A probabilistic programming language in TensorFlow. Deep generative models, variational inference.
http://edwardlib.org

Variational EM for mixtures of Gaussians throws an AttributeError #580

Closed mbesserve closed 7 years ago

mbesserve commented 7 years ago

I am trying to implement variational inference for linear combinations of mixtures of Gaussians in the flavor of Attias, H. "Independent Factor Analysis." Neural Computation 11.4 (1999): 803-851.

In the following I show just a 1D example where I am essentially trying to implement a variational EM algorithm with an E step on the categorical latent variables and an M step on the mixture parameters. I took inspiration from http://edwardlib.org/api/inference-compositionality, but cannot figure out what is missing.

from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import edward as ed
import matplotlib.pyplot as plt
import matplotlib.cm as cm
import numpy as np
import six
import tensorflow as tf

from edward.models import Categorical, InverseGamma, Mixture, \
    MultivariateNormalDiag, Normal
from edward.models import PointMass
from edward.models import Dirichlet
from tensorflow.contrib import slim

plt.style.use('ggplot')

def build_toy_dataset(N):
  pi = [[.4,.6]]
  mus = [[1., -1.]]
  stds = [[.1, .1]]
  x = np.zeros((N, 1), dtype=np.float32)
  for n in range(N):
    for kcomp in range(1):
      k = np.argmax(np.random.multinomial(1, pi[kcomp]))
      x[n, kcomp] = np.random.normal(mus[kcomp][k], stds[kcomp][k])
  return x

N = 500  # number of data points
D = 1  # dimensionality of data
numMod = [2]
#ed.set_seed(42)

# DATA
x_train = build_toy_dataset(N)

plt.hist(x_train[:,0])
plt.show()

mu = Normal(mu=tf.zeros([numMod[0], 1]), sigma=tf.ones([numMod[0], 1]))
sigma = InverseGamma(alpha=tf.ones([numMod[0], 1]), beta=tf.ones([numMod[0], 1]))

betaPar0 = tf.ones(numMod[0])
catVar0 = Categorical(logits=tf.ones([N, 1])*betaPar0)
components0 = [MultivariateNormalDiag(mu=tf.ones([N, 1]) * tf.gather(mu,k),
                           diag_stdev=tf.ones([N, 1]) * tf.gather(sigma,k))
            for k in range(numMod[0])]      
x0 = Mixture(cat=catVar0, components=components0)

qz0 = Categorical(logits=tf.Variable(tf.zeros([N, 1])))
qmu = PointMass(params=tf.Variable(tf.zeros([numMod[0], 1])))
qsigma = PointMass(params=tf.Variable(tf.ones([numMod[0], 1])))

inference_e = ed.VariationalInference({catVar0: qz0}, data={x0: x_train, mu: qmu, sigma: qsigma})
inference_m = ed.MAP({mu: qmu, sigma: qsigma}, data={x0: x_train, catVar0: qz0})

for _ in range(10):
    inference_e.update()
    inference_m.update()

Running the update of the E step leads to

AttributeError: 'VariationalInference' object has no attribute 'train'

I might have missed something, but since I have now reduced the model to a rather minimal one (univariate, two free parameters), I have no idea how to fix this.

dustinvtran commented 7 years ago

Thanks for asking. VariationalInference is an abstract class. You need to use a subclass of VariationalInference such as KLqp, KLpq, or MAP. Also see http://edwardlib.org/api/inference-classes.
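
For your script, the E and M steps could look something like this (an untested sketch, keeping the rest of your code as is):

inference_e = ed.KLqp({catVar0: qz0}, data={x0: x_train, mu: qmu, sigma: qsigma})
inference_m = ed.MAP({mu: qmu, sigma: qsigma}, data={x0: x_train, catVar0: qz0})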

That said, the error message should be more informative; I added an issue to solve this.

mbesserve commented 7 years ago

Hi Dustin. Thanks for the quick response. I forgot to mention that I already tried that with KLpq and KLqp and got the same error message.

dustinvtran commented 7 years ago

In the doc for inference hybrids, the ... means that there are intermediate steps left out for brevity. You need to initialize the algorithms before being able to call update(). Also see http://edwardlib.org/api/inference and the Monte Carlo EM example in https://github.com/blei-lab/edward/blob/master/examples/factor_analysis.py.
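
Roughly, and assuming the two inference objects from your script, the hybrid loop would look like the following (untested sketch):

# Both algorithms must be initialized before update() is called.
inference_e.initialize()
inference_m.initialize()

# ed.get_session() registers a default session; then initialize the TF variables.
sess = ed.get_session()
sess.run(tf.global_variables_initializer())

for _ in range(10):
    inference_e.update()
    inference_m.update()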

mbesserve commented 7 years ago

Thanks again. Indeed, I missed those .... If I do initialize inference_e (assuming it uses default parameters when I do not pass any), it raises a TypeError that I have difficulty understanding (I have had many of those while trying to make the inference work):

TypeError: cat must be a Categorical distribution, but saw: Tensor("inference_614459625544/0/Categorical_3/sample/Reshape_1:0", shape=(500,), dtype=int32)

As far as I understand, the variables are correctly defined as categorical. I have been trying to implement the inference for such models in several ways (before trying the EM approach) and always ended up with an incompatibility of some sort. I apologize for the beginner questions; I will look at the Monte Carlo EM example in detail, maybe I'll find the answer there.

dustinvtran commented 7 years ago

The Mixture random variable sums out the mixture assignments, so you shouldn't be trying to infer them. If you aim to do a variational E-step to infer latent mixture assignments, the observed variables are just the normal distributions. See, e.g., http://edwardlib.org/tutorials/unsupervised for the difference between a mixture of Gaussians and a collapsed mixture of Gaussians.
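
Here is a rough, untested sketch of the uncollapsed model for your 1D case; z, x, qz and K are illustrative names, while mu, sigma, qmu, qsigma are the variables from your script:

K = numMod[0]

# One mixture assignment per data point.
z = Categorical(logits=tf.zeros([N, K]))

# Observe the Normal whose parameters are picked out by z, not the Mixture
# (the Mixture has already summed z out).
x = Normal(mu=tf.gather(mu, z), sigma=tf.gather(sigma, z))

# Variational E-step over the assignments, MAP M-step over the parameters.
qz = Categorical(logits=tf.Variable(tf.zeros([N, K])))
inference_e = ed.KLqp({z: qz}, data={x: x_train, mu: qmu, sigma: qsigma})
inference_m = ed.MAP({mu: qmu, sigma: qsigma}, data={x: x_train, z: qz})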

> I apologize for the beginner questions; I thought I could use these tools without having to delve into the underlying implementation details, but it might just not be realistic.

Not at all! Questions like these are highly encouraged. Whatever's not clear to you is not your mistake but the documentation's.

If you have more questions, feel free to post on the Forum (https://discourse.edwardlib.org). Closing as it isn't a bug.