blei-lab / edward

A probabilistic programming language in TensorFlow. Deep generative models, variational inference.
http://edwardlib.org

Mixture of Bernoullis #686

Open nfoti opened 7 years ago

nfoti commented 7 years ago

I'm trying to implement a mixture of Bernoulli distributions and am running into trouble adapting the mixture of Gaussians example. I've included a minimum working example below to illustrate the problem (and two of my attempts at specifying the model). Specifically, I get the exception ValueError: Shapes () and (2,) are not compatible. Any help is greatly appreciated.

Thanks.

import edward as ed
import tensorflow as tf

from edward.models import Categorical, Bernoulli, Mixture

N = 15

# First attempt (same shape error):
#cat = Categorical(probs=tf.stack([[0.3, 0.7]] * N, axis=0))
#comps = [Bernoulli(probs=tf.stack([[0.1, 0.9]] * N, axis=0)),
#         Bernoulli(probs=tf.stack([[0.9, 0.1]] * N, axis=0))]

cat = Categorical(probs=[0.3, 0.7], sample_shape=N)
comps = [Bernoulli(probs=[0.1, 0.9], sample_shape=N),
         Bernoulli(probs=[0.9, 0.1], sample_shape=N)]

mix = Mixture(cat=cat, components=comps)

romain-lopez commented 7 years ago

I'm running into the same trouble; I'll let you know when I find something. Maybe your problem is already solved? Thanks!

EDIT: As far as I understand, you have to get the event shapes AND the batch shapes to match, which isn't the case as written. You should add brackets and keep in mind what a Bernoulli variable is (it is defined only by a probability of success). The following code works:

import edward as ed
import tensorflow as tf

from edward.models import Categorical, Bernoulli, Mixture

N = 15

cat = Categorical(probs=[[0.3, 0.7]], sample_shape=N)
comps = [Bernoulli(probs=[0.1], sample_shape=N),
         Bernoulli(probs=[0.9], sample_shape=N)]

mix = Mixture(cat=cat, components=comps)
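
For reference, you can confirm that the shapes line up (a quick check; Edward random variables expose the underlying batch_shape and event_shape):

print(cat.batch_shape)       # (1,) -- matches each component's batch shape
print(comps[0].batch_shape)  # (1,)
print(cat.event_shape)       # ()   -- scalar events on both sides
print(comps[0].event_shape)  # ()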

Thanks for giving me an example to work on; it also helped me solve a similar problem of my own!

dustinvtran commented 7 years ago

Thanks @romain-lopez for answering. To confirm, from the TensorFlow source code (https://github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/distributions/python/ops/mixture.py#L131), the Categorical distribution indeed has to have the same batch shape as each component.
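
For example, checking the batch shapes in the original snippet reproduces the mismatch behind the ValueError (a minimal sketch):

from edward.models import Bernoulli, Categorical

cat = Categorical(probs=[0.3, 0.7], sample_shape=15)
comp = Bernoulli(probs=[0.1, 0.9], sample_shape=15)
print(cat.batch_shape)   # ()   -- a single scalar Categorical
print(comp.batch_shape)  # (2,) -- a batch of two independent Bernoullis
# Mixture(cat=cat, components=[comp, ...]) then fails with
# ValueError: Shapes () and (2,) are not compatible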

nfoti commented 7 years ago

Sorry for the slow reply; I was traveling and wasn't able to respond. Thank you for the advice; however, my problem is still not solved.

In my example above I am trying to build a mixture of Bernoullis where each observation lives in {0,1}^2. The probabilities I was specifying to the Bernoulli were not success and failure probabilities, but the probabilities that entries 1 and 2 were successes, respectively.

I can successfully build a mixture of Gaussians using code from the Edward tutorials. However, when I adapt it to use Bernoullis as the component distributions, I get an error about the shapes of the Categorical and the components not matching. Specifically, it seems that the number of categories of the Categorical distribution is not being recognized by the Bernoulli (yet it works fine with MultivariateNormalDiag components of the same shape as the Bernoullis I'm using).

For now I have a hacky workaround that specifies a mixture of Gaussians and passes that random variable as the logits of a Bernoulli. I think this is equivalent, but it's unsettling that I can't specify the components I actually want.
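
For concreteness, a minimal sketch of that workaround (the parameter values here are made up for illustration):

from edward.models import Bernoulli, Categorical, Mixture, MultivariateNormalDiag

N = 15

cat = Categorical(probs=[0.3, 0.7], sample_shape=N)
comps = [MultivariateNormalDiag(loc=[-2.0, 2.0], scale_diag=[1.0, 1.0], sample_shape=N),
         MultivariateNormalDiag(loc=[2.0, -2.0], scale_diag=[1.0, 1.0], sample_shape=N)]
gauss_mix = Mixture(cat=cat, components=comps)

# Use the Gaussian mixture as the logits of a Bernoulli.
x = Bernoulli(logits=gauss_mix)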

dustinvtran commented 7 years ago

MultivariateNormalDiag works because the batch shape is compatible with cat's. The only dimension that changes is the event dimension.

We can do a similar thing for Bernoulli by creating a MultivariateBernoulli random variable. Below I swapped bernoulli.py's implementation of batch_shape and event_shape. Then I changed sample_n to use event_shape_tensor. AFAIK log_prob does not need to change.

import edward as ed
import tensorflow as tf

from edward.models import Categorical, Bernoulli, Mixture

class MultivariateBernoulli(Bernoulli):
  """Multivariate Bernoulli where batch shape is always a scalar and
  event shape is determined by shape of parameters."""
  def _batch_shape_tensor(self):
    return tf.constant([], dtype=tf.int32)

  def _batch_shape(self):
    return tf.TensorShape([])

  def _event_shape_tensor(self):
    return tf.shape(self._logits)

  def _event_shape(self):
    return self._logits.get_shape()

  def _sample_n(self, n, seed=None):
    # Draw n iid samples of shape [n] + event_shape (there is no batch dim).
    new_shape = tf.concat([[n], self.event_shape_tensor()], 0)
    uniform = tf.random_uniform(
        new_shape, seed=seed, dtype=self.probs.dtype)
    sample = tf.less(uniform, self.probs)
    return tf.cast(sample, self.dtype)

N = 15

cat = Categorical(probs=[0.3, 0.7], sample_shape=N)
comps = [MultivariateBernoulli(probs=[0.1, 0.2], sample_shape=N),
         MultivariateBernoulli(probs=[0.3, 0.6], sample_shape=N)]

mix = Mixture(cat=cat, components=comps)
print(mix)
print("Shape = Sample + Batch + Event: {}".format(mix.shape))
print("Sample shape: {}".format(mix.sample_shape))
print("Batch shape: {}".format(mix.batch_shape))
print("Event shape: {}".format(mix.event_shape))
## Shape = Sample + Batch + Event: (2,)
## Sample shape: ()
## Batch shape: ()
## Event shape: (2,)
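
As a quick usage check (a sketch; if the shape overrides behave as intended, each draw is a binary vector in {0,1}^2):

with tf.Session() as sess:
  draws = sess.run(mix.sample(5))
  print(draws.shape)  # (5, 2)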

This is messy. I raised an issue on TensorFlow to get their thoughts (https://github.com/tensorflow/tensorflow/issues/11309).

nfoti commented 7 years ago

Thanks @dustinvtran. I really appreciate the help.