Open · davidlibland opened this issue 7 years ago
tf.gather returns a tf.Tensor and not an ed.RandomVariable. This means x and x_equiv are different:
>>> x
<ed.RandomVariable 'Bernoulli/' shape=(30,) dtype=int32>
>>> x_equiv
<tf.Tensor 'Gather:0' shape=(30,) dtype=int32>
As with all TensorFlow ops, tf.gather is performed on the random variable's associated tensor. This implies the second inference is conditioning on data that doesn't affect the latent variables at all; this is why qp is shown converging to the prior.
Ideally, we might determine the distribution of certain TensorFlow op outputs such as tf.gather. For tf.gather, an open problem is how to determine the individual parameters that make up a batch of random variables. Say, the first Bernoulli random variable's parameter within a vector of them.
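To make that concrete, here is a minimal sketch of the distinction (assuming the Edward 1.x API; p_vec, idx, and the shapes are illustrative, and the distribution-aware version at the end is hypothetical rather than anything Edward currently does):

import numpy as np
import tensorflow as tf
from edward.models import Bernoulli

p_vec = tf.random_uniform([30])     # per-element Bernoulli parameters
x = Bernoulli(probs=p_vec)          # a batch of 30 Bernoulli random variables
idx = np.array([0, 5, 7])

# What tf.gather actually returns: a tf.Tensor of sampled values.
x_gathered = tf.gather(x, idx)

# What an ideal, distribution-aware gather might return: a new random
# variable whose parameters are the gathered parameters.
x_ideal = Bernoulli(probs=tf.gather(p_vec, idx))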
@dustinvtran I'm confused by:
Ideally, we might determine the distribution of certain TensorFlow op outputs such as tf.gather. For tf.gather, an open problem is how to determine the individual parameters that make up a batch of random variables. Say, the first Bernoulli random variable's parameter within a vector of them.
In essence tf.gather(A,B) seems to be a version of a tensor product between A and some tensor determined by B (for example, when B=np.arange(N), the second tensor is just the identity matrix). So this doesn't seem any more complicated than the implementation of ed.dot, at least for the case where B is not stochastic.
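As a quick sanity check of that equivalence (plain TF 1.x; A, B, and the shapes here are arbitrary):

import numpy as np
import tensorflow as tf

A = tf.constant(np.random.randn(5, 3), dtype=tf.float32)
B = np.array([2, 0, 4])

gathered = tf.gather(A, B)          # rows of A at indices B, shape (3, 3)
selection = tf.one_hot(B, depth=5)  # one-hot selection matrix, shape (3, 5);
                                    # reduces to the identity when B = np.arange(5)
product = tf.matmul(selection, A)   # shape (3, 3)

with tf.Session() as sess:
    g_val, m_val = sess.run([gathered, product])
    print(np.allclose(g_val, m_val))  # True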
I'm not sure if that solves the problem, though. ed.dot returns a tf.Tensor and not an ed.RandomVariable. To put it another way, tf.gather demonstrates the fundamental problem; as you point out, ed.dot is even more challenging.
@dustinvtran @davidlibland Is there any workaround for tf.gather? For example, this tutorial http://edwardlib.org/api/inference-compositionality seems to apply tf.gather to a vector of rvs:
beta = Normal(loc=tf.zeros([K, D]), scale=tf.ones([K, D]))
z1 = Categorical(logits=tf.zeros([N1, K]))
z2 = Categorical(logits=tf.zeros([N2, K]))
x1 = Normal(loc=tf.gather(beta, z1), scale=tf.ones([N1, D]))
x2 = Normal(loc=tf.gather(beta, z2), scale=tf.ones([N2, D]))
Is this a valid model? In fact, it is even 'worse' since both arguments in gather are rvs (as opposed to just values).
Related question: does tf.gather behave well if I use MAP inference? In that case I would not expect a difference between the rv and its tensor.
In this setting, tf.gather is fine because these are just ways of parameterizing a distribution. During inference we're not asking about the distribution of the tf.gather'd random variable, but of z1, z2, and beta.
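For instance, inference for the tutorial snippet above might be set up along these lines (a sketch assuming ed.KLqp and the K, D, N1, beta, z1, x1 defined above; qbeta, qz1, and the data are illustrative, and this is not the tutorial's actual inference code):

import edward as ed
import numpy as np
import tensorflow as tf
from edward.models import Categorical, Normal

# The latent variables being inferred are beta and z1; the gathered
# variable x1 only appears as an observed node in the data dictionary.
qbeta = Normal(loc=tf.Variable(tf.zeros([K, D])),
               scale=tf.nn.softplus(tf.Variable(tf.zeros([K, D]))))
qz1 = Categorical(logits=tf.Variable(tf.zeros([N1, K])))

x1_data = np.random.randn(N1, D).astype(np.float32)
inference = ed.KLqp({beta: qbeta, z1: qz1}, data={x1: x1_data})
inference.run(n_iter=1000)

The gathered x1 never shows up as a latent variable to infer, only as an observed node, which is why the gather is harmless here.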
When you apply tf.gather to a random variable, Edward seems to have trouble treating the result as a random variable. Here's an example: if W is a random variable of shape (100), then tf.gather(W, np.arange(100)) should be effectively the same as W, but substituting one for the other yields different results. Here's a simple example in code.

Setup:
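Something along these lines, as a minimal sketch assuming the Edward 1.x API (the Beta prior, the constants, and the observed data are illustrative; the names p, x, x_equiv, and qp follow the rest of this thread):

import numpy as np
import tensorflow as tf
from edward.models import Bernoulli, Beta

N = 30
# Latent success probability with a flat prior (the prior choice is illustrative).
p = Beta(1.0, 1.0)
# A vector of N Bernoulli draws governed by p.
x = Bernoulli(probs=tf.ones(N) * p)
# Gathering every index should be effectively the same variable as x,
# but it comes back as a plain tf.Tensor rather than an ed.RandomVariable.
x_equiv = tf.gather(x, np.arange(N))

# Variational approximation for p.
qp = Beta(tf.nn.softplus(tf.Variable(1.0)), tf.nn.softplus(tf.Variable(1.0)))

# Illustrative observed data.
x_data = np.random.binomial(1, 0.7, size=N).astype(np.int32)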
Now if we run the following inference:
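(a sketch continuing the setup above, conditioning on the random variable x itself; ed.KLqp and the iteration count are illustrative choices)

import edward as ed

# Condition directly on the random variable x.
inference = ed.KLqp({p: qp}, data={x: x_data})
inference.run(n_iter=1000)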
we get one distribution for qp;
however, if we run
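the same sketch but with the gathered tensor x_equiv in place of x,

# Condition on the gathered tf.Tensor instead of the random variable.
inference = ed.KLqp({p: qp}, data={x_equiv: x_data})
inference.run(n_iter=1000)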
we get a noticeably different distribution for qp, one that converges to the prior.