Closed: doctorwes closed this issue 5 years ago.
Thank you @doctorwes for your report.
We have detected a bug: in the code you are using, a false dependency between `x` and `y` is detected in the qmodel. We are debugging this issue and will keep you updated on any solution and bugfix release that solves this problem.
We fixed the problem in 0adb9dfc86f7bc72271e453a59b7522ded1b96d6, and it has been included in release 1.2.3, which is already available (`pip install inferpy==1.2.3`).
Regarding your model, the following changes could be made:

Please let us know if this solves your problem, so that we can then close this issue.
Thank you! Inference now seems to work; here is the actual model I'm using, modified according to your advice. However, conditional sampling is not giving me the results I expect. In the training data, there are correlations between the columns of `x_train` and `y_train`.
```python
import inferpy as inf
import tensorflow as tf

@inf.probmodel
def vae(k, d0, dx, dy, decoder):
    with inf.datamodel():
        z = inf.Normal(tf.ones(k), 1, name="z")
        output = decoder(z, d0, dx + dy)
        x_loc = output[:, :dx]
        # softplus (as in the qmodel) keeps the scales positive
        x_scale = tf.nn.softplus(output[:, dx:2*dx]) + scale_epsilon
        y_loc = output[:, 2*dx:2*dx+dy]
        y_scale = tf.nn.softplus(output[:, 2*dx+dy:]) + scale_epsilon
        x = inf.Normal(x_loc, x_scale, name="x")
        y = inf.Normal(y_loc, y_scale, name="y")
```
```python
# Neural networks for decoding and encoding
def decoder(z, d0, d):
    # maps the latent z to 2*d outputs: d locations followed by d scales
    h0 = tf.keras.layers.Dense(d0, activation=tf.nn.relu)
    h1 = tf.keras.layers.Dense(2*d)
    return h1(h0(z))

def encoder(x, y, d0, k):
    # maps the concatenated observations [x, y] to 2*k outputs for z
    hm = tf.keras.layers.Concatenate(axis=1)
    h0 = tf.keras.layers.Dense(d0, activation=tf.nn.relu)
    h1 = tf.keras.layers.Dense(2*k)
    return h1(h0(hm([x, y])))
```
```python
# Q model for making inference
@inf.probmodel
def qmodel(k, d0, dx, dy, encoder):
    with inf.datamodel():
        x = inf.Normal(tf.ones(dx), 1, name="x")
        y = inf.Normal(tf.ones(dy), 1, name="y")
        output = encoder(x, y, d0, k)
        qz_loc = output[:, :k]
        qz_scale = tf.nn.softplus(output[:, k:]) + scale_epsilon
        qz = inf.Normal(qz_loc, qz_scale, name="z")
```
```python
# number of components
k = 12
# size of the hidden layer in the NN
d0 = 100
# dimensionality of the data
dx = 4
dy = 2
# number of observations (dataset size)
N = 240
# batch size
M = 12
# minimum scale
scale_epsilon = 0.01
# inference parameters
learning_rate = 0.01

m = vae(k, d0, dx, dy, decoder)
q = qmodel(k, d0, dx, dy, encoder)

# set the inference algorithm
VI = inf.inference.VI(q, epochs=10000)
m.fit({"x": x_train, "y": y_train}, VI)
```
Unconditional sampling recovers the expected correlations between the columns of sampled `x` and `y`.

```python
uncond_draws = m.posterior_predictive().sample(1000)
```
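As a sanity check (a minimal sketch, assuming the draws come back as a dict of NumPy arrays keyed by variable name, with one row per sample; the exact return type may differ across InferPy versions):

```python
import numpy as np

# Stack the sampled columns of x and y and inspect the off-diagonal
# block of the correlation matrix (dx rows by dy columns).
xy = np.concatenate([uncond_draws["x"], uncond_draws["y"]], axis=1)
corr = np.corrcoef(xy, rowvar=False)
print(corr[:dx, dx:])  # cross-correlations between x columns and y columns
```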
However, when I attempt conditional sampling in the following way (i.e. when I try to sample conditional on predetermined values of `y`), the columns of sampled `x` and of `y_obs` are pretty much uncorrelated.

```python
cond_draws = m.posterior_predictive(data={'y': y_obs}).sample(1000)
```
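Under the same assumptions as the check above, the symptom shows up as a near-zero off-diagonal block:

```python
xy_c = np.concatenate([cond_draws["x"], cond_draws["y"]], axis=1)
print(np.corrcoef(xy_c, rowvar=False)[:dx, dx:])  # roughly zero here
```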
I'm not sure whether this is because my model is mis-specified, or because I'm defining/invoking the `Query` object in the wrong way.
Anyway, many thanks for your assistance - your response was remarkably rapid!
I would recommend that you simplify your model by not modelling the variance/scale of the Normal distributions of 'x' and 'y'. Simply learn the means and keep the variances constant (e.g. scale=1.0).

This is a known issue with VAEs that use Gaussian observation distributions: the model does not know whether to increase the variance or move the mean to capture the data, and convergence can take a while. This is why many people use a Binomial observation distribution when building VAEs.
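For illustration, here is a minimal sketch of the P model under that advice (a hypothetical `vae_fixed_scale`; it assumes the decoder's final layer is changed to emit `d` units instead of `2*d`, since no scale parameters are produced):

```python
@inf.probmodel
def vae_fixed_scale(k, d0, dx, dy, decoder):
    with inf.datamodel():
        z = inf.Normal(tf.ones(k), 1, name="z")
        output = decoder(z, d0, dx + dy)  # decoder now emits only the means
        x = inf.Normal(output[:, :dx], 1.0, name="x")  # constant scale
        y = inf.Normal(output[:, dx:], 1.0, name="y")  # constant scale
```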
Regarding the issue with the uncorrelated samples of `x` and `y`: the problem is that, in your P model, `x` and `y` are independent given `z`. This means that fixing the values of `y` will not influence the samples of `x` (and vice versa).
Moreover, consider the way `posterior_predictive` works: samples are generated from a P model in which the global hidden variables and NN parameters are fixed to their inferred values. Variables passed in the `data` parameter are fixed as well. In your model, the decoder parameters and the value of `y` are fixed. Samples are generated from parents to children, that is: first we sample from `z`, then those samples are passed through the NN, and finally samples of `x` are generated. As you can see, the value set for `y` does not affect `z`, and hence does not affect `x`.
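Schematically, that ancestral sampling looks like this (plain-Python pseudocode with hypothetical helper names, not InferPy's actual implementation):

```python
# Parents are sampled before children; clamping y touches nothing upstream.
z_sample = sample_z_prior()         # 1. z is a root node, sampled first
nn_out = decoder_forward(z_sample)  # 2. deterministic pass through the fitted NN
x_sample = sample_x(nn_out)         # 3. x is drawn from the decoder output
y_sample = y_obs                    # 4. y is simply clamped; z and x never see it
```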
In conclusion, you might need to consider an alternative model.
Thank you for your advice! Using an alternative model with a dependency of `x` on `y` seems to solve the problem.
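For readers hitting the same issue: one way such a dependency can be introduced is to feed `y` into the network that produces `x` (a hypothetical sketch in the style of the code above, not necessarily the exact model used here):

```python
@inf.probmodel
def cvae(k, d0, dx, dy):
    with inf.datamodel():
        z = inf.Normal(tf.ones(k), 1, name="z")
        y = inf.Normal(tf.ones(dy), 1, name="y")
        # x depends on both z and y, so clamping y now shifts the samples of x
        h0 = tf.keras.layers.Dense(d0, activation=tf.nn.relu)
        h1 = tf.keras.layers.Dense(dx)
        x_loc = h1(h0(tf.concat([z, y], axis=-1)))
        x = inf.Normal(x_loc, 1.0, name="x")
```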
I'm trying to implement a VAE with two input vectors rather than one. The purpose is to be able to sample the first vector conditional on known values of the second. My first attempt was the following:
This generates the following error:
I can eliminate the error by reshaping `y` in the `encoder` function. However, I then get the following error:
I get the same error when I try SVI, unless I specify `batch_size=1`, in which case it works.
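For reference, the working SVI call has this shape (hedged on the exact signature in this InferPy version):

```python
SVI = inf.inference.SVI(q, epochs=10000, batch_size=1)
m.fit({"x": x_train, "y": y_train}, SVI)
```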