Thank you for your answer! But based on this implementation, what is the role of the z-priors? I guess the implementation should be something like:
```python
approx_posterior = encoder(features)
approx_posterior_sample = approx_posterior.sample(params["n_samples"])
code = make_prior()
decoder_likelihood = decoder(code)
```

So the KL-divergence term will minimize the difference between the code (p(z)) and the posterior q(z|x) samples, and in this way the decoder will be able to generate new samples from the random distribution defined in the make_prior method. Sorry for the inconvenience, but reading and trying different implementations made it a bit confusing.
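Put differently, what I expect is that after training one can sample from the prior and push that sample through the decoder to generate new data. A rough, self-contained sketch of that test-time step (the toy `prior` and `decoder_net` below are stand-ins of my own, not the actual pieces of vae.py):

```python
import tensorflow as tf
import tensorflow_probability as tfp

tfd = tfp.distributions

latent_size, data_size = 2, 784

# Toy stand-ins, not the actual vae.py pieces.
prior = tfd.MultivariateNormalDiag(loc=tf.zeros(latent_size),
                                   scale_diag=tf.ones(latent_size))   # p(z)
decoder_net = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(data_size),
])

# Test-time generation: draw z from the prior and decode it.
z = prior.sample(5)                                        # shape [5, latent_size]
generated = tfd.Independent(tfd.Bernoulli(logits=decoder_net(z)),
                            reinterpreted_batch_ndims=1)   # p(x|z)
new_images = generated.mean()                              # shape [5, data_size]
```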
No worries. Even reading about the same thing in different conventions can be a hassle, and things getting weirder as you throw in multiple implementations is just normal. I suggest taking a look at the following function from the beginning:
It deals with the latent prior that you seem to be expecting but not finding. I don't know which other implementations you are referring to, but taking the time to check that function out and reconciling it with whatever other implementation/presentation you have in hand will probably demystify things.
Edit: I guess other common implementations make the transformation in the reparameterization trick explicit, whereas the implementation in TFP doesn't make it as "explicit"?
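To illustrate what I mean by "explicit": many tutorials write the reparameterization as `z = loc + scale * eps` by hand, while in TFP that same trick happens inside `sample()` for reparameterized distributions. A small sketch, illustrative only and not code from the repo:

```python
import tensorflow as tf
import tensorflow_probability as tfp

tfd = tfp.distributions

loc = tf.constant([[0.0, 1.0]])
scale = tf.constant([[1.0, 0.5]])

# "Explicit" reparameterization, as many tutorials write it:
eps = tf.random.normal(tf.shape(loc))
z_manual = loc + scale * eps

# In TFP the same trick is hidden inside sample(): MultivariateNormalDiag is a
# reparameterized distribution, so gradients flow through loc and scale.
q = tfd.MultivariateNormalDiag(loc=loc, scale_diag=scale)
z_tfp = q.sample()

print(q.reparameterization_type)   # FULLY_REPARAMETERIZED
```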
One other implementation is this one, for example: https://danijar.com/building-variational-auto-encoders-in-tensorflow/. There are many others as well.
Anyways, thank you very much for your time and answers
The example implementation is similar to what's implemented in this repo (i.e. what you referred to as `code` in the link is really what `approx_posterior.sample(params["n_samples"])` is doing). Sorry if I ended up confusing you more, but I am sure you will be able to resolve it.
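Concretely, the encoder in the example returns a distribution object, and calling `.sample()` on it is exactly the step that produces the latent code the decoder consumes. A toy illustration (the `loc`/`scale` below would normally come from a network applied to the input batch, not constants):

```python
import tensorflow as tf
import tensorflow_probability as tfp

tfd = tfp.distributions

batch_size, latent_size, n_samples = 8, 2, 16

# Stand-in for the encoder output q(z|x); in the example these parameters
# come from a neural network applied to the features, not constants.
approx_posterior = tfd.MultivariateNormalDiag(
    loc=tf.zeros([batch_size, latent_size]),
    scale_diag=tf.ones([batch_size, latent_size]))

code = approx_posterior.sample(n_samples)   # the latent "code" fed to the decoder
print(code.shape)                           # (16, 8, 2)
```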
Ok, thanks, I will have another look! Thanks for your time.
I find the question somewhat vague, but I will give answering it a go. First, I assume you are talking about this implementation within the repo: `tensorflow_probability/examples/vae.py`.
You are correct in stating that the input to the decoder should be the z values sampled from the posterior `q(Z | X)`. I think you are just confused about the terminology. The encoder itself is referred to as `approx_posterior` because the distribution is approximated, due to the difficulty of computing the exact `q(Z | X)`.
I think this is a good reference that explains the general setting of variational inference. Section 2.1 illustrates why the problem is intractable, and Section 2.2 gives you a sense of how the problem becomes an optimization/approximation. Anyways, the line `approx_posterior_sample = approx_posterior.sample(params["n_samples"])` suggests that `approx_posterior` is yielding samples, which are the z values (the "z-priors") you are talking about. So the implementation is in agreement with your understanding of VAEs.
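To tie the pieces together, here is a compressed sketch of the training-time flow. This is a simplified illustration in the spirit of the example, not a copy of vae.py; the helper names, layer sizes, and the plain standard-normal prior are my own simplifications:

```python
import tensorflow as tf
import tensorflow_probability as tfp

tfd = tfp.distributions

latent_size, data_size = 16, 784

def make_prior():
    # p(z): a standard normal keeps this sketch short; the actual example
    # can use a richer prior.
    return tfd.MultivariateNormalDiag(loc=tf.zeros(latent_size),
                                      scale_diag=tf.ones(latent_size))

def make_encoder(features):
    # q(z|x), the "approx_posterior": a Gaussian whose parameters come from a net.
    hidden = tf.keras.layers.Dense(128, activation="relu")(features)
    loc = tf.keras.layers.Dense(latent_size)(hidden)
    scale = tf.nn.softplus(tf.keras.layers.Dense(latent_size)(hidden))
    return tfd.MultivariateNormalDiag(loc=loc, scale_diag=scale)

def make_decoder(code):
    # p(x|z): likelihood of the data given a latent sample.
    logits = tf.keras.layers.Dense(data_size)(code)
    return tfd.Independent(tfd.Bernoulli(logits=logits),
                           reinterpreted_batch_ndims=1)

features = tf.random.uniform([32, data_size])    # dummy batch in [0, 1]

prior = make_prior()
approx_posterior = make_encoder(features)
code = approx_posterior.sample()                 # z ~ q(z|x), not prior.sample()
decoder_likelihood = make_decoder(code)

# ELBO = E_q[log p(x|z)] - KL(q(z|x) || p(z)); the KL term is what pulls the
# posterior toward the prior.
kl = tfd.kl_divergence(approx_posterior, prior)
elbo = tf.reduce_mean(decoder_likelihood.log_prob(features) - kl)
loss = -elbo
```

At test time you would instead sample z from the prior and decode it, which is exactly why the KL term keeps q(z|x) close to p(z) during training.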