adamgayoso opened 5 years ago
I am thinking about the same question. It seems the original author wrote the loss function in a similar way: https://github.com/y0ast/VAE-TensorFlow.
The code comments for the KL divergence calculation cite Appendix B of https://arxiv.org/abs/1312.6114, which derives the closed form for the Gaussian case. If that is the intended model, then BCE (which corresponds to a Bernoulli likelihood) isn't correct and should be replaced with $\log p_\theta(x \mid z)$ for the chosen observation model.
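For reference, the Appendix B closed form for the KL term between a diagonal-Gaussian posterior and a standard-normal prior can be sketched as below; `gaussian_kl` is a hypothetical helper name, with `mu` and `logvar` the usual encoder outputs:

```python
import torch

def gaussian_kl(mu, logvar):
    # Closed-form KL( N(mu, diag(sigma^2)) || N(0, I) ) from
    # Appendix B of Kingma & Welling (2013):
    # -0.5 * sum(1 + log(sigma^2) - mu^2 - sigma^2)
    return -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
```

Note this term only depends on the (Gaussian) approximate posterior and prior; the debate here is about the separate reconstruction term.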
The current implementation uses
F.binary_cross_entropy(recon_x, x.view(-1, 784), reduction='sum')
as the reconstruction loss. The image x has pixel values in [0, 1], so this is not a true Bernoulli log-likelihood; the images would have to be binarized first. In Ladder Variational Autoencoders by Sonderby et al., the images are re-binarized as a Bernoulli sample after each epoch.
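A minimal sketch of that dynamic binarization scheme (the function name `binarize` is my own; the idea is just to resample each pixel as a Bernoulli draw with the grayscale intensity as its probability, once per epoch):

```python
import torch

def binarize(x):
    # Treat each pixel intensity in [0, 1] as the success probability
    # of an independent Bernoulli and draw a fresh 0/1 sample.
    # Re-sampling each epoch gives the "dynamic binarization" of
    # Sonderby et al.
    return torch.bernoulli(x)
```

With a binarized target, F.binary_cross_entropy(recon_x, binarize(x), reduction='sum') is exactly the negative Bernoulli log-likelihood, so the BCE reconstruction term becomes consistent with the model.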