cdoersch / vae_tutorial

Caffe code to accompany my Tutorial on Variational Autoencoders
MIT License

KL loss compared to Keras #7

Open samadejacobs opened 6 years ago

samadejacobs commented 6 years ago

Thank you for the nice tutorial and supporting code. I made a plot (attached) of KL loss vs. iterations for your implementation and for the Keras one (blog, code). Could you please provide insight into why the KL loss for your implementation is going up? [Attached plot: vae_klloss-kerascaffe]

cdoersch commented 6 years ago

It's not immediately evident from the plot alone, although one big difference is the number of latents. Initialization can also play a role. Note that the KL loss going up simply means that more information is being encoded in the latents. The objective being minimized is the reconstruction error plus the KL term, so as long as the reconstruction error goes down by more than the KL loss goes up, the total loss is still decreasing and the VAE is still learning the distribution. In fact, this is often the behavior you want from a VAE, since you expect the VAE to encode more in its latent variables over time.
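
To make that trade-off concrete, here is a minimal NumPy sketch of the standard diagonal-Gaussian KL term and the total objective it feeds into. The function names and the monitoring helper are illustrative only, not taken from the Caffe or Keras code being compared; it assumes the encoder outputs a mean `mu` and log-variance `log_var` per latent dimension.

```python
import numpy as np

def gaussian_kl(mu, log_var):
    # KL( N(mu, diag(exp(log_var))) || N(0, I) ), summed over latent
    # dimensions and averaged over the batch. Because it sums over
    # latent dimensions, a model with more latents can report a larger
    # KL value even when each unit behaves similarly.
    kl_per_dim = 0.5 * (np.exp(log_var) + mu ** 2 - 1.0 - log_var)
    return kl_per_dim.sum(axis=-1).mean()

def negative_elbo(recon_error, mu, log_var):
    # The quantity actually being minimized: reconstruction error plus KL.
    # A rising KL term is fine as long as this total keeps decreasing.
    return recon_error + gaussian_kl(mu, log_var)
```

So when comparing the two implementations, it is worth plotting the total loss (and the reconstruction term) alongside the KL curve rather than the KL term in isolation.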