Closed wuaalb closed 9 years ago
Thanks a lot. Just a few comments: 1) Have you checked that it still converges with the new KL calculation? 2) In vae_vanilla.py: can you add a reference to Appendix B in Kingma & Welling so people can find where the exact KL calculation comes from? 3) I think kl_normal2_stdnormal should be moved to the vae_vanilla.py example, since it only applies there?
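For context, the analytic KL term being discussed is the closed-form expression from Appendix B of Kingma & Welling (2013), "Auto-Encoding Variational Bayes". A minimal pure-Python sketch of what `kl_normal2_stdnormal` computes (the library's actual function operates on Theano tensors, so this signature is illustrative only):

```python
import math

def kl_normal2_stdnormal(mean, log_var):
    """Analytic KL divergence KL(q || p) between a diagonal Gaussian
    q = N(mean, exp(log_var)) and a standard normal prior p = N(0, I).

    Closed form from Appendix B of Kingma & Welling (2013):
        -0.5 * sum(1 + log(sigma^2) - mu^2 - sigma^2)
    """
    return -0.5 * sum(1.0 + lv - m ** 2 - math.exp(lv)
                      for m, lv in zip(mean, log_var))

# KL is zero exactly when q equals the standard normal prior.
print(kl_normal2_stdnormal([0.0, 0.0, 0.0], [0.0, 0.0, 0.0]))  # → -0.0
```

Using the analytic term instead of a single-sample Monte Carlo estimate of the KL lowers the variance of the gradient, which is why convergence was checked below.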
Thanks for reviewing.
On my machine with the default configuration I get:
With analytic_kl_term = False
*Epoch: 999 Time: 2.70 LR: 0.00500 LL Train: -152.035 LL test: -152.866
With analytic_kl_term = True
*Epoch: 999 Time: 2.59 LR: 0.00500 LL Train: -140.083 LL test: -141.687
I don't mind changing the other stuff you mentioned. However, I think it would be more convenient to have stuff like this in the library itself rather than having to repeat it for every script that uses this kind of model (also helps to make examples smaller).
It's fine to keep the KL term in the library itself. However, can you add a docstring stating that it is the analytical solution for the KL divergence between a Gaussian and a standard normal prior?
It's good that the KL term is better; however, the results are still not great. I think I chose some non-optimal hyperparams :). Can I get you to change to hidden_layer_size=500, latent_size=100, lr=0.001 and batch_size=128? I think these should work better. Then we can merge everything in one go.
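The suggested defaults could be collected near the top of the example script. A sketch of such a config block, with hypothetical variable names (the actual names in vae_vanilla.py may differ):

```python
# Suggested hyperparameters from the review (names are illustrative):
hidden_layer_size = 500  # units in each hidden layer of encoder/decoder
latent_size = 100        # dimensionality of the latent variable z
lr = 0.001               # learning rate
batch_size = 128         # minibatch size

print(hidden_layer_size, latent_size, lr, batch_size)
```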
thanks for contributing!
Ok, made those changes. I didn't wait for the full 1000 epochs yet, but those defaults seem better.
Great, thanks a lot.
The LL should get up to around -100.
I'll merge now.